in reply to How do I safely, portably extract one or more bytes from a string?

This may not work as portably as you want, but if it does, it's definitely the obscure way:

    use strict;
    use warnings;
    require 5.8.0;

    my $string = "......bytes....";
    local $/ = \1;
    open FH, "<", \$string or die $!;
    while (my $byte = <FH>) {
        # do your stuff with $byte
    }
    close FH;

It relies on the fact that setting the $/ input separator to a numeric value (strictly, a reference to an integer, as in \1) reads in that number of bytes. It also relies on the Perl 5.8.0 or later "in-memory file" open, where you can essentially open a scalar instead of a file. You then read the scalar in byte by byte.
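
To make the mechanism a little more concrete, here is a minimal sketch of the same trick generalized to N-byte chunks. The helper name chunk_bytes is made up for illustration, and the sketch assumes the string holds plain bytes (no characters above 0xFF):

```perl
use strict;
use warnings;

# chunk_bytes is a hypothetical helper, not part of the original snippet.
# It reads $string through an in-memory filehandle, $size bytes at a time.
sub chunk_bytes {
    my ($string, $size) = @_;
    local $/ = \$size;    # reference to an integer: fixed-size record reads
    open my $fh, '<', \$string or die "Cannot open in-memory file: $!";
    my @chunks;
    while (defined(my $chunk = <$fh>)) {
        push @chunks, $chunk;    # the last chunk may be shorter than $size
    }
    close $fh;
    return @chunks;
}

my @chunks = chunk_bytes("abcdefghij", 4);
print join("|", @chunks), "\n";    # prints abcd|efgh|ij
```

Passing 1 as the size gives the byte-by-byte behaviour of the snippet above.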

I wouldn't recommend it for much, but it's an interesting exercise.

Update: Thanks Anonymous Monk for catching the glitch. I knew I was forgetting something. My original code read: local $/=1;. I've now corrected my snippet.
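
For anyone wondering why the glitch matters, this small sketch contrasts the two settings. The helper read_records and the sample data are made up for illustration; the point is only that a plain 1 is treated as the separator string "1", while \1 asks for one-byte records:

```perl
use strict;
use warnings;

# read_records is a hypothetical helper: it reads $string through an
# in-memory filehandle using whatever value the caller supplies for $/.
sub read_records {
    my ($string, $separator) = @_;
    local $/ = $separator;
    open my $fh, '<', \$string or die "Cannot open in-memory file: $!";
    my @records = <$fh>;    # list context: read every record
    close $fh;
    return @records;
}

my $data = "ab1cd1e";
my @by_string = read_records($data, 1);     # records end at the string "1"
my @by_bytes  = read_records($data, \1);    # each record is one byte long

print scalar(@by_string), " records: @by_string\n";    # 3 records: ab1 cd1 e
print scalar(@by_bytes),  " records: @by_bytes\n";     # 7 records: a b 1 c d 1 e
```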

Update 2: After some testing and re-reading the appropriate documentation, it appears that this method will work, as long as you're using it on Perl 5.8.0 or later.


Dave


"If I had my life to live over again, I'd be a plumber." -- Albert Einstein

Re: Re: How do I safely, portably extract one or more bytes from a string?
by Anonymous Monk on Nov 29, 2003 at 02:54 UTC
    It relies on the fact that setting the $/ input separator to a numeric value reads in that number of bytes.

    You need to set $/ to a reference to a number: $/=\1;. The example you gave sets the record separator to "1", which isn't quite the same thing :-)

Re: Re: How do I safely, portably extract one or more bytes from a string?
by BrowserUk (Patriarch) on Nov 29, 2003 at 04:10 UTC

    That won't work. If the string contains byte sequences that look like Unicode characters, then reading 1 character will return multiple bytes, just as it would if you were reading from a Unicode file.

    I tried almost exactly the same code as AnonyMonk, but got different results... leastwise I did last night! Today, I'm getting different results? I guess I just saw what I was expecting to see :(


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail

      I'm pretty sure that using a reference to an integer as the record separator is strictly a byte-oriented operation. At least the following still reads one byte at a time (though length reports 1 character, as expected):

      my $string = chr(400);
      print length($string), "\n";
      local $/ = \1;
      open FH, "<", \$string or die $!;
      while (my $byte = <FH>) {
          print "<$byte>\n";
      }
      close FH;

        That's almost exactly the code that I tried last night before posting, but I apparently saw what I wanted to see :(

        Seems like an error or omission in the Unicode support, but... davido++ got it right.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail

      I suspected that might be the case, but couldn't find the relevant documentation. perlvar states that setting $/ to a reference to an integer will cause file reads to read in no more than that number of bytes per iteration.

      I've re-scanned over: perlopentut, perllocale, perlport, perlunicode, and perluniintro. I know you're probably right, and that it's probably in there somewhere.

      So I guess what I'm saying is, which POD have I missed that discusses the effects of locales on the behavior of local $/ = \$integer; ?


      Dave


      "If I had my life to live over again, I'd be a plumber." -- Albert Einstein

        Sorry davido, I'm wrong and you are right.

        Despite my attempt to verify this before responding to your post, it seems I saw what I wanted to see instead of what was really there:(

        Personally, I think that this is an error, in that it means that a file containing 'fixed-length Unicode names' will have variable-length records, but that probably sounds like sour grapes, so I'll shut up now :)


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail