in reply to Re: How do I safely, portably extract one or more bytes from a string?
in thread How do I safely, portably extract one or more bytes from a string?

That won't work. If the string contains byte sequences that look like Unicode characters, then reading 1 character will return multiple bytes, just as it would if you were reading from a Unicode file.

I tried almost exactly the same code as AnonyMonk, but got different results... leastwise I did last night! Today, I'm getting the same results as AnonyMonk. I guess I just saw what I was expecting to see :(


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

Re: Re: Re: How do I safely, portably extract one or more bytes from a string?
by Anonymous Monk on Nov 29, 2003 at 06:24 UTC

    I'm pretty sure that using a reference to an integer as the record separator is strictly a byte-oriented operation. At least the following still reads one byte at a time (though length reports 1 character, as expected):

    my $string = chr(400);
    print length($string), "\n";
    local $/ = \1;
    open FH, "<", \$string or die $!;
    while (my $byte = <FH>) {
        print "<$byte>\n";
    }
    close FH;
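    For comparison, here is a minimal sketch that makes the byte view explicit rather than relying on how the in-memory handle treats the string's internal representation. It assumes UTF-8 is the encoding you want the bytes of; bytes_of is a hypothetical helper name, not something from the thread or from any module.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the UTF-8 octets of a character string,
# read one at a time through a fixed-size record separator.
sub bytes_of {
    my ($string) = @_;
    my $octets = encode('UTF-8', $string);   # explicit byte string (assumed encoding)
    my @bytes;
    local $/ = \1;                            # records of size 1
    open my $fh, '<', \$octets or die "Cannot open in-memory handle: $!";
    while (defined(my $byte = <$fh>)) {
        push @bytes, ord $byte;               # numeric value of each octet
    }
    close $fh;
    return @bytes;
}

my $string = chr(400);
my @bytes  = bytes_of($string);
printf "%d character, %d bytes: %s\n",
    length($string), scalar(@bytes),
    join ' ', map { sprintf '%02X', $_ } @bytes;
```

    Encoding first means the handle only ever sees octets, so the one-record-per-byte behaviour does not depend on whether the original string happened to be stored as UTF-8 internally.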

      That's almost exactly the code I tried last night before posting, but I apparently saw what I wanted to see :(

      Seems like an error or omission in the Unicode support, but... davido++ got it right.



Re: Re: Re: How do I safely, portably extract one or more bytes from a string?
by davido (Cardinal) on Nov 29, 2003 at 04:46 UTC
    I suspected that might be the case, but couldn't find the relevant documentation. perlvar states that setting $/ to a reference to an integer will cause file reads to read in no more than that number of bytes per iteration.

    I've re-scanned perlopentut, perllocale, perlport, perlunicode, and perluniintro. I know you're probably right, and that it's probably in there somewhere.

    So I guess what I'm saying is, which POD have I missed that discusses the effects of locales on the behavior of local $/ = \$integer; ?


    Dave


    "If I had my life to live over again, I'd be a plumber." -- Albert Einstein

      Sorry davido, I'm wrong and you are right.

      Despite my attempt to verify this before responding to your post, it seems I saw what I wanted to see instead of what was really there:(

      Personally, I think that this is an error, in that it means a file of fixed-length Unicode names (fixed in characters) will have variable-length records (in bytes). But that probably sounds like sour grapes, so I'll shut up now :)
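      To illustrate the fixed-length-names point, here is a rough sketch of one way around it: decode the bytes first, then cut records by character count. It assumes the data is UTF-8 and that every record is the same number of characters wide; split_records and the two sample names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: split UTF-8 octets into records of $width characters.
sub split_records {
    my ($octets, $width) = @_;
    my $chars = decode('UTF-8', $octets);    # octets -> character string
    my @records;
    for (my $pos = 0; $pos < length $chars; $pos += $width) {
        push @records, substr($chars, $pos, $width);   # counts characters, not bytes
    }
    return @records;
}

# Two made-up names, each padded to 4 characters; each contains one
# character that needs two bytes in UTF-8.
my $octets = encode('UTF-8', "Zo\x{EB} " . "\x{0141}uk ");
my @names  = split_records($octets, 4);

printf "%d bytes on disk, %d records of %d characters each\n",
    length($octets), scalar(@names), length $names[0];
```

      Reading such a file with a byte-sized record length would cut through the middle of a multi-byte character, which is why the split here happens only after decoding.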

