in reply to Re: Pattern match not working sometimes
in thread Pattern match not working sometimes

"I think substr would be better for you case"

Actually, no. The OP stressed that he is dealing with octets, not characters. unpack could be a reasonable option, but probably not as clear as the regex unless the maintenance programmer is familiar with pack/unpack. The unpack code would (using the OP's sample data) look like:

my $stuff = '0000101001011100010101100100100101001111001111110010010001110110';
my $bytes = pack('B64', $stuff);
my ($prefix, $tail) = unpack('a4a*', $bytes);
print ">$prefix<\n>$tail<\n";

True laziness is hard work

Re^3: Pattern match not working sometimes
by bulk88 (Priest) on Mar 19, 2012 at 01:57 UTC
    The OP is not introducing Unicode or mentioning his locale anywhere in his code. The scalars coming from the OP's socket will have byte semantics. Why would any scalars be upgraded to Unicode in his code? The OP claims his length() return is the number of bytes in $datagram. $datagram isn't marked as UTF-8. He didn't say he is using -C.
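    For anyone who wants to check this rather than argue it, here is a minimal sketch. It assumes the datagram can be simulated by packing the sample bit string from the parent post (the real data would come from the OP's socket, which is not shown here), and it uses the core utf8::is_utf8 function to inspect the internal flag.

```perl
use strict;
use warnings;

# Assumption: stand-in for the OP's $datagram, built from the sample
# bit string in the parent post instead of being read from a socket.
my $datagram = pack('B64',
    '0000101001011100010101100100100101001111001111110010010001110110');

# utf8::is_utf8 reports whether the scalar's internal UTF-8 flag is set.
my $flagged = utf8::is_utf8($datagram) ? 1 : 0;

# With the flag off, length() counts octets.
my $len = length $datagram;

print "flagged=$flagged length=$len\n";
```

    If the flag is off and length() matches the expected octet count, nothing has upgraded the scalar behind your back.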

      unpack makes it fairly clear that the code is dealing with octet (byte) oriented data without needing any further context. substr implies string handling, with the possibility of UTF-8 or other encoding confusion. It's not that substr is flat out wrong in this context, just that it doesn't send as clear a message as unpack or the use of \C in a regex.
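      To make the comparison concrete, here is a rough sketch that splits the same sample data three ways. The variable names are only for illustration. Note that \C was removed from regexes in Perl 5.24, so the regex variant below uses a plain . with /s instead, which is an assumption that only holds while the scalar really is a byte string.

```perl
use strict;
use warnings;

# Sample data from the thread, packed into 8 raw octets.
my $bytes = pack('B64',
    '0000101001011100010101100100100101001111001111110010010001110110');

# unpack: the template itself says "4 octets, then the rest".
my ($u_prefix, $u_tail) = unpack('a4 a*', $bytes);

# substr: same result on a byte string, but it counts characters.
my $s_prefix = substr($bytes, 0, 4);
my $s_tail   = substr($bytes, 4);

# Regex: /s lets . match the newline octet (0x0A) at the start.
my ($r_prefix, $r_tail) = $bytes =~ /\A(.{4})(.*)\z/s;

# Hex dump so the non-printable octets are visible.
my $hex_prefix = unpack('H*', $u_prefix);
my $hex_tail   = unpack('H*', $u_tail);
print "prefix=$hex_prefix tail=$hex_tail\n";

my $agree = ($u_prefix eq $s_prefix && $u_prefix eq $r_prefix
          && $u_tail   eq $s_tail   && $u_tail   eq $r_tail) ? 1 : 0;
print "all three agree: $agree\n";
```

      On byte-semantics data all three give the same answer; the difference is only in what the code tells the next reader.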

      True laziness is hard work