Re: perplexing inconsistency using RC4 and unpack

Re^2: perplexing inconsistency using RC4 and unpack by oddmedley (Novice) on Aug 01, 2009 at 03:27 UTC
difficult to show a small sample really. on further investigation it seems unpack is not doing what I'd expect it to. the message substr:

637573746F6D65726E616D6528BF4E5E4E758A4164004E56FFFA01082E2E00B6

is correctly unchanged all the way until we unpack it:

`my $ms = $ml < $MAX_CHUNK_SIZE ? $ml : $MAX_CHUNK_SIZE;
for my $piece ( 0..$num_pieces - 1 ) {
    my $ss = substr($message, $piece * $MAX_CHUNK_SIZE, $ms); # fine up to here...
    my $ssl = length $ss;
    my @message = unpack( "C*", $ss ); ### at this point the characters are changed... we go from 32 to 37 characters (in the array). why?
    # i've changed the code in the RC4 package here slightly to allow testing.
    # it produces identical results in all cases considered here.`

then it becomes (pack'ing it again and converting to hex for display):

637573746F6D65726E616D6528C2BF4E5E4E75C28A4164004E56C3BFC3BA01082E2E00C2B6

looking at these 2 strings you can see some bytes are being inserted:

637573746F6D65726E616D6528 BF4E5E4E75 8A4164004E56 FFFA 0108 2E2E00 B6 #correct
637573746F6D65726E616D6528 C2 BF4E5E4E75 C2 8A4164004E56 C3BFC3BA 0108 2E2E00 C2 B6 #faulty

so unpack is somehow inserting those C2 bytes (Â in Latin-1) and changing FFFA (2 bytes) to C3BFC3BA (4 bytes). what could be making unpack behave in this way?
Re^3: perplexing inconsistency using RC4 and unpack (UTF-8) by tye (Sage) on Aug 01, 2009 at 05:44 UTC
You are suffering from UTF-8 expansion:

`#!/usr/bin/perl -wl
print unpack "U0H*", "\x{BF}";
print unpack "U0H*", "\x{FF}";
print unpack "U0H*", "\x{FA}";
__END__
c2bf
c3bf
c3ba`

Now you just need to figure out where the UTF-8 expansion is sneaking in.

- tye
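One common way the expansion can sneak in is concatenation: joining a byte string with a string that carries perl's internal UTF-8 flag upgrades the result. The following is a minimal sketch of that effect, not the original poster's code. The variable names and the sample data are assumptions made for illustration, and `utf8::upgrade` is used only to simulate a string that arrived flagged from elsewhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: four octets taken from the message in this thread.
my $bytes = "\xBF\x4E\xFF\xFA";

# Simulate a value that arrived with perl's internal UTF-8 flag set,
# even though it contains only ASCII characters.
my $flagged = "customername";
utf8::upgrade($flagged);

# Concatenation upgrades the whole result. It still has 16 characters,
# but it is now stored internally as UTF-8.
my $joined = $flagged . $bytes;
printf "flagged: %s, characters: %d\n",
    ( utf8::is_utf8($joined) ? 'yes' : 'no' ), length $joined;

# Make the internal UTF-8 form visible as octets.
my $internal = $joined;
utf8::encode($internal);
print unpack( "H*", $internal ), "\n";
# prints 637573746f6d65726e616d65c2bf4ec3bfc3ba
```

Checking suspect values with `utf8::is_utf8` at each step of the pipeline is one way to narrow down where the flag first appears.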
Re^4: perplexing inconsistency using RC4 and unpack (UTF-8) by oddmedley (Novice) on Aug 01, 2009 at 14:44 UTC
thank you tye, that was extremely helpful. seems I had an issue with a blank string being converted to a hash somewhere in an XML parsing phase, which was later cast to a string (which seemed to make perl guess it was a UTF-8 one). several concatenations later, what looked like a nice plain ASCII string actually wasn't. the solution? specify my input more rigorously:

`if($options->{'customerpass'} && (ref $options->{'customerpass'} ne "HASH")){
    $options->{'customerpass'} = encode("iso-8859-1", $options->{'customerpass'});
}else{
    $options->{'customerpass'} = '';
}`

leaves nothing to guesswork and fixed my problem. (this call to encode ( use Encode; ) basically says (I think), 'whatever it looks like, this string is Latin-1, end of story.' no more UTF-8 expansion!) thanks for the help!
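The same guard can be sketched as a small helper that is applied to each input before it reaches `unpack`. This is a sketch under assumptions, not the poster's actual code: `to_latin1_bytes` is a hypothetical name, and `utf8::upgrade` only simulates a value that picked up the UTF-8 flag during parsing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the value as Latin-1 octets, or '' when
# the value is false or a hash reference (e.g. an empty XML element).
sub to_latin1_bytes {
    my ($value) = @_;
    return '' if !$value || ref($value) eq 'HASH';
    return encode( 'iso-8859-1', $value );
}

# Simulate a password that became flagged as UTF-8 along the way.
my $pass = "p\xFFss";
utf8::upgrade($pass);

my $clean  = to_latin1_bytes($pass);
my @octets = unpack 'C*', $clean;

# One octet per character again, with no C3 prefix before FF.
printf "%d octets: %s\n", scalar @octets, unpack( 'H*', $clean );
```

One caveat: with the default settings, `encode` substitutes a replacement character for anything that has no Latin-1 equivalent, so this only suits data that is known to fit in Latin-1.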
Re^3: perplexing inconsistency using RC4 and unpack by Anonymous Monk on Aug 01, 2009 at 03:37 UTC
I don't really follow, but why not try unpack "H*" without the "C*" step?
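For a plain byte string, the two routes give the same hex view, as the sketch below illustrates. The sample octets are an assumption taken from the message earlier in the thread. Skipping the "C*" step changes how the data is displayed, not what the string contains.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample: four octets from the message in this thread.
my $bytes = "\x28\xBF\x4E\x5E";

# Route 1: hex straight from the string.
my $direct = unpack 'H*', $bytes;

# Route 2: octet values first, then formatted as hex.
my $via_list = join '', map { sprintf '%02x', $_ } unpack 'C*', $bytes;

print "$direct\n$via_list\n";
```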