Hello, kind monks--

I have a smallish Perl script that parses and generates binary streams (a select subset of OpenPGP/RFC 4880 packets). It needs to run cleanly under both Perl 5.8 and 5.10 for portability, and I've run into some confusion about how pack and unpack changed between the two. In particular, I occasionally need to pack and unpack raw 8-bit values (and to checksum them), and the Unicode transitions have left me confused.

I've explicitly set use bytes;, but I'm not convinced that this is enough to ensure that I don't get corrupted results when the script runs under unexpected locales or environments. I'm looking for guidance.

perldoc -f unpack references SYSV checksums in both versions, but 5.8 shows the algorithm as:

    $checksum = do {
        local $/;  # slurp!
        unpack("%32C*",<>) % 65535;
    };

while 5.10 shows it as:

    $checksum = do {
        local $/;  # slurp!
        unpack("%32W*",<>) % 65535;
    };

Is there a way to compute this portably without explicitly checking the version of Perl that is running? Can someone give me a concrete example of how it might break in 5.10 if I use "%32C*" instead of "%32W*"?
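For what it's worth, one way to sidestep the C-versus-W question might be to make sure the data is a plain byte string before checksumming, so that "C" sees the same octets under either version. This is only a minimal sketch. It assumes the documented 5.10 change (unpack works character by character on upgraded strings, where 5.8 walked the internal UTF-8 bytes), and octet_checksum is a hypothetical helper name, not something from perldoc:

```perl
use strict;
use warnings;

# Hypothetical helper: additive checksum over raw octets, using the same
# "%32C*" template and modulus as the perldoc example.
sub octet_checksum {
    my ($data) = @_;    # work on a copy; the caller's string is untouched

    # Force the copy into byte (non-UTF-8) internal form. With the second
    # argument true, utf8::downgrade returns false instead of dying when
    # the string holds a character above 0xFF.
    utf8::downgrade($data, 1)
        or die "checksum input contains characters above 0xFF\n";

    return unpack("%32C*", $data) % 65535;
}

my $plain    = "\xe9\x01\x02";
my $upgraded = $plain;
utf8::upgrade($upgraded);    # same characters, UTF-8 internal representation

print octet_checksum($plain),    "\n";
print octet_checksum($upgraded), "\n";
```

Both calls should print 236 (0xe9 + 1 + 2). As I understand it, without the downgrade a 5.8 perl would sum the two internal UTF-8 bytes of the upgraded 0xe9 instead, which is the kind of discrepancy that worries me.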

On a related note, when I'm packing and unpacking literal bytes (but not checksumming), I'm currently using "C". Should I be using something else? Do I need to do something explicit to the incoming/outgoing data to force it to be treated as a binary blob instead of as a Unicode string, even though I'm already declaring use bytes;?
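To make the question concrete, here is a sketch of the kind of round trip I mean: pack a few octets with "C", write them through a filehandle in binmode, read them back, and unpack. The tag value, the field layout, and the in-memory filehandle are illustrative assumptions, not my real packet code:

```perl
use strict;
use warnings;

# Illustrative blob: one tag octet, a 2-octet big-endian length, raw body.
my $body = "\x00\xff\x80";
my $blob = pack("Cna*", 0x99, length($body), $body);

# Write through a filehandle with binmode set, so no I/O layer translates
# the octets on the way out. An in-memory file stands in for a real one.
my $buffer = '';
open(my $out, '>', \$buffer) or die "cannot open buffer for writing: $!";
binmode($out);
print {$out} $blob;
close($out) or die "cannot close buffer: $!";

# Read it back the same way and take the fields apart again.
open(my $in, '<', \$buffer) or die "cannot open buffer for reading: $!";
binmode($in);
my $read = do { local $/; <$in> };
close($in);

my ($tag, $len, $rest) = unpack("Cna*", $read);
printf "tag=0x%02x len=%d body=%s\n", $tag, $len, unpack("H*", $rest);
```

If binmode plus "C" on byte strings like this is already enough under both 5.8 and 5.10, then perhaps use bytes; is not buying me anything here.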

In my research for this, I came across a post that leaves me worried about unexpected behavior from the rest of the archive. I confess I don't understand the issues well enough to follow that post, or to know what the Right Thing to do is for code that deals with raw binary data and needs to run correctly under both 5.8 and 5.10.

Any advice or pointers to specific reading would be most appreciated.


In reply to Understanding pack and unpack changes for binary data between 5.8 and 5.10 by dkg
