BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to make use of the pack/unpack pattern 'C/a*' but am encountering several sorts of weirdness.

If I do

my $packed = pack 'C/a*', 'the quick brown fox'; print "'$packed'";

I get '?the quick brown fox'. The string, prefixed with a 1-byte length.

And when I do

print unpack('C/a*', $packed);

I get the quick brown fox. The original string minus the length prefix. Great!

However, when I do

my $data = unpack 'C/a*', $packed; print "'$data'";

I get 19, the length of the string? Ah, context! Scalar versus list. So I tried:

my @data = unpack 'C/a*', $packed; print scalar @data, ':', join '|', @data;

and got 1:the quick brown fox. The array has one element, and it's the string, not the length?

So then I tried

my ($len, $val) = unpack 'C/a*', $packed; print "$len:$val;";

and got

Use of uninitialized value in concatenation (.) or string at C:\test\test.pl line 16.
the quick brown fox:;

So, unpack with a template of 'C/a*' returns the length (not the string) in scalar context, and a one-element list consisting of just the string (not the length) in list context.

So to get both pieces of information I have to call unpack twice?

my $len = unpack 'C/a*', $packed; my ($data) = unpack 'C/a*', $packed;

Or just do the latter and use length.
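
A minimal sketch of that second option, assuming a perl where list-context unpack with 'C/a*' returns only the string (as observed above). The variable names simply follow the snippets in this post:

```perl
use strict;
use warnings;

my $packed = pack 'C/a*', 'the quick brown fox';

# List context: unpack returns just the string, without the count byte.
my ($data) = unpack 'C/a*', $packed;

# Recover the length from the string itself rather than from the prefix.
my $len = length $data;

print "$len:$data\n";
```

This avoids depending on what scalar context returns, which is the part that appears to vary.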

Conclusion: This is, at the very least, unintuitive, non-DWIM, perverse and undocumented, and quite possibly a bug?

Worthy of a perlbug?


Examine what is said, not who speaks.

The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: unpack 'C/a*' and context weirdness
by jmcnamara (Monsignor) on Feb 14, 2003 at 22:16 UTC

    This does seem a little strange and the behaviour seems to have changed between perl 5.6.1 and 5.8.0. So maybe it was a bug that was fixed.

    With perl 5.6.1, I get:

    $ perl -le 'print scalar unpack "C/a*", pack "C/a*", "Test"'
    4

    With perl 5.8.0, I get:

    $ perl -le 'print scalar unpack "C/a*", pack "C/a*", "Test"'
    Test

    Update: The pack documentation for both versions states "The length-item is not returned explicitly from unpack." So it looks like the 5.6.1 version is wrong.

    --
    John.

      Thanks for tracking that down. Thanks also to jsprat and robartes.

      Another reason to move up to 5.8, I guess. It's just the pain of re-installing everything from the 5.6.1 site/lib dir to the 5.8 dirs. Oh well, sooner I start, sooner it's done.... tomorrow. I'll do it tomorrow:)



        Use the wonder of the CPAN module to help you when upgrading. You can run it under 5.6.1 and use the 'autobundle' command to make a bundle file that describes all the modules you have installed. Then install 5.8 and use CPAN again to install the bundle, and it will download and install all the modules you had in 5.6.1.

Re: unpack 'C/a*' and context weirdness
by jsprat (Curate) on Feb 14, 2003 at 22:08 UTC
    The first byte of $packed is the length, and the rest is the string. Just as if you had initialized $packed like so:
    my $s = 'the quick brown fox'; $packed = pack 'Ca*', length $s, $s;

    To unpack it, leave out the '/'

    my ($len, $val) = unpack 'Ca*', $packed;
    print "$len:$val";
    __END__
    output: 19:the quick brown fox

    If it is a bug, it's a documentation bug; the meaning is very unclear. Using the '/' is only documented in pack, not in unpack, but it behaves differently in unpack than in pack. With 'C/a*' as the template, unpack returns that many bytes. I.e., whatever 'C/' represents will be how long a string is returned. Change it to 5 when you unpack it and you'll just get "the q".

    With pack, it will prepend the length of the item packed.
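
A minimal sketch of the points above: that pack 'C/a*' yields the same bytes as packing the length by hand, that 'Ca*' returns the count as an ordinary value, and that changing the count byte changes how much unpack 'C/a*' returns. The names $with_slash, $by_hand, $short and $first5 are made up for illustration:

```perl
use strict;
use warnings;

my $s = 'the quick brown fox';

# pack with 'C/a*' should produce the same bytes as packing the length by hand.
my $with_slash = pack 'C/a*', $s;
my $by_hand    = pack 'Ca*', length($s), $s;
print "same bytes\n" if $with_slash eq $by_hand;

# Without the '/', unpack treats the count byte as an ordinary value.
my ($len, $val) = unpack 'Ca*', $with_slash;
print "$len:$val\n";

# Overwrite the count byte with 5; 'C/a*' then returns only five bytes.
my $short = $with_slash;
substr($short, 0, 1) = chr 5;
my ($first5) = unpack 'C/a*', $short;
print "'$first5'\n";
```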

    Update: I missed the point about scalar versus list context. That is weird - probably a bug! From the pack docs:

    The *length-item* is not returned explicitly from "unpack".
    In scalar context, the length-item is returned (using AS 5.6.1 build 633).
Re: unpack 'C/a*' and context weirdness
by robartes (Priest) on Feb 14, 2003 at 22:27 UTC
    This seems to be version specific, indicating that indeed, there was a bug / misfeature somewhere. This is what I get:
    #!/usr/local/bin/perl -w
    use strict;
    use Data::Dumper;
    my $string="the quick brown fox";
    my $packed=pack 'C/a*', $string;
    my $unpacked= unpack 'C/a*', $packed;
    print "'$unpacked'\n";
    my @data= unpack 'C/a*', $packed;
    print Dumper(\@data);
    my ($len,$val)=unpack 'C/a*', $packed;
    print "$len:$val\n";
    __END__
    'the quick brown fox'
    $VAR1 = [
              'the quick brown fox'
            ];
    Use of uninitialized value in concatenation (.) or string at ./packweird.pl line 11.
    the quick brown fox:
    List context gets me just the text, as does scalar context. My perl is:
    Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
      Platform:
        osname=linux, osvers=2.4.18-bv1, archname=i686-linux
        uname='linux quasimod 2.4.18-bv1 #2 tue jul 2 16:22:51 cest 2002 i686 unknown '

    CU
    Robartes-

Re: unpack 'C/a*' and context weirdness
by bart (Canon) on Feb 15, 2003 at 01:36 UTC
    So to get both pieces of information I have to call unpack twice?
    my $len = unpack 'C/a*', $packed; my ($data) = unpack 'C/a*', $packed;
    No... you might have missed that spot in the docs that says "back up a byte". I will use it to effectively unpack the same byte twice, with one unpack() and one template.
    print join ":", unpack 'CXC/a*', $packed;
    resulting in
    19:the quick brown fox
    Another option, though less user-friendly, would be to use the '@' template, which you can use to reposition the current pointer — think of it as a form of seek().
    print join ":", unpack 'C@0C/a*', $packed;
    the "0" being the absolute position — here, the start of the string. Actually, it turns out that if you omit it, zero is used as a default.

    The user-unfriendliness is in the fact that you need to know the exact absolute position to seek to, not a relative one, which makes it impractical to incorporate in the middle of a larger template. Well, that's what the "X" is for, if you want to go backwards, and "x", to go forward.

    print join ":", unpack 'C@C/a*X15a5x7a*', $packed;
    which results in:
    19:the quick brown fox:quick:fox
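
A minimal sketch of why the relative "X" composes better than an absolute "@": the same 'CXC/a*' group can be repeated for consecutive length-prefixed fields. The two-field record ($two, 'foo', 'quux') is a made-up example, and the "@" form is written with an explicit 0 rather than relying on a default:

```perl
use strict;
use warnings;

my $packed = pack 'C/a*', 'the quick brown fox';

# Read the count byte, back up one byte, then read it again as the length item.
my ($len, $val) = unpack 'CXC/a*', $packed;
print "$len:$val\n";

# Same idea with an absolute seek back to offset 0.
my ($len2, $val2) = unpack 'C@0C/a*', $packed;
print "$len2:$val2\n";

# Two length-prefixed fields back to back: the relative X works for both,
# with no need to know where the second field starts.
my $two    = pack 'C/a* C/a*', 'foo', 'quux';
my @fields = unpack 'CXC/a* CXC/a*', $two;
print join(':', @fields), "\n";
```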

      Yep! I had missed that. The 'CXC/a*' version serves my purpose perfectly. Thanks.



Re: unpack 'C/a*' and context weirdness
by diotalevi (Canon) on Feb 15, 2003 at 14:07 UTC

    You'll love this feature even less when you try the other pack formats - I recall that the single-byte length formats work and some of the double-byte formats fail.


    Seeking Green geeks in Minnesota