in reply to Can unpack add zero bytes before converting?

You can borrow the missing bytes (up to a total of 8) from the adjacent value and then discard them in post-processing by applying a mask. It looks like it's faster to stay "numeric" at all stages if possible. The "unpack-pad-pack-unpack" approach (tybalt89's solution) is somewhere in the middle in terms of performance. Not that you said speed is the goal, but anyway:

use strict;
use warnings;
no warnings 'portable';    # 64-bit hex literal and oct '0b...' beyond 32 bits
use Benchmark 'cmpthese';

my $bytes_per_value = 5;
my $count = 1_000;
my $value = 0xf_dead_beef_4;

# 'Q' is native-endian; keeping the first 5 bytes assumes a little-endian machine
my $bin_value = substr (pack ('Q', $value), 0, $bytes_per_value);
my $buffer = $bin_value x $count;

my $fmt = sprintf "(b%d)*", $bytes_per_value << 3;
my $pad_len = 8 - $bytes_per_value;
my $mask = ~ 0 >> $pad_len * 8;    # keeps the low $bytes_per_value bytes

cmpthese -1, {
    strings => sub {
        # bit strings come out LSB first, so reverse before converting
        my @values = map { oct '0b'.reverse ($_)} unpack ($fmt, $buffer);
        return \@values;
    },
    numbers => sub {
        # read 8 bytes, back up $pad_len, repeat; pad the tail with zero bytes
        my @values = unpack "(QX$pad_len)$count", $buffer . "\0" x $pad_len;
        $_ &= $mask for @values;
        return \@values;
    },
};

__END__
           Rate strings numbers
strings  2386/s      --    -77%
numbers 10274/s    331%      --
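To make the borrow-and-mask step easier to check by hand, here is a minimal sketch on a tiny buffer. The helper name `unpack_narrow` is hypothetical (not from the benchmark above), it uses `Q<` to force little-endian rather than relying on the native byte order, and it assumes a perl built with 64-bit integers and a value width of 1 to 7 bytes. The repeat count is derived from the buffer length instead of being hard-coded.

```perl
use strict;
use warnings;

# Hypothetical helper: decode packed little-endian unsigned integers of
# $width bytes each (1..7) from $buffer. Assumes 64-bit integer support.
sub unpack_narrow {
    my ($buffer, $width) = @_;
    my $pad   = 8 - $width;                       # bytes borrowed per value
    my $count = int(length($buffer) / $width);    # explicit repeat count
    my $mask  = ~0 >> ($pad * 8);                 # keeps the low $width bytes

    # Read 8 bytes, step back $pad bytes, repeat $count times. The trailing
    # zero bytes give the last value something to borrow from.
    my @values = unpack "(Q<X$pad)$count", $buffer . "\0" x $pad;

    # Throw away the borrowed high bytes.
    $_ &= $mask for @values;
    return @values;
}

# Three 3-byte values stored back to back: 0x030201, 0x060504, 0xFFFFFF
my @got = unpack_narrow("\x01\x02\x03\x04\x05\x06\xFF\xFF\xFF", 3);
printf "0x%06X\n", $_ for @got;
```

Before masking, the first value read is 0xFFFF060504030201: the top five bytes belong to its neighbours and are cleared by the mask.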

How does backing up several bytes work with unpack? (a bug?) (was: Re^2: Can unpack add zero bytes before converting?)
by vr (Curate) on Sep 13, 2021 at 20:40 UTC

    I had to explicitly include a count in the template, "(QX$pad_len)$count" instead of simply "(QX$pad_len)*", because otherwise the @values array would end up with a very puzzling length of 2501 items instead of 1000, i.e. $count.

    I suspect there's special code somewhere to prevent endless loops (a possibly undocumented case) with '(CX)*' or '(vX2)*' or similar, but still:

    say for unpack '(VX2)*', "\1\1\1\1";
    # why 2 items?

    say for unpack '(VX3)*', "\1\1\1\1";
    # 'X' outside of string in unpack
    # how's that?

    and why 2501 items with '(QX3)*' template to begin with?