in reply to Re^2: Seeking with 'x' in unpack and out of bounds reads
in thread Seeking with 'x' in unpack and out of bounds reads

You could use the pattern "x4 (NN X8 N x4 /a N)*". Instead of skipping the length to get it later, you would fetch it twice and use it once. And at least in that case, the pattern doesn't fail on an x: since the x4 in the parentheses skips the bytes already read by the second N, you know that there is something to skip.

Though actually, the fact that this fails is a good thing, because it tells you that, for some reason, there are still some bytes (between 1 and 3) after the last chunk: enough to let x skip at least one byte, but not four in a row. I.e., your data is invalid. Try unpack "H*", pack "H*", <DATA>; and look at the end of the result. It looks like pack isn't very smart with the \n at the end of the string.
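
To see the newline effect in isolation, here is a minimal sketch. The hex string is made up for illustration (the first four bytes of the PNG signature), and $line is a hypothetical stand-in for what <DATA> would return. As far as I understand the pack documentation, a non-hex character contributes its low four bits and an odd-length input is padded with a zero nibble, which is where the stray byte comes from.

```perl
use strict;
use warnings;

# Hypothetical input line, as <DATA> would return it: hex digits plus "\n".
my $line = "89504e47\n";

# pack "H*" does not reject the newline: "\n" is 0x0a, so it contributes
# the nibble 0xa, and the odd-length input is padded with a zero nibble.
my $with_newline = unpack "H*", pack( "H*", $line );

# Removing the newline first gives a clean round trip.
chomp( my $clean = $line );
my $without_newline = unpack "H*", pack( "H*", $clean );

print "$with_newline\n";       # 89504e47a0
print "$without_newline\n";    # 89504e47
```

So a chomp before the pack should be enough to get rid of the extra byte.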

Re^4: Seeking with 'x' in unpack and out of bounds reads
by vr (Curate) on Apr 27, 2018 at 15:36 UTC

    If a rogue chunk is e.g. 7 bytes long, then unpacking with the proposed template will die on "X", so wrapping it in eval is required anyway if the data are unreliable.
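
    A minimal sketch of the eval wrapping, under a few assumptions: it uses the 'a8 (...)*' template from the side-note in this post rather than the one proposed above, the input is made up (an 8-byte signature, one well-formed chunk with a faked CRC, then 2 stray bytes), and with this particular data it is "x4" that runs off the end, not "X".

    ```perl
    use strict;
    use warnings;

    # Hypothetical input: signature, one chunk (length, type, data, CRC),
    # then 2 stray bytes. The CRC is faked as 0.
    my $input = "\x89PNG\r\n\x1a\n"
              . pack( 'N a4 a* N', 3, 'tEXt', 'abc', 0 )
              . "\x00\x00";

    # unpack dies on the second pass through the group, when x4 finds
    # only 2 bytes left; eval turns that into a value in $@.
    my @fields = eval { unpack 'a8 (x4 A4 X8 N x4 /a x4)*', $input };
    my $error  = $@;

    print "unpack failed: $error" if $error;
    ```

    Note that nothing from the well-formed chunk is returned either: the die discards the whole result.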

    However, 'x' doesn't have a direct effect on the output, so a partial match involving 'x' must be communicated through an error.

    Looks to me like an attempt to whitewash Perl's inconsistent behaviour :-). By similar reasoning, a failure to unpack e.g. Pascal strings (as in "unpack 'C/a', qq(\03ab)") should be fatal, I think.
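
    The inconsistency can be shown side by side. This is only a sketch: the variable names are made up, and the second template ('C x3') is my own illustration, not something from the thread.

    ```perl
    use strict;
    use warnings;

    # The length byte says 3, but only 2 bytes of payload follow.
    my $short = qq(\03ab);

    # No error: 'a' quietly returns the 2 bytes that are there.
    my ($str) = unpack 'C/a', $short;
    print "got '$str' (", length($str), " bytes)\n";

    # 'x' asked to skip 3 bytes over the same payload dies instead.
    my $ok = eval { my @r = unpack 'C x3', $short; 1 };
    my $error = $@;
    print "x died: $error" unless $ok;
    ```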

    Side-note: PNG tags were made human-readable for a good reason, so perhaps "A4" instead of "N" (or "L") will serve better. E.g., if data are super-reliable (CRC sums to be ignored), then chunks can be read into a hash:

    my ( $head, %chunks ) = unpack 'a8 (x4 A4 X8 N x4 /a x4)*', $input;
    say for keys %chunks;
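
    For a self-contained run, here is a sketch with made-up input. The chunk() helper is hypothetical, the chunk payloads are placeholder strings rather than real PNG data, and the CRC is faked as 0 (it is skipped by the template anyway).

    ```perl
    use strict;
    use warnings;
    use feature 'say';

    # Hypothetical helper: 4-byte length, 4-byte type, data, 4-byte CRC.
    sub chunk {
        my ( $type, $data ) = @_;
        return pack 'N a4 a* N', length($data), $type, $data, 0;
    }

    my $input = "\x89PNG\r\n\x1a\n"
              . chunk( IHDR => 'header bytes' )
              . chunk( IDAT => 'pixel bytes' )
              . chunk( IEND => '' );

    # Same template as above: the type becomes the key, the data the value.
    my ( $head, %chunks ) = unpack 'a8 (x4 A4 X8 N x4 /a x4)*', $input;

    say "$_ => ", length( $chunks{$_} ), " bytes" for sort keys %chunks;
    ```

    Since this is a hash, a file with several chunks of the same type (e.g. more than one IDAT) keeps only the last one.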

      Yes, you're right. About everything ;). I should maybe have made it clearer that I was stating my understanding of x's behaviour, rather than an absolute truth :).

      For the X8 error, @0 might work better. But at that point, if you can't at least rely on the data to have the correct length, it might be easier to just read chunk by chunk, maybe even dividing each chunk into several unpacks.
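
      A rough sketch of the chunk-by-chunk approach, with each chunk split into two unpacks. Everything here is an assumption for illustration: read_chunks() is a hypothetical helper, the input is made up (one chunk with a faked CRC followed by 2 stray bytes), and an 8-byte signature is assumed.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: walk the chunks one at a time, checking the
      # remaining length before each unpack so nothing runs past the end.
      sub read_chunks {
          my ($input) = @_;
          my $total   = length $input;
          my $pos     = 8;                 # assumes an 8-byte signature
          my @chunks;
          while ( $pos < $total ) {
              # Fixed part: 4-byte length + 4-byte type.
              last if $total - $pos < 8;
              my ( $len, $type ) = unpack 'N A4', substr( $input, $pos, 8 );
              # Variable part: data + 4-byte CRC.
              last if $total - $pos - 8 < $len + 4;
              my ( $data, $crc ) =
                  unpack "a$len N", substr( $input, $pos + 8, $len + 4 );
              push @chunks, [ $type, $data, $crc ];
              $pos += 12 + $len;
          }
          return ( \@chunks, $total - $pos );    # chunks, leftover bytes
      }

      my $input = "\x89PNG\r\n\x1a\n"
                . pack( 'N a4 a* N', 3, 'tEXt', 'abc', 0 )
                . "\x00\x00";
      my ( $chunks, $leftover ) = read_chunks($input);
      printf "%d chunk(s), %d leftover byte(s)\n", scalar @$chunks, $leftover;
      ```

      This way the good chunks are kept, and the leftover count tells you whether the data was truncated or padded, without needing eval.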