Veltro has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am playing around with my NAS and have a qpkg package that I want to split on my Windows machine. The package is too large to post here, but roughly it consists of:

...sh script... something.tar.gzNULNULNUL...bunch tarball here....7z¼¯' ...bunch of 7z compressed material here...NULNULBS=MIME-Version: 1.0 ... base64 encoded material here...

The sh script has lines:

    script_len=3265
    /bin/dd if="${0}" bs=$script_len skip=1 | /bin/tar -xO | /bin/tar -xzv -C $_EXTRACT_DIR || exit 1
    offset=$(/usr/bin/expr $script_len + 20480)
    /bin/dd if="${0}" bs=$offset skip=1 | /bin/cat | /bin/dd bs=1024 count=26 of=$_EXTRACT_DIR/data.tar.7z || exit 1
    [ -f /usr/local/bin/python ] && /usr/local/bin/python -c "with open('$_EXTRACT_DIR/data.tar.7z', 'rw+') as f: f.seek(26585); f.truncate()"
    offset=$(/usr/bin/expr $offset + 26585)

From that I calculated the following start positions:

0, 3265, 3265+20480, 3265+20480+26585
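As a quick check of that arithmetic, here is a minimal sketch that turns the section sizes from the sh script into absolute start positions. The names `@sizes` and `@starts` are made up for this illustration and do not appear in the package.

```perl
use strict;
use warnings;

# Section sizes taken from the sh script quoted above (assumed to be in bytes).
my @sizes = (3265, 20480, 26585);

# Running total: each section starts where the previous one ends.
my @starts = (0);
push @starts, $starts[-1] + $_ for @sizes;

print join(', ', @starts), "\n";
```

The last value is where the base64 part would begin, assuming the three sizes are exact.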

So I wrote a quick script to extract these pieces, but substr is giving me trouble: most of the time it seems to output too many characters, even though the start positions seem to be OK. I don't understand why. Any ideas?

    #!/usr/bin/perl
    use strict ;
    use warnings ;
    use File::Basename ;

    # binmode(STDOUT, ":utf8");

    my $fn = $ARGV[0] ;
    ( my $fns, my $fp, my $fs ) = fileparse( $fn, qr/\.[^.]*/ ) ;
    # print "fns = $fns, fp = $fp, fs = $fs\n" ;

    my $fc ;
    {
        local $/ ; # slurp
        # ? '<:encoding(utf8)'
        # open (my $fh, '<:encoding(utf8)', $fn ) or die "Failed to open file $fn with error: $!\n" ;
        open (my $fh, '<', $fn ) or die "Failed to open file $fn with error: $!\n" ;
        binmode $fh ; # Edit suggestion Anonymous Monk on Feb 14, 2021 at 21:31 UTC
        $fc = <$fh> ;
        close $fh ;
    }

    my $number = 'x' ;
    my @numbers ;
    while ( $number ne '' ) {
        print "Type number where to split: " ;
        $number = <STDIN> ;
        chomp $number ;
        push( @numbers, $number ) unless $number eq '' ;
    }

    my $start = 0 ;
    for my $i (0 .. $#numbers) {
        print "$i: $numbers[$i]\n" ;
        my $fn2 = $fp . $fns . "." . ($i+1) ;
        open (my $fh, ">", $fn2 ) or die "Failed to open file $fn2\n" ;
        binmode $fh ; # Edit suggestion Anonymous Monk on Feb 14, 2021 at 21:31 UTC
        print $fh substr($fc, $start, $numbers[$i]) ;
        $start = $numbers[$i] ;
        close $fh ;
    }

I have tried setting the file input and STDOUT to utf8 (the commented-out lines), but that makes things worse.

Re: Mixed script and binary problems to extract with substr
by AnomalousMonk (Archbishop) on Feb 15, 2021 at 00:21 UTC
    print $fh substr($fc, $start, $numbers[$i]) ;

    The @numbers array seems to be a sequence of start positions (offsets), but substr has the syntax
        substr EXPR,OFFSET,LENGTH
    so perhaps
        print $fh substr($fc, $start, $numbers[$i]) ;
    needs to be (untested)
        print $fh substr($fc, $start, $numbers[$i] - $start) ;
    in the posted loop code.

    (This assumes you've got your file read-writes properly binmode-ed now.)
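To make the OFFSET/LENGTH point concrete, here is a small sketch on a toy string rather than the real package. The string `$data` and the offsets in `@ends` are hypothetical stand-ins for the slurped file and the numbers typed at the prompt.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the slurped file contents.
my $data = 'AAABBBBBCC';
# End offsets of each piece, as they would be typed at the prompt.
my @ends = (3, 8, 10);

my @pieces;
my $start = 0;
for my $end (@ends) {
    # The third argument of substr is a LENGTH, so subtract the start offset.
    push @pieces, substr($data, $start, $end - $start);
    $start = $end;
}

print join('|', @pieces), "\n";   # AAA|BBBBB|CC
```

With the original third argument, the second piece would be `substr($data, 3, 8)`, which returns `BBBBBCC`: everything from offset 3 up to the end of the string. That matches the "too many characters" symptom, and it also explains why the first piece (where the start is 0) comes out right.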

    Update: Minor changes (update: and a link added) for clarity.


    Give a man a fish:  <%-{-{-{-<

      Thank you (and thanks to the Anonymous Monk as well). I was so focused on the binary that I did not think of checking the arguments of substr. It works fine now.

Re: Mixed script and binary problems to extract with substr
by Anonymous Monk on Feb 14, 2021 at 21:31 UTC
    For binary files on Windows, binmode the filehandle (NOT utf8).
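A minimal sketch of what that advice protects against: a byte string is written to a temporary file and read back, with binmode on both handles. The sample bytes are invented for the illustration. On Windows, text mode may translate line endings, which would shift every byte offset after the first newline; on Unix-like systems binmode makes no difference here, so the round trip succeeds either way.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Bytes that text mode on Windows could alter: CRLF, a lone LF, Ctrl-Z, a high byte.
my $bytes = "a\r\nb\nc\x1Ad\xFF";

my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                    # write raw bytes, no LF -> CRLF translation
print {$out} $bytes;
close $out or die "close failed: $!";

open my $in, '<', $path or die "Failed to open $path: $!";
binmode $in;                     # read raw bytes, no CRLF -> LF translation
my $read = do { local $/; <$in> };
close $in;

print length($read) == length($bytes) ? "sizes match\n" : "sizes differ\n";
```

A `:encoding(utf8)` layer is the wrong tool for this kind of file because it decodes bytes into characters, so `length` and `substr` would no longer count bytes.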