in reply to Re^5: pack/unpack binary editing
in thread pack/unpack binary editing

Sorry for not getting back sooner. I was out of the office and was doing everything from memory. Here is the code I'm using and for a 1 gig file this takes a few hours. I forgot that I have to change every 2-bit pair to a corresponding 2-bit sequence. I'm also only looking for the sync pattern once, at its first occurrence. I was hoping to do this without unpacking and later repacking the data. That's the biggest drawback.
die "USAGE: $0 input\n" if scalar(@ARGV) < 1;
$| = 1;
$/ = \960;
open IN, "$ARGV[0]" or die "$ARGV[0] : $!";
open OUT, ">tmp2";
my $y = 0;
my $value = '';
while (<IN>) {
    my $bits = unpack("b*", $_);
    @array = split(//, $bits);
    foreach $value (@array) {
        $y++;
        $tmp = $tmp . $value;
        if ($y == 2) {
            if ($tmp =~ /00/) { $tmp = '11'; }
            elsif ($tmp =~ /11/) { $tmp = '00'; }
            print OUT $tmp;
            $y = 0;
            $tmp = '';
        }
    }
}
close IN;
close OUT;

$/ = \7680;
my $x = 0;
#I LEFT THE OUTPUT AS ASCII-IZED BINARY AT THIS POINT
#THUS THE LARGE INCREASE IN FILE SIZE
open IN, "tmp2";
open OUT, ">tmp3";
while (<IN>) {
    $_ =~ s/^.*(11111011000010001111011100010000)/$1/ if $x == 0;
    $x = 1;
    print OUT pack("b*", $_);
}
close IN;
close OUT;

10-02-2005 Janitored by Arunbear - added code tags, as per Monastery guidelines

Re^7: pack/unpack binary editing
by BrowserUk (Patriarch) on Feb 10, 2005 at 14:10 UTC
    Here is the code I'm using and for a 1 gig file this takes a few hours.

    Pardon me for saying so, but I am not surprised.

    Not only are you unpacking the data to asciized binary, you then go on to split that into an array of digits.

    You then use a loop to walk that array one character at a time, collecting pairs so that you can replace '00' with '11' and '11' with '00'.

    All this could be done with a couple of regexes.

    Replacing this:

    while (<IN>) {
        my $bits = unpack("b*", $_);
        @array = split(//, $bits);
        foreach $value (@array) {
            $y++;
            $tmp = $tmp . $value;
            if ($y == 2) {
                if ($tmp =~ /00/) { $tmp = '11'; }
                elsif ($tmp =~ /11/) { $tmp = '00'; }
                print OUT $tmp;
                $y = 0;
                $tmp = '';
            }
        }
    }

    with this:

    while (<IN>) {
        my $bits = unpack("b*", $_);
        ## Consume the string two bits at a time so only aligned pairs are swapped
        $bits =~ s[(..)][ $1 eq '00' ? '11' : $1 eq '11' ? '00' : $1 ]ge;
        print OUT $bits;
    }

    should do (untested) the same thing and will run very much more quickly.
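A quick way to check a substitution like this against the original loop is to run both on a short bit string. One thing to watch: the loop works on aligned pairs (bits 0-1, 2-3, and so on), so the regex has to consume two characters per match, otherwise a '11' that straddles two pairs would be rewritten as well. The sketch below is only an illustration; `swap_loop`, `swap_regex`, and the sample string are made up for this check.

```perl
use strict;
use warnings;

# Pair-at-a-time loop, behaving like the original foreach over split //.
sub swap_loop {
    my ($bits) = @_;
    my $out = '';
    for (my $i = 0; $i < length $bits; $i += 2) {
        my $pair = substr $bits, $i, 2;
        $out .= $pair eq '00' ? '11' : $pair eq '11' ? '00' : $pair;
    }
    return $out;
}

# Single substitution that walks the string two characters at a time.
sub swap_regex {
    my ($bits) = @_;
    $bits =~ s[(..)][ $1 eq '00' ? '11' : $1 eq '11' ? '00' : $1 ]ge;
    return $bits;
}

my $sample = '00110110';              # made-up test data
print swap_loop($sample),  "\n";      # 11000110
print swap_regex($sample), "\n";      # 11000110
```

Note that '0110' comes back unchanged from both versions, because neither aligned pair ('01', '10') is '00' or '11'.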

    Do I understand the logic of this code correctly?

    $/ = \7680;
    my $x = 0;
    #I LEFT THE OUTPUT AS ASCII-IZED BINARY AT THIS POINT
    #THUS THE LARGE INCREASE IN FILE SIZE
    open IN, "tmp2";
    open OUT, ">tmp3";
    while (<IN>) {
        $_ =~ s/^.*(11111011000010001111011100010000)/$1/ if $x == 0;
        $x = 1;
        print OUT pack("b*", $_);
    }
    close IN;
    close OUT;

    You are checking only the first record for the first occurrence of the sync pattern, and then discarding anything that precedes it?

    I.e., if the first record starts with a partial frame, you throw that away so that the rest of the file is in sync?
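To make that reading concrete, here is a minimal sketch of "discard everything before the first sync" on an unpacked bit string. It uses `index` rather than a regex; `trim_to_first_sync` and the sample record are invented for the illustration, and only the sync pattern itself comes from the thread.

```perl
use strict;
use warnings;

my $SYNC = '11111011000010001111011100010000';   # sync pattern from the thread

# Drop everything before the first sync; return the string unchanged
# if it contains no sync at all.
sub trim_to_first_sync {
    my ($bits) = @_;
    my $pos = index($bits, $SYNC);
    return $pos < 0 ? $bits : substr($bits, $pos);
}

my $record  = '0101' . $SYNC . '0011' . $SYNC . '01';   # made-up record with two syncs
my $trimmed = trim_to_first_sync($record);
print length($record) - length($trimmed), " bits discarded\n";   # 4 bits discarded
```

A greedy `s/^.*(?=SYNC)//` on the same record would cut up to the last sync instead (40 bits here), which matters as soon as one record can hold more than one frame.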

    If so, then the following code should be a complete replacement and run in a fraction of the time. The output file "tmp2" will be the final file you are after without creating the 9 GB intermediate.

    Let me know if it works, please, and how long it takes. There are other things that could be done to speed this up, I think, but if the new runtime is acceptable, they may not be worth the extra effort.

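    die "USAGE: $0 input\n" if scalar(@ARGV) < 1;
    $| = 1;
    $/ = \960;    ## read fixed-size 960-byte records
    open IN, "$ARGV[0]" or die "$ARGV[0] : $!";
    open OUT, ">tmp2" or die "tmp2 : $!";
    binmode IN;
    binmode OUT;
    while (<IN>) {
        my $bits = unpack("b*", $_);

        ## Replace '00' with '11' and vice versa, on aligned pairs only
        $bits =~ s[(..)][ $1 eq '00' ? '11' : $1 eq '11' ? '00' : $1 ]ge;

        ## Discard any partial frame from the front of the file.
        ## Non-greedy, so the cut stops at the first sync in the record.
        $bits =~ s/^.*?(?=11111011000010001111011100010000)// if $. == 1;

        ## Note: pack zero-pads a bit string whose length is not a multiple of 8
        print OUT pack 'b*', $bits;
    }
    close IN;
    close OUT;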
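One candidate for the "other things" that might speed this up, and a way to avoid the unpack/pack round trip for the swap step: mapping 00 to 11, 11 to 00, and leaving 01 and 10 alone is the same as swapping the two bits of each aligned pair and then inverting them, so the result for every possible byte can be precomputed once. This is a sketch of that idea and not anything posted in the thread; `@table` and `swap_pairs_bytes` are hypothetical names, and it has not been timed against the regex version.

```perl
use strict;
use warnings;

# For each of the 256 byte values: swap the two bits inside every aligned
# pair (mask 0x55 = low bit of each pair, 0xAA = high bit), then invert.
my @table = map {
    my $swapped = (($_ & 0x55) << 1) | (($_ & 0xAA) >> 1);
    ~$swapped & 0xFF;
} 0 .. 255;

# Apply the table to a raw record, byte by byte, with no bit strings involved.
sub swap_pairs_bytes {
    my ($buf) = @_;
    return pack 'C*', map { $table[$_] } unpack 'C*', $buf;
}

# 0x00 is four '00' pairs, 0x55 is four mixed pairs.
print unpack('b*', swap_pairs_bytes("\x00\x55")), "\n";   # 1111111110101010
```

Because the pairs sit at bit positions 0-1, 2-3, 4-5 and 6-7 of each byte, the table gives the same bytes as the `unpack 'b*'` / substitute / `pack 'b*'` route. It does not help with the sync search, which still needs bit-level access unless the frames happen to be byte-aligned.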

    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
      I will give this a shot. What I really need to do with the sync pattern is find every occurrence and extract it along with the following 476 bytes, discarding everything after the 476th byte up to the next sync.

        In that case, the code in Re^5: pack/unpack binary editing should be easily adaptable to your purpose. You'll need to understand how it works, but the code from the previous post can be combined with it to do everything, including the syncing in a single pass.
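The Re^5 code is not reproduced here, so the following is an independent, minimal sketch of the extraction described above and not an adaptation of it. It assumes a frame is the 32-bit sync plus the 476 bytes that follow it (3840 bits in total) and that the data is handled as an unpacked bit string; `$FRAME_BITS` and `extract_frames` are hypothetical names.

```perl
use strict;
use warnings;

my $SYNC       = '11111011000010001111011100010000';   # sync pattern from the thread
my $FRAME_BITS = length($SYNC) + 476 * 8;               # assumed: sync + 476 bytes

# Pull every complete frame out of the bit buffer. Consumed bits are removed
# from the buffer; an incomplete trailing frame stays there for the next call.
sub extract_frames {
    my ($bufref) = @_;
    my @frames;
    while (1) {
        my $pos = index($$bufref, $SYNC);
        if ($pos < 0) {
            # No sync yet: keep only a tail that could be the start of one.
            my $keep = length($SYNC) - 1;
            substr($$bufref, 0, length($$bufref) - $keep) = ''
                if length($$bufref) > $keep;
            last;
        }
        substr($$bufref, 0, $pos) = '';            # drop junk before the sync
        last if length($$bufref) < $FRAME_BITS;    # wait for more data
        push @frames, substr($$bufref, 0, $FRAME_BITS, '');
    }
    return @frames;
}

# Made-up data: junk, a frame, junk, a second frame.
my $buf = '0101' . $SYNC . ('0' x 3808) . '1111' . $SYNC . ('1' x 3808);
my @frames = extract_frames(\$buf);
print scalar(@frames), " frames\n";   # 2 frames
```

In a read loop, each record's swapped bits would be appended to the buffer, followed by `print OUT pack 'b*', $_ for extract_frames(\$buf);`. Under the assumption above each frame is a whole number of bytes, so `pack` never has to pad. A sync-like bit run inside a frame's payload is skipped over, since extraction resumes only after the full frame length.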

