Here is the code I'm using and for a 1 gig file this takes a few hours.

Pardon me for saying so, but I am not surprised.

Not only are you unpacking the data to ASCII-ized binary, you then go on to split that into an array of digits.

You then use a loop to go through that array one character at a time, gluing the digits back into pairs so that you can replace each '00' with '11' and each '11' with '00'.

All of this could be done with a single regex substitution.

Replacing this:

while (<IN>) {
    my $bits = unpack("b*", $_);
    @array = split(//, $bits);
    foreach $value (@array) {
        $y++;
        $tmp = $tmp . $value;
        if ($y == 2) {
            if ($tmp =~ /00/) {
                $tmp = '11';
            }
            elsif ($tmp =~ /11/) {
                $tmp = '00';
            }
            print OUT $tmp;
            $y = 0;
            $tmp = '';
        }
    }
}

with this:

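while (<IN>) {
    my $bits = unpack("b*", $_);
    ## Take the bits two at a time, as the loop does, so that a '00' or '11'
    ## straddling a pair boundary (e.g. the middle of '0110') is left alone.
    $bits =~ s[(..)][ $1 eq '00' ? '11' : $1 eq '11' ? '00' : $1 ]ge;
    print OUT $bits;
}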

should do (untested) the same thing and will run very much more quickly.
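Since the replacement is offered untested, here is a minimal sketch of how the equivalence could be checked on a small sample. The sub names `swap_pairs_loop` and `swap_pairs_regex` are made up for this illustration. Note that the substitution matches every aligned pair with `(..)`, because the loop consumes the bits two at a time; a pattern that matches '00' or '11' at any offset would also rewrite pairs that straddle a boundary, such as the middle of '0110'.

```perl
use strict;
use warnings;

# Pair-by-pair swap, written as a loop in the style of the original code.
sub swap_pairs_loop {
    my ($bits) = @_;
    my $out = '';
    for (my $i = 0; $i < length $bits; $i += 2) {
        my $pair = substr $bits, $i, 2;
        if    ($pair eq '00') { $pair = '11' }
        elsif ($pair eq '11') { $pair = '00' }
        $out .= $pair;
    }
    return $out;
}

# The same swap as one substitution over aligned two-character pairs.
sub swap_pairs_regex {
    my ($bits) = @_;
    $bits =~ s[(..)][ $1 eq '00' ? '11' : $1 eq '11' ? '00' : $1 ]ge;
    return $bits;
}

# unpack "b*" yields 8 characters per byte, so the length is always even.
my $sample = unpack 'b*', "\x06\xF0\x5A";
print swap_pairs_loop($sample) eq swap_pairs_regex($sample)
    ? "match\n" : "differ\n";
```

Running both versions over a few kilobytes of the real input and comparing the output would be a stronger check than this three-byte sample.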

Do I understand the logic of this code correctly?

$/ = \7680;
my $x = 0;
#I LEFT THE OUTPUT AS ASCII-IZED BINARY AT THIS POINT
#THUS THE LARGE INCREASE IN FILE SIZE
open IN, "$tmp2";
open OUT, ">$tmp3";
while (<IN>) {
    $_ =~ s/^.*(11111011000010001111011100010000)/$1/ if $x == 0;
    $x = 1;
    print OUT pack("b*", $_);
}
close IN;
close OUT;

You are checking the first record only for the first occurrence of the sync pattern, and then discarding anything that precedes it?

I.e., if the first record contains a partial frame, then throw it away and so sync the rest of the file?
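To make that reading concrete, here is a small sketch of the trimming step on its own, working on a string of '0' and '1' characters. The sub name `trim_to_sync` is hypothetical; the sync pattern is the one from the code above. The sketch uses `index`, which finds the first occurrence. A greedy `^.*` in a regex backtracks to the last occurrence in the string instead, which gives the same result only when the record holds a single sync pattern.

```perl
use strict;
use warnings;

# Sync pattern taken from the code in the thread.
my $SYNC = '11111011000010001111011100010000';

# Return $bits with everything before the first sync pattern removed.
# If no sync pattern is present, the string is returned unchanged.
sub trim_to_sync {
    my ($bits) = @_;
    my $pos = index $bits, $SYNC;
    return $pos > 0 ? substr($bits, $pos) : $bits;
}

# A partial frame ('0101') followed by a sync pattern and some payload.
my $first_record = '0101' . $SYNC . '0011';
print trim_to_sync($first_record), "\n";
```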

If so, then the following code should be a complete replacement and run in a fraction of the time. The output file "tmp2" will be the final file you are after without creating the 9 GB intermediate.

Let me know if it works please. Also how long it takes. There are other things that could be coded to speed this up, I think, but if the new runtime is acceptable, they may not be worth the extra effort.
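One possibility, offered only as an illustration and not necessarily what was meant above: because `b*` produces exactly 8 bits per byte, the pairs never cross a byte boundary, so the swap could be precomputed for all 256 byte values and applied to the raw bytes directly. The names `swap_pairs`, `@SWAPPED` and `swap_bytes` are made up for this sketch, and it covers the swap only, not the sync trimming.

```perl
use strict;
use warnings;

# Swap aligned '00' <-> '11' pairs in a bit string (used to build the table).
sub swap_pairs {
    my ($bits) = @_;
    $bits =~ s[(..)][ $1 eq '00' ? '11' : $1 eq '11' ? '00' : $1 ]ge;
    return $bits;
}

# Precompute the swapped value of every possible byte once.
my @SWAPPED;
for my $byte (0 .. 255) {
    my $bits = unpack 'b8', chr $byte;
    push @SWAPPED, pack 'b8', swap_pairs($bits);
}

# Translate raw bytes through the table; no bit string is built per record.
sub swap_bytes {
    my ($data) = @_;
    return join '', @SWAPPED[ unpack 'C*', $data ];
}

# Cross-check against the unpack/substitute/pack route on a short sample.
my $data = "\x00\x55\xAA\xFF\x06";
print swap_bytes($data) eq pack('b*', swap_pairs(unpack 'b*', $data))
    ? "match\n" : "differ\n";
```

Whether this is actually faster than the substitution would need to be measured on the real file.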

die "USAGE: $0 input\n" if scalar(@ARGV) < 1;
$| = 1;
$/ = \960;    ## Read fixed-length 960-byte records
open IN, "$ARGV[0]" or die "$ARGV[0] : $!";
open OUT, ">tmp2" or die "tmp2 : $!";
my $y = 0;
my $value = '';
while (<IN>) {
    my $bits = unpack("b*", $_);
    ## Replace '00' with '11' and vice versa, taking the bits in aligned pairs
    $bits =~ s[(..)][ $1 eq '00' ? '11' : $1 eq '11' ? '00' : $1 ]ge;
    ## Discard any partial frame from the front of the file.
    $bits =~ s/^(.*)(?=11111011000010001111011100010000)// if $. == 1;
    print OUT pack 'b*', $bits;
}
close IN;
close OUT;
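The script depends on `$/ = \960` to make `<IN>` return fixed-length records instead of lines. A minimal sketch of that behaviour, using an in-memory filehandle so it can be run without an input file; the sub name `read_records` and the tiny record size are only for illustration:

```perl
use strict;
use warnings;

# Read $data in fixed-size records and return them as a list.
sub read_records {
    my ($data, $size) = @_;
    local $/ = \$size;    # a reference to an integer selects record mode
    open my $fh, '<', \$data or die "in-memory open failed: $!";
    binmode $fh;
    my @records = <$fh>;
    close $fh;
    return @records;
}

# Eight bytes read three at a time: the final record is simply shorter.
print join('|', read_records('abcdefgh', 3)), "\n";
```

On a real file the last record will likewise be short if the file size is not a multiple of the record length.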

Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.

In reply to Re^7: pack/unpack binary editing by BrowserUk
in thread pack/unpack binary editing by tperdue
