in reply to Re^8: Out of memory problems
in thread Out of memory problems

What would you do if you had an occasion where the data wasn't cut on a frame boundary (i.e. the first pattern was offset by 10 bytes) and the user had no way of knowing?

This reads a two-frame-sized chunk and uses a regex to discover the alignment of the first full frame within it. If the offset is non-zero, it discards that many bytes from the front of the buffer (issuing a warning to say so) and tops the buffer back up to two full frames, which aligns the read pointer with the start of the third frame. It then processes and outputs the first two full frames, and processes the rest a frame at a time as before.

    #! perl -sw
    use strict;
    use bytes;

    open IN,  '< :raw', $ARGV[ 0 ] or die "$ARGV[ 0 ] : $!";
    open OUT, '> :raw', $ARGV[ 1 ] or die "$ARGV[ 1 ] : $!";

    ## Grab a double buffer load first time so we can check & correct alignment
    local $/ = \768;
    my $buf = <IN>;    ## Read two frames worth

    ## Check alignment. Assumes the \xF4 .{191} \xF4 is unique per frame?
    ## The /s modifier lets . match a newline byte (0x0A) in the binary data.
    $buf =~ m[(\xF4.{191}\xF4)]s
        or die "No frame found in the first 768 bytes";

    ## Record the offset to the first frame
    my $offset = $-[0];

    ## If there was an offset to the first match
    if( $offset != 0 ) {
        ## Chop off the leading junk
        substr( $buf, 0, $offset, '' );
        ## Top up the buffer to two full frames
        read( IN, $buf, $offset, 768 - $offset );
        warn "$offset bytes discarded from front of file.";
    }

    ## Process the first two whole frames
    print OUT unpack 'x2 a190 x2 a58 x132' x 2, $buf;

    ## Now process as before
    local $/ = \384;    ## Read file in 384 byte chunks.
    while( <IN> ) {
        print OUT unpack 'x2 a190 x2 a58', $_;
    }

    close IN;
    close OUT;

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

Re^10: Out of memory problems
by tperdue (Sexton) on Oct 26, 2004 at 12:31 UTC
    Sorry for being a headache here, but doing binary with Perl is new to me. What if the file isn't byte aligned, meaning I need to work at a bit level?

      Then you're back to using unpack 'B*', ....

      However, there is still no need to read the whole file into memory. The same techniques used for the byte-aligned case would work just the same for bit-aligned data once you convert the stream to ascii-ized binary. I would probably chain two processes together.

      1. Convert to ascii-ized binary, locate the offset and discard the junk using the same process as above.

        Having aligned the datastream, it would then convert back to properly byte-aligned binary and write 384-byte binary packets to STDOUT.

      2. The second process would just be the working byte aligned version above.
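As a rough, untested-on-real-data sketch of step 1: the helper below converts a chunk to ascii-ized binary, finds the bit offset of the first frame, throws away the junk bits and packs the rest back into bytes. The name realign_bits is made up for illustration, and it assumes the same frame layout as the byte-aligned code (a 0xF4 byte, 191 other bytes, then another 0xF4). It also simply drops a trailing partial byte, whereas a real streaming filter would have to carry those bits over into the next chunk.

```perl
use strict;
use warnings;

## Hypothetical helper: returns ( bit offset, realigned data ), or an
## empty list if no frame start is found in the chunk.
sub realign_bits {
    my( $raw ) = @_;

    ## Ascii-ized binary: one '0' or '1' character per bit
    my $bits = unpack 'B*', $raw;

    ## 0xF4 is 11110100. The 191 bytes between the two sync bytes
    ## are 191 * 8 = 1528 bits.
    return unless $bits =~ m/11110100[01]{1528}11110100/;
    my $offset = $-[0];

    ## Discard the leading junk bits
    substr( $bits, 0, $offset, '' );

    ## Keep whole bytes only, so pack does not zero-pad a partial byte
    my $whole = length( $bits ) - length( $bits ) % 8;
    $bits = substr $bits, 0, $whole;

    return( $offset, pack( 'B*', $bits ) );
}
```

The string of '0' and '1' characters is eight times the size of the raw data, which is one more reason to do this a chunk at a time rather than on the whole file.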

        Your help has been greatly appreciated. Thanks again.
Re^10: Out of memory problems
by tperdue (Sexton) on Oct 26, 2004 at 16:50 UTC
    Had an error when I ran the code. I'm getting a "Not enough arguments for read" error near $offset in the $buf .= read ( IN, $offset ); line. I modified it to $buf .= read ( IN, $buf, $offset) and then got an "'x' outside of string in unpack" error for the print OUT unpack 'x2 a190 x2 a58 x132' x 2, $buf line. I replaced the x with * but get a numerical error. What am I doing wrong?

      I've corrected the code above.
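For what it's worth, the likely cause of both errors: read returns the number of bytes it read rather than the data, so $buf .= read( ... ) appends a count instead of the missing bytes. That probably left the buffer shorter than 768 bytes, which is when unpack complains that 'x' would skip past the end of the string. The four-argument form of read takes an offset into the buffer and does the appending itself. A small sketch, where top_up is a made-up name used only for this illustration:

```perl
use strict;
use warnings;

## Hypothetical helper: drop $offset junk bytes from the front of $buf,
## then refill it to $size bytes from $fh.
sub top_up {
    my( $fh, $buf, $size, $offset ) = @_;

    ## Chop off the leading junk
    substr( $buf, 0, $offset, '' );

    ## read FILEHANDLE, SCALAR, LENGTH, OFFSET:
    ## place $offset new bytes at the current end of $buf
    my $got = read( $fh, $buf, $offset, $size - $offset );

    ## $got is a byte count; the data itself is now in $buf
    return( $got, $buf );
}
```

With an 8-byte buffer holding 'abcdefgh' and an offset of 3, this should give back a count of 3 and a buffer of 'defgh' followed by the next three bytes from the file.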

