
Well, I used your code, with some modifications, to produce:

use Digest::SHA1;
use IO::File;
use IO::Zlib;

sub read_rle_file {    # read run length encoded file
    my $filespec=shift;
    my $numfields=shift;
    my $sub=shift;

    my $sha1=Digest::SHA1->new();
    my $IN_IO;
    print "Reading run length encoded file $filespec, with records of $numfields fields.\n"
        if $Debug;
    if ( $filespec =~ /\.gz/ ) {
        $IN_IO = IO::Zlib->new( $filespec, "rb" )
            or die "Cannot open compressed run length encoded file $filespec : \n" ;
    } else {
        $IN_IO = IO::File->new($filespec)
            or die "Cannot open run length encoded file $filespec : $!\n" ;
        binmode $IN_IO;
    }

    my $buffer=""; # The buffer we are using
    my $records=0; # Number of records we have read
    my $buffers=0; # The number of times we have refilled the buffer
    my $bytes  =0; # The number of bytes we have read so far
    my $record=[]; # fields of the record currently being assembled

    # read until the file is empty
    while (!$IN_IO->eof ) {

        my $read_buffer;
        my $bytesread = $IN_IO->read( $read_buffer, $Config{buffer_size} );
        die "Read error in read_rle_file($filespec,$numfields)\n"
            unless defined $bytesread;

        $bytes+=$bytesread;
        $sha1->add($read_buffer);
        $buffer.=$read_buffer;

        my @records;

        # try to extract as many records as possible from the buffer
        BUFFER:
        while ((my $len=ord($buffer)) < length $buffer) {
            push @$record,unpack("C/a",$buffer);
            substr($buffer,0,$len+1,"");
            if (@$record==$numfields) {
                push @records,$record;
                $record=[];
            }
        }
        # hand off to the callback the records we have extracted so far
        # we do this in chunks to save the callback overhead
        $sub->(\@records);
        $records+=@records;
        print "After buffer ".($buffers++)." read $records records from $bytes bytes.\n" if $Debug>1;
    }
    die "Unprocessed data in buffer! read_rle_file($filespec,$numfields) failed!\n[@$record] $buffer\n"
        if @$record || length $buffer;
    return wantarray ? ($sha1->b64digest,$records) : $sha1->b64digest;
}

Particle's point is valid, but luckily in my situation I'm not worried about a corrupt file so much as an improperly terminated one. Also note the games with $record to handle the case where the buffer empties before a full record is complete: instead of pushing the incomplete record back into the buffer, I now leave it in $record.
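
For anyone who wants to see the $record trick in isolation, here is a minimal sketch of the same idea working on in-memory chunks instead of a file. The helper name make_rle_parser, the sample field values and the 4-byte chunk size are made up for illustration and are not part of the code above; only the inner loop mirrors it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the code above): returns a closure that
# accepts arbitrary chunks and hands back the records completed so far.
sub make_rle_parser {
    my ($numfields) = @_;
    my $buffer = "";    # bytes not yet consumed
    my $record = [];    # fields of the record being assembled; survives between chunks
    return sub {
        my ($chunk) = @_;
        $buffer .= $chunk;
        my @records;
        # A field can be extracted only when its length byte and all of
        # its data are already in the buffer.
        while ( ( my $len = ord($buffer) ) < length($buffer) ) {
            push @$record, unpack( "C/a", $buffer );
            substr( $buffer, 0, $len + 1, "" );
            if ( @$record == $numfields ) {
                push @records, $record;
                $record = [];
            }
        }
        return \@records;
    };
}

# Sample data: four length-prefixed fields, i.e. two records of two fields.
my $data = join "", map { pack "C/a*", $_ } qw(alpha beta gamma delta);

# Feed the data in 4-byte chunks so fields and records straddle chunk boundaries.
my $parse = make_rle_parser(2);
my @all;
for ( my $pos = 0 ; $pos < length $data ; $pos += 4 ) {
    push @all, @{ $parse->( substr( $data, $pos, 4 ) ) };
}
print join( ",", @$_ ), "\n" for @all;
```

Because a half-built record stays in $record and a half-read field stays in $buffer, nothing ever has to be pushed back, whatever the chunk size is.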

Anyway, thanks for the feedback.

Yves / DeMerphq
---
Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)

In reply to Re: Re: Reading a run length encoded file in a buffering scenario by demerphq
in thread Reading a run length encoded file in a buffering scenario by demerphq
