Re: Advice needed on chunked read + byte-range logic

My current code:

    my $chunk_size = $cnf->{chunk_size} || 1024;
    my $bytes_in = $cnf->{bytes_in} || 0;
    my $bytes_out= $cnf->{bytes_out} || undef;

    my $pos = $bytes_in;
    seek($fh,$bytes_in,0);
    while ( read( $fh, my $buffer, $chunk_size ) ) {
        $pos += $chunk_size; # alt.: += length($buffer)

        if( defined($bytes_out) && $pos > $bytes_out){
              print bytes::substr($buffer, 0, ($chunk_size - ($pos - $bytes_out)) );  # make last chunk shorter
              last;
        }elsif( defined($bytes_out) && $pos == $bytes_out){
              print $buffer;
              last;
        }else{
              print $buffer;
        }
   }

bugs fixed:
- $pos summing was wrong
- differentiate between > and == and decide if we *really* need another substr() or if a simple print() will do

Re^2: Advice needed on chunked read + byte-range logic by ikegami (Patriarch) on Aug 26, 2010 at 01:38 UTC
Why `bytes::substr`? Do you expect `read` to return something other than bytes? If so, you should make it so it doesn't (by using `binmode`) rather than attempting to work around it (by using `bytes::substr`).

The following leaves the file pointer at the same spot as your code:

    use Fcntl qw( SEEK_SET );

    seek($fh, $bytes_in, SEEK_SET) or die($!);
    my $to_read = $bytes_out-$bytes_in;
    while ($to_read > 0) {
        my $rv = read($fh, my $buffer, $chunk_size);
        die($!) if !defined($rv);
        die("Premature eof") if !$rv;
        substr($buffer, $to_read) = '' if $rv > $to_read;  # trim the last chunk
        print($buffer);
        $to_read -= $rv;
    }
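For readers who want to try the idea in isolation, here is a minimal sketch of the same countdown loop wrapped in a sub. The name `copy_byte_range`, its argument order, and the in-memory filehandles are assumptions made for illustration, not something from the thread. It also differs from the loop above in one respect: it asks `read` for at most the remaining byte count, so no trimming is needed and the file pointer ends at `$bytes_out` rather than past it.

```perl
use strict;
use warnings;
use Fcntl qw( SEEK_SET );

# Hypothetical helper (not from the thread): copy the bytes in the range
# [$bytes_in, $bytes_out) from $in to $out, at most $chunk_size at a time.
sub copy_byte_range {
    my ($in, $out, $bytes_in, $bytes_out, $chunk_size) = @_;
    $chunk_size ||= 1024;

    binmode($in);    # make sure read() counts bytes, not characters
    seek($in, $bytes_in, SEEK_SET) or die("seek: $!");

    my $to_read = $bytes_out - $bytes_in;
    while ($to_read > 0) {
        # Never request more than what is left of the range.
        my $want = $to_read < $chunk_size ? $to_read : $chunk_size;
        my $rv = read($in, my $buffer, $want);
        die("read: $!") if !defined($rv);
        die("Premature eof") if !$rv;
        print {$out} $buffer or die("print: $!");
        $to_read -= $rv;
    }
    return;
}

# Usage, with in-memory filehandles standing in for real files.
my $data = join '', map { chr($_ % 256) } 0 .. 2999;    # 3000 bytes
open(my $in, '<', \$data) or die("open: $!");
my $copied = '';
open(my $out, '>', \$copied) or die("open: $!");
copy_byte_range($in, $out, 100, 2500, 1024);
close($out) or die("close: $!");
```

If the position of the file pointer after the copy matters to the caller, that is the detail to check before choosing between this variant and the loop above.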
Re^3: Advice needed on chunked read + byte-range logic by isync (Hermit) on Aug 27, 2010 at 19:05 UTC
I did so because I thought the read data might be a text file, e.g. non-utf8 ascii or so. As substr() by default operates in terms of characters, I wanted to prevent it "falling back" into character mode and always return byte offsets. I didn't expect that a "this is bin data" $string information would remain intact over $fh declared as binmode() -> read() into buffer -> substr() operation on this string...

Further I thought binmode() is more a Win32 thing and as my code won't ever hit the MS world, I seldomly use it. As you refer to it, I think I should go back to using it.

Just for completeness, my former code updated:

    my $chunk_size = $cnf->{chunk_size} || 1024;
    my $bytes_in = $cnf->{bytes_in} || 0;
    my $bytes_out= $cnf->{bytes_out} || undef;

    my $pos = $bytes_in;
    binmode($fh);
    seek($fh,$bytes_in,0);
    while ( read( $fh, my $buffer, $chunk_size ) ) {
        $pos += $chunk_size; # alt.: += length($buffer)

        if( defined($bytes_out) && $pos > $bytes_out){
              print substr($buffer, 0, ($chunk_size - ($pos - $bytes_out)) ); # make last chunk shorter
              last;
        }elsif( defined($bytes_out) && $pos == $bytes_out){
              print $buffer;
              last;
        }else{
              print $buffer;
        }
    }
Re^4: Advice needed on chunked read + byte-range logic by ikegami (Patriarch) on Aug 27, 2010 at 22:20 UTC
> I wanted to prevent it "falling back" into character mode and always return byte offsets.

By using bytes, you do exactly the opposite.

    require bytes;
    $x = "\xC9\xCA\xCB\xCC";
    utf8::downgrade($x);  print(substr($x,1,1) eq "\xCA" ?1:0,"\n");         # 1
    utf8::upgrade($x);    print(substr($x,1,1) eq "\xCA" ?1:0,"\n");         # 1
    utf8::downgrade($x);  print(bytes::substr($x,1,1) eq "\xCA" ?1:0,"\n");  # 1
    utf8::upgrade($x);    print(bytes::substr($x,1,1) eq "\xCA" ?1:0,"\n");  # 0

bytes gives access to the internal storage format of the string. It has nothing to do with whether the string only contains bytes or not. `bytes::substr` will probably do what you want. `substr` definitely will.

> Further I thought binmode() is more a Win32 thing and as my code won't ever hit the MS world, I seldomly use it.

- Prevents CRLF translations when the :crlf layer is used. Normally just on Windows.
- Removes :encoding layers to prevent decoding. Shouldn't be there, but you're the one who's worried.

> Just for completeness, my former code updated:

I don't know why you're asking for help (or why I'm giving it) if you're sticking with that complex, buggy code when the simpler solution even does error checking.

Despite prompting, you never indicated whether it matters where you leave the file pointer when you're done. Does it?
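The point about `binmode` removing :encoding layers can be checked with a small experiment. What follows is only a sketch: the temporary file, its four-byte content, and the variable names are assumptions made for the demonstration. The same file is read twice through a handle opened with an :encoding(UTF-8) layer, once as is and once after `binmode`.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Write four raw bytes: the UTF-8 encoding of U+00E9, twice.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "\xC3\xA9\xC3\xA9";
close($tmp) or die("close: $!");

# With an :encoding layer in place, read() counts decoded characters.
open(my $text, '<:encoding(UTF-8)', $path) or die("open: $!");
my $chars_read = read($text, my $decoded, 2);
close($text);

# binmode() strips the layer, so the same request returns two plain bytes.
open(my $raw, '<:encoding(UTF-8)', $path) or die("open: $!");
binmode($raw);
my $bytes_read = read($raw, my $undecoded, 2);
close($raw);

# Print the ordinals of each result as hex so the difference is visible.
for my $pair ([ 'with :encoding', $decoded ], [ 'after binmode', $undecoded ]) {
    my ($label, $str) = @$pair;
    printf("%-15s %s\n", $label, join ' ', map { sprintf '%02X', ord } split //, $str);
}
```

In the first case a request for 2 units consumes all four bytes of the file; in the second it consumes exactly two, which is what byte-range arithmetic needs.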