Re: Reading a huge input line in parts

You could probably gain quite a bit of speed by reading in chunks of the line instead of one character at a time. That way you can use the normal split function. Something like this, but with as large a buffer value as your system can handle well:

#!/usr/bin/env perl
use 5.010; use strict; use warnings;

my $l;               # chunk of a line
my $tiny_buffer = 8; # tiny buffer for testing

while( read DATA, $l, $tiny_buffer ){
    for (split ' ', $l){
        if( $_ eq '0' ){
            say 'Reached the end';
            exit;
        }
        say "; $_ ;";  # do stuff with the digit
    }
}

__DATA__
1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 0

Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.

Re^2: Reading a huge input line in parts by kroach (Pilgrim) on May 04, 2015 at 15:53 UTC
I thought about using read; however, since the numbers are not of constant length, a single number could be split between two chunks. This would introduce additional complexity to detect and merge such split numbers. I should've included such examples in the sample input from the start; I've updated the question.
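To make the concern concrete, here is a minimal sketch of the failure. The sample input and the 4-byte chunk size are made up for illustration and are not taken from the question. It reads from an in-memory filehandle and splits each chunk on its own, with no attempt to merge numbers across chunk boundaries:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Illustrative input (an assumption, not the poster's data): multi-digit numbers.
my $input = "12 345 6789 0";
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

my @tokens;
# Naive approach: split every chunk independently.
while ( read $fh, my $chunk, 4 ) {
    push @tokens, split ' ', $chunk;
}
close $fh;

# 345 and 6789 straddle chunk boundaries, so each is reported in two pieces.
print "@tokens\n";   # prints: 12 3 45 6 789 0
```

The chunks here are "12 3", "45 6", "789 " and "0", so any number that crosses a boundary comes out in pieces.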
Re^3: Reading a huge input line in parts by Laurent_R (Canon) on May 04, 2015 at 18:05 UTC
It should not be too costly in terms of resources and performance to check whether you have a space at the beginning and at the end of each chunk of data before splitting it, and to reconstruct the boundary numbers accordingly, especially if your read data chunks are relatively large.
Je suis Charlie.
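One way this boundary check might look is sketched below. The helper name each_token, its callback interface, and the sample input are hypothetical and exist only for illustration. The idea: if a chunk does not end in whitespace, its last token may be incomplete, so it is held back and prepended to the next chunk. The check at the beginning of the chunk then comes for free, because a chunk that starts with a space simply terminates the held-back token.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): reads $fh in $size-byte chunks
# and calls $cb once for every complete whitespace-separated token.
sub each_token {
    my ($fh, $size, $cb) = @_;
    my $carry = '';    # unfinished token held back from the previous chunk
    while ( read $fh, my $chunk, $size ) {
        $chunk = $carry . $chunk;
        $carry = '';
        # No whitespace at the very end: the last token may continue
        # in the next chunk, so cut it off and keep it for later.
        if ( $chunk =~ s/(\S+)\z// ) {
            $carry = $1;
        }
        $cb->($_) for split ' ', $chunk;
    }
    $cb->($carry) if length $carry;    # flush the final token at end of input
    return;
}

# Demo with a deliberately tiny chunk size so that numbers get split.
my $input = "1 22 333 4444 55555 0\n";
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";
my @seen;
each_token( $fh, 4, sub { push @seen, shift } );
close $fh;
print join(',', @seen), "\n";    # prints: 1,22,333,4444,55555,0
```

With a realistically large chunk size the extra regex runs once per chunk, so the overhead should stay small compared with the read itself.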
Re^3: Reading a huge input line in parts by aaron_baugher (Curate) on May 04, 2015 at 22:36 UTC
In that case, I'd check the end of the buffer for digits and, if there are any, trim them off and save them to prepend to the next buffer that you read in. But you don't want to do that if it's the final 0 in the file, so I have some if statements in here. There's probably a more elegant way to do some of this, but I think this will handle it correctly:

#!/usr/bin/env perl
use 5.010; use strict; use warnings;

my $l;               # chunk of a line
my $tiny_buffer = 8; # tiny buffer for testing
my $leftover = '';   # leftover, possibly partial number at end of buffer

while ( read DATA, $l, $tiny_buffer ) {
    $l = $leftover . $l;
    say " ;$l;";     # show the buffer being processed
    $leftover = '';
    # \z anchors at the true end of the buffer; $ would also match before a trailing newline
    if( $l =~ s/(\d+)\z// ){
        if( $1 == 0 ){
            $l .= '0';       # a lone trailing 0 is the terminator, so put it back
        } else {
            $leftover = $1;  # possibly partial number; prepend to the next chunk
        }
    }
    for (split ' ', $l) {
        if ( $_ == 0 ) {
            say 'Reached a zero';
        } else {
            say "; $_ ;";    # process a number
        }
    }
}

__DATA__
1 2 3 4 5 6 7 8 99 1 2 3 4 5 6 7 8 9 0
1 22 3 4 5 6 7 8 99 1 2 3 4 5 6 77 8 9 0