in reply to Re^4: Working with fixed length files
in thread Working with fixed length files
You are benchmarking the code from the original nodes, which, as I mentioned, operate on different assumptions.
Ike's assumption means his while loop iterates only half as many times as mine does. The differences you are measuring are down to that.
If you modify Ike's to read one record at a time and operate upon it conditionally (per my benchmark), or modify mine to read and map the pairs of records into a single pre-partitioned buffer, thereby removing the need for the if statement in the loop, then you would be comparing like with like.
I also tweaked my benchmark code to a) use a fixed-size read, thereby avoiding the newline search; and b) change the condition of the loop so that I could assign the return from readline directly to the mapped buffer, avoiding another copy.
This was to ensure that the differences being tested were down to unpack versus substr refs, not the ancillary details of code that was written to demonstrate the technique, not for performance.
For more performance, do away with the substr and read directly into the partitioned scalar:
    #! perl -slw
    use strict;
    use Time::HiRes qw[ time ];

    my $start = time;

    # The record buffer that all the substr refs below are views onto
    my $rec = chr(0) x 123;

    # Field lengths for type 03 records; offsets are the running sums
    my @type3l = split ':', '02:10:33:15:19:10:3:18:6:4';
    my $n = 0;
    my @type3o = map{ $n += $_; $n - $_; } @type3l;
    my @type3  = map \substr( $rec, $type3o[ $_ ], $type3l[ $_ ] ), 0 .. $#type3o;

    # Field lengths and offsets for all other record types
    my @typeOl = split ':', '02:98:11:9';
    $n = 0;
    my @typeOo = map{ $n += $_; $n - $_; } @typeOl;
    my @typeO  = map \substr( $rec, $typeOo[ $_ ], $typeOl[ $_ ] ), 0 .. $#typeOo;

    until( eof() ) {
        # Fixed-size read straight into the partitioned buffer
        read( ARGV, $rec, 123, 0 );
        if( $rec =~ /^03/ ) {
            print join '/', map $$_, @type3;
        }
        else {
            print join '|', map $$_, @typeO;
        }
    }

    printf STDERR "Took %.3f for $. lines\n", time() - $start;
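If it isn't obvious why the loop above never has to call substr itself, here is a minimal sketch of the idea in isolation. The helper name make_views and the 10-byte toy record are mine, purely for illustration; they are not part of the benchmark. It assumes a reasonably modern perl, where each \substr(...) yields its own lvalue.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post above): returns one reference to an
# lvalue substr() per field length, all views onto the scalar behind $buf_ref.
sub make_views {
    my( $buf_ref, @lengths ) = @_;
    my $offset = 0;
    return map {
        my $view = \substr( $$buf_ref, $offset, $_ );
        $offset += $_;          # next field starts where this one ends
        $view;
    } @lengths;
}

my $rec   = ' ' x 10;                       # toy 10-byte record buffer
my @views = make_views( \$rec, 2, 5, 3 );   # three fields: 2 + 5 + 3 bytes

# Overwrite the buffer; the views are evaluated when dereferenced,
# so they pick up the new contents without any further substr calls.
$rec = '03HELLOabc';
print join( '/', map $$_, @views ), "\n";

$rec = '07WORLDxyz';
print join( '/', map $$_, @views ), "\n";
```

The partitioning cost is paid once, before the loop; per record, all that remains is the read and the dereferences.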
And for ultimate performance, switch to binmode & sysread to avoid Windows crlf layer overhead. But it requires other tweaks also and I'm 21 hours into this day already.
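For what it's worth, a rough sketch of what the sysread variant might look like. This is untested against the benchmark data and makes assumptions: each_record is a hypothetical helper, and $RECLEN must be the true on-disk record size including whatever line terminator the file carries, since in raw mode nothing translates CRLF for you. That is one of the "other tweaks".

```perl
use strict;
use warnings;

my $RECLEN = 123;   # ASSUMPTION: on-disk record size, terminator included

# Hypothetical helper: reads fixed-size records with sysread through a raw
# (no CRLF translation) handle into the scalar behind $buf_ref, calling
# $callback once per complete record. Returns the number of records seen.
sub each_record {
    my( $path, $buf_ref, $callback ) = @_;
    open my $fh, '<:raw', $path or die "Cannot open '$path': $!";
    my $count = 0;
    while( 1 ) {
        my $got = sysread( $fh, $$buf_ref, $RECLEN );
        die "sysread failed: $!" unless defined $got;
        last if $got == 0;                  # end of file
        while( $got < $RECLEN ) {           # top up after a short read
            my $more = sysread( $fh, $$buf_ref, $RECLEN - $got, $got );
            die "sysread failed: $!" unless defined $more;
            last if $more == 0;
            $got += $more;
        }
        last if $got < $RECLEN;             # ignore a trailing partial record
        $callback->();
        ++$count;
    }
    close $fh;
    return $count;
}

my $rec  = "\0" x $RECLEN;
my $type = \substr( $rec, 0, 2 );   # view onto the 2-byte record type field

if( @ARGV ) {
    my %seen;
    my $n = each_record( $ARGV[0], \$rec, sub { ++$seen{ $$type } } );
    print "$n records\n";
    print "type $_: $seen{ $_ }\n" for sort keys %seen;
}
```

The substr refs carry over unchanged because sysread fills the same scalar they are views onto; only the field lengths covering the terminator bytes would need adjusting.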
But whatever, you do need to be comparing like with like.