in reply to Efficiently Extracting a Range of Lines (was: Range This)
How much do I get for speeding it up 600% ?
Looks like tadman beat me to the draw :-( Drat, missed it by that much chief.
When in doubt, use Benchmark. Here is a regex solution that is 6 times faster.
use Benchmark;

my $mine = <<'ME';
my $stuff = <<EOF;
boring boring boring boring boring
boring boring boring boring boring
boring boring boring boring boring
START
My wonderful super duper yummy booty bag information
END
boring boring boring boring boring
boring boring boring boring boring
boring boring boring boring boring
EOF
my ($good_stuff) = $stuff =~ m/(START.*END)/s;
# print $good_stuff;
ME

my $yours = <<'YOU';
my $stuff = <<EOF;
boring boring boring boring boring
boring boring boring boring boring
boring boring boring boring boring
START
My wonderful super duper yummy booty bag information
END
boring boring boring boring boring
boring boring boring boring boring
boring boring boring boring boring
EOF
my @lines = split("\n", $stuff);
my $good_stuff;
foreach(@lines){
    if(m/START/ .. m/END/) {
        $good_stuff .= "$_\n";
    }
}
# print $good_stuff;
YOU

# prove they both work, uncomment the prints
# (we don't want to print when benchmarking)
# uncomment these evals
# eval $mine;
# eval $yours;

# Benchmark those suckers
timethese(100000,{'Mine' => $mine, 'Yours' => $yours});

C:\>perl test.pl
Benchmark: timing 100000 iterations of Mine, Yours...
Mine: 4 wallclock secs ( 3.90 usr + 0.00 sys = 3.90 CPU) @ 25641.03/s (n=100000)
Yours: 24 wallclock secs (23.73 usr + 0.00 sys = 23.73 CPU) @ 4214.08/s (n=100000)
C:\>
So the regex solution is about 6 times faster, no doubt because it does more of the work in C and less in Perl. You can read a whole file into a scalar by setting the input record separator to undef, like so: $/ = undef; and then using $scalar = <FILE>; so this is quite practical if you have the memory. Oh, and this is the best way to undef the input record separator, to avoid nasty surprises elsewhere in your code:
{
    local $/;
    open FILE, "<$file" or die "Can't open $file: $!";
    $everything = <FILE>;
    close FILE;
}
# out here the input record separator is still normal
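Putting the two pieces together, here is a rough sketch of slurping a file and pulling out the range in one go. The sub name extract_range is made up for illustration and is not from the thread. I have also swapped in a non-greedy .*? so the match stops at the first END; the benchmarked regex uses a greedy .* which runs to the last END in the data. With only one END, as in the sample above, both give the same result.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): slurp a file and
# return the first START..END block, or undef if there is none.
sub extract_range {
    my ($file) = @_;

    my $everything = do {
        local $/;    # input record separator is undef only inside this block
        open my $fh, '<', $file or die "Can't open $file: $!";
        <$fh>;       # scalar context, so this reads the whole file at once
    };

    # Assumption: non-greedy match, so we stop at the first END.
    # The /s modifier lets . match newlines.
    my ($good_stuff) = $everything =~ m/(START.*?END)/s;
    return $good_stuff;
}
```

Call it like my $block = extract_range('data.txt'); and check for undef before using the result. The lexical filehandle closes itself when it goes out of scope, so there is no explicit close.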
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print