I haven't investigated in great detail, but the module appears to incur high overhead. Specifically, I ran the following benchmark:
#!/usr/bin/perl
use strict;
use warnings;
use Perl6::Slurp;
use Benchmark qw(cmpthese :hireswallclock);
# write file with different line endings
# \r = 0x0D
# \n = 0x0A
my $win_line = "Windows\r\n";
my $unix_line = "Unix\n";
my $mac_line = "Mac\r";
my @strings = ();
for (1 .. 1000)
{
    push @strings, $win_line;
    push @strings, $unix_line;
    push @strings, $mac_line;
    push @strings, $win_line;
    push @strings, $mac_line;
    push @strings, $win_line;
    push @strings, $unix_line;
}
open(my $fh, ">", "le_big.txt") or die "Failed file open: $!";
binmode($fh);
print $fh $_ foreach @strings;
close($fh);
cmpthese(1000, {
    'naive'       => \&naive,
    'original'    => \&original,
    'local split' => \&local_split,
    'crlf split'  => \&crlf_split,
});
# Original code
sub original {
    my @results = slurp("<:crlf", "le_big.txt",
                        {irs => qr/\n|\r/, chomp => 1});
    return @results;
}
# Just use slurp and crlf to read the file
sub crlf_split {
    my @initial_results = slurp("<:crlf", "le_big.txt");
    my @results = map split(/\r/), @initial_results;
    return @results;
}
# Just use slurp to read the file
sub local_split {
    my @initial_results = slurp("<", "le_big.txt");
    my @results = map split(/\n|\r\n?/), @initial_results;
    return @results;
}
# Naive local implementation
sub naive {
    open(my $fh, "<", "le_big.txt") or die "Failed file open: $!";
    local $/;
    my $slurp = <$fh>;
    close $fh;
    my @results = split /\n|\r\n?/, $slurp;
    return @results;
}
With the following results:
time perl fluff.pl
              Rate    original local split  crlf split       naive
original    28.4/s          --        -32%        -43%        -80%
local split 41.8/s         47%          --        -16%        -70%
crlf split  49.6/s         75%         19%          --        -65%
naive        141/s        398%        238%        185%          --
real 1m26.512s
user 1m26.450s
sys 0m0.060s
This was run under perl v5.8.8 built for x86_64-linux-gnu-thread-multi, on an Ubuntu box. Note how much faster the quick-and-dirty slurp-and-split approach I wrote is: roughly five times the rate of the original Perl6::Slurp call (141/s versus 28.4/s). The moral, I think, is that you should only use this module if you have good reason. Note as well that I'm pretty sure the half-way solutions will drop empty lines from the result.
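To illustrate the empty-line caveat, here is a minimal sketch in plain Perl (no Perl6::Slurp needed). It relies on the documented behaviour of split, which discards trailing empty fields unless a negative limit is given. The helper name split_lines is hypothetical and not part of the benchmark above; it is just one way to keep empty lines when splitting a whole buffer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the benchmark): split a whole buffer on
# \r\n, \n or \r. The -1 limit tells split to keep trailing empty fields.
sub split_lines {
    my ($text) = @_;
    my @lines = split /\r\n?|\n/, $text, -1;
    # A terminator at the very end of the buffer yields one extra empty
    # field; drop it so "a\n" gives ("a") rather than ("a", "").
    pop @lines if @lines && $lines[-1] eq '';
    return @lines;
}

# Per-line splitting, as in local_split: a line consisting only of "\n"
# splits into empty fields, and split discards those by default.
my @per_line = map { split(/\n|\r\n?/, $_) } ("a\n", "\n", "b\n");

# Whole-buffer splitting keeps the empty line in the middle.
my @whole = split_lines("a\n\nb\n");

print scalar(@per_line), " vs ", scalar(@whole), "\n";   # prints "2 vs 3"
```

The per-line variant loses the empty line because each line is split on its own, so every empty field is a trailing one. Splitting the whole buffer, as the naive version does, only ever discards empty fields at the very end.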