in reply to Out of memory problem when copying contents of file into array

Here are the benchmarks on an 8.6MB, 156,896-line file:
                      Rate TedYoung   OP's RandomWalk `tail -100` IO::All->backwards File::ReadBackwards
TedYoung            1.37/s       --   -14%       -39%        -99%               -99%               -100%
OP's                1.60/s      17%     --       -29%        -99%               -99%               -100%
RandomWalk          2.26/s      64%    41%         --        -99%               -99%               -100%
`tail -100`          157/s   11309%  9679%      6841%          --               -37%                -82%
IO::All->backwards   251/s   18145% 15538%     10999%         60%                 --                -71%
File::ReadBackwards  868/s   63105% 54075%     38349%        454%               246%                  --
Update: It's interesting to note that on a much smaller file (47KB in my case), tail occasionally wins over File::ReadBackwards and IO::All, but those three consistently outperform the others.

Update (again): I didn't look at the OP's code closely enough when I made the benchmark. I've fixed it so that it works comparably to the other examples. I also removed the split() from the `tail -100` entry, which (much to my surprise) isn't necessary. Neither of these changes appears to have affected the speed comparisons in any measurable way...

The code I used to do the benchmark:
use strict;
use Benchmark qw(cmpthese);
use lib '/home/mason/devel/lib';
use IO::All;
use File::ReadBackwards;

my $f = "/usr/local/apache/logs/error_log";

cmpthese( -3, {
    'OP\'s' => sub {
        my $lineno = 100;
        open( FILE, "$f" ) or die "Can't find $f\n";
        my @lines = <FILE>;
        my $num   = @lines;
        my @last;
        for ( ; $lineno > 0 ; $lineno-- ) {
            push( @last, $lines[ $num - $lineno ] );
        }
        close(FILE);
    },
    'IO::All->backwards' => sub {
        my $io    = io($f)->backwards;
        my @lines = reverse map { $io->getline } 1 .. 100;
    },
    'File::ReadBackwards' => sub {
        my $bw    = File::ReadBackwards->new($f);
        my @lines = reverse map { $bw->readline } 1 .. 100;
    },
    'TedYoung' => sub {
        my $lines = 100;
        open F, $f or die $!;
        my @lines;
        while (<F>) {
            push @lines, $_;
            shift @lines if @lines > $lines;   # keep only the last 100 lines
        }
        close F;
    },
    '`tail -100`' => sub {
        my @lines = `tail -100 $f`;
    },
    'RandomWalk' => sub {
        my ( $file, $lines ) = ( $f, 100 );
        open F, $file or die $!;
        my @lines;
        $#lines = $lines - 1;   # preallocate a 100-slot ring buffer
        my $i = 0;
        while (<F>) {
            $lines[ $i++ ] = $_;
            $i = 0 if $i == $lines;
        }
        close F;
        # $i now marks the oldest slot; splice the ring back into file order
        return @lines[ $i-- .. $#lines, 0 .. $i ];
    },
} );
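Since File::ReadBackwards wins by such a wide margin, here's a minimal standalone sketch of pulling the last 100 lines with it, outside the benchmark harness. The path is the same error_log as above; the variable names are my own:

use strict;
use warnings;
use File::ReadBackwards;

my $file = "/usr/local/apache/logs/error_log";
my $bw   = File::ReadBackwards->new($file)
    or die "Can't open $file: $!";

# readline() hands back lines newest-first, so unshift to restore file order
my @last;
while ( defined( my $line = $bw->readline ) ) {
    unshift @last, $line;
    last if @last == 100;
}
print @last;

Unlike the slurp-everything approaches, this only ever reads the tail of the file, which is what keeps memory flat and avoids the out-of-memory problem in the first place.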

Re^2: Out of memory problem when copying contents of file into array (Benchmarks)
by Anonymous Monk on Feb 18, 2005 at 17:08 UTC
    'OP\'s' => sub {
        my $lineno = 100;
        open( FILE, "$f" ) or die "Can't find $f\n";
        my @lines = <FILE>;
        my $num   = @lines;
        for ( ; $lineno > 0 ; $lineno-- ) {
            my @tail = @lines[ $num - $lineno ];
        }
        close(FILE);
    },
    This can't be correct. You never create an array with 100 lines. You do create 100 arrays with one line each, using a single element slice ('use warnings' complains).
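    For what it's worth, a single range slice presumably does what was intended, using those same variables:
        my @tail = @lines[ $num - $lineno .. $num - 1 ];
    That grabs the last 100 lines in one assignment, though it still slurps the whole file first.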
    '`tail -100`' => sub { my @lines = split( /\n/, `tail -100 $f` ); },
    Why the explicit splitting? Why not just
    my @lines = `tail -100 $f`;
    Not that it makes a huge difference speedwise.
      I didn't write it... I just copied it from the OP's post. Even a cursory look shows it's obviously broken, but I didn't give it one when I was setting up the benchmark. My mistake.