ashish.kvarma has asked for the wisdom of the Perl Monks concerning the following question:
I was looking at arunshankar.c's post XML parsing and thought the same could be done (slightly faster) using a localized $INPUT_RECORD_SEPARATOR.
I assumed (I don't know why) that it would be faster using $INPUT_RECORD_SEPARATOR. To see how much faster it is, I ran a small benchmark, but was astonished at the results.
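To make the comparison concrete, here is a minimal sketch of the two approaches on a tiny in-memory string rather than a file. The sample data and the sub names `records_by_split` and `records_by_irs` are made up for illustration and are not part of the benchmark.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of removed.xml.
my $data = 'alpha|beta|gamma';

# Approach 1: take the whole text and split it on the separator.
sub records_by_split {
    my ($text) = @_;
    return split /\|/, $text;
}

# Approach 2: let readline return one '|'-terminated record at a time.
sub records_by_irs {
    my ($text) = @_;
    local $/ = '|';    # $INPUT_RECORD_SEPARATOR, restored when the sub returns
    open my $fh, '<', \$text or die "Error [$!]\n";    # in-memory filehandle
    my @records;
    while (my $record = <$fh>) {
        chomp $record;    # chomp strips the current $/, here '|'
        push @records, $record;
    }
    close $fh;
    return @records;
}

print join(',', records_by_split($data)), "\n";
print join(',', records_by_irs($data)), "\n";
```

Both subs should return the same list for this input, so any timing difference comes from how the records are produced, not from what is produced.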
Below are the results and the benchmark code I used.
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $count = -100;

cmpthese($count, {
    'Split' => sub {
        my $document;
        open(FILE, 'removed.xml') or die "Error [$!]\n";
        while (<FILE>) {
            $document .= $_
        }
        my @lines = split('\|',$document);
    },
    'IRS_while' => sub {
        local $/ = '|';
        my @lines;
        open(FILE, 'removed.xml') or die "Error [$!]\n";
        while (<FILE>) {
            chomp;
            push @lines, $_;
        }
    },
    'IRS_map' => sub {
        local $/ = '|';
        open(FILE, 'removed.xml') or die "Error [$!]\n";
        my @lines = map {chomp; $_} (<FILE>);
    },
});
            Rate IRS_map IRS_while Split
IRS_map   4936/s      --       -7%   -8%
IRS_while 5303/s      7%        --   -2%
Split     5394/s      9%        2%    --
I have run this multiple times with different values of $count; splitting the string seems to have a slight advantage in all cases.
For a while I thought I might be doing something wrong in the code, though I am not able to see any issue with it. I guess splitting is a bit faster (probably not significantly, but it is what it is).
Can someone please help me understand why Split is faster than using $INPUT_RECORD_SEPARATOR?
Thanks in advance.
P.S.: I don't know if it's important, but just for information, I am using ActivePerl 5.16 on Windows 7, 32-bit, Intel Core i3.
Replies are listed 'Best First'.
Re: Benchmark results | localizing $INPUT_RECORD_SEPARATOR vs spliting contents of file on $INPUT_RECORD_SEPARATOR
by MidLifeXis (Monsignor) on Nov 06, 2012 at 13:32 UTC
by Anonymous Monk on Nov 06, 2012 at 14:24 UTC
by MidLifeXis (Monsignor) on Nov 06, 2012 at 14:32 UTC
Re: Benchmark results | localizing $INPUT_RECORD_SEPARATOR vs spliting contents of file on $INPUT_RECORD_SEPARATOR
by 2teez (Vicar) on Nov 06, 2012 at 12:42 UTC