Re: Needed Performance improvement in reading and fetching from a file

Let's optimize the split function and see how fast we can get:

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(:all);

# Set up an array of data to test.
# The record is a single line; it is concatenated here only to keep the source readable.
my $data = '0906928472847292INR~UTRIR8709990166~     700000~INR~200806'
         . '23~RC425484~IFSCSEND001                       ~Remiter Details ~10000'
         . '07   ~TEST RTGS TRF7                     ~                           '
         . '        ~                                   ~ ~RTGS~REVOSN OIL CORPOR'
         . 'ATION   ~IOCL  ~09065010889~0906501088900122INR~         7~         1'
         . '~ 1';
my @data;
push @data, $data for 1 .. 1000;

cmpthese(1000, {
    'Full split'      => sub {my @refnos; for (@data){my @splitted = split(/~/,$_);   push(@refnos,$splitted[1]);}},
    'Limited split'   => sub {my @refnos; for (@data){my @splitted = split(/~/,$_,3); push(@refnos,$splitted[1]);}},
    'Optimized split' => sub {my @refnos; for (@data){push @refnos, (split(/~/,$_,3))[1];}},
    });

Results:

                  Rate      Full split   Limited split Optimized split
Full split      23.1/s              --            -82%            -93%
Limited split    127/s            452%              --            -63%
Optimized split  343/s           1387%            169%              --

Some things to remember:

Don't split into more fields than needed.
Avoid assigning to an array if you only need one field.
Even better, use split with an index into its results directly as an argument to push.

By using these simple rules, we get an almost 15-fold speedup (343/s versus 23.1/s)!
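
To make the effect of the limit concrete, here is a minimal sketch. The record is a shortened, made-up stand-in for the benchmark data, and `$record` and `second_field` are illustrative names, not part of the benchmark above. With a limit of 3, split stops after the second separator and leaves the rest of the line unsplit in the last element, so no work is spent on fields we never look at.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened, hypothetical record in the same '~'-separated layout.
my $record = '0906928472847292INR~UTRIR8709990166~     700000~INR~20080623';

# A limit of 3 yields at most three elements;
# the third holds the unsplit remainder of the line.
my @parts = split /~/, $record, 3;
print scalar(@parts), " parts\n";
print "remainder: $parts[2]\n";

# Hypothetical helper: take only the second field via a list slice,
# without assigning to an intermediate array.
sub second_field {
    my ($line) = @_;
    return (split /~/, $line, 3)[1];
}

print second_field($record), "\n";
```

The list slice `(split ...)[1]` is what the 'Optimized split' case in the benchmark does inline.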

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James


Replies are listed 'Best First'.
Re^2: Needed Performance improvement in reading and fetching from a file by wfsp (Abbot) on Oct 08, 2008 at 07:49 UTC
I wondered how a plain regex would fare. Probably not enough in it to be significant.

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(:all);

# Set up an array of data to test.
# The record is a single line; it is concatenated here only to keep the source readable.
my $data = '0906928472847292INR~UTRIR8709990166~     700000~INR~200806'
         . '23~RC425484~IFSCSEND001                       ~Remiter Details ~10000'
         . '07   ~TEST RTGS TRF7                     ~                           '
         . '        ~                                   ~ ~RTGS~REVOSN OIL CORPOR'
         . 'ATION   ~IOCL  ~09065010889~0906501088900122INR~         7~         1'
         . '~ 1';
my @data;
push @data, $data for 1 .. 1000;

cmpthese(
    1000,
    {
        'Full split'      => sub {my @refnos; for (@data){my @splitted = split(/~/,$_);   push(@refnos,$splitted[1]);}},
        'Limited split'   => sub {my @refnos; for (@data){my @splitted = split(/~/,$_,3); push(@refnos,$splitted[1]);}},
        'Optimized split' => sub {my @refnos; for (@data){push @refnos, (split(/~/,$_,3))[1];}},
        'regex'           => sub {my @refnos; for (@data){push @refnos, /~([^~]+)~/;}},
    }
);

                  Rate    Full split Limited split Optimized split    regex
Full split      23.3/s            --          -81%            -90%     -91%
Limited split    120/s          413%            --            -51%     -54%
Optimized split  244/s          946%          104%              --      -7%
regex            262/s         1024%          119%              7%       --
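
One thing to keep in mind when swapping the split for the regex: the two are not strictly equivalent. The sketch below uses a made-up record with an empty second field (`$line` and the anchored pattern are my own assumptions, not something benchmarked above) to show where they differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record whose second field is empty.
my $line = 'A~~C~D~E';

# split keeps the empty field, so the slice yields ''.
my $via_split = (split /~/, $line, 3)[1];

# The regex from the benchmark requires at least one non-'~' character,
# so it skips the empty field and captures the next non-empty one.
my ($via_regex) = $line =~ /~([^~]+)~/;

# An anchored variant (an assumption, not benchmarked here)
# that also accepts an empty second field.
my ($via_anchored) = $line =~ /^[^~]*~([^~]*)~/;

printf "split: '%s', regex: '%s', anchored: '%s'\n",
    $via_split, $via_regex, $via_anchored;
```

For data like the benchmark record, where the second field is never empty, both approaches return the same value; the difference only matters if empty fields can occur.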