in reply to Needed Performance improvement in reading and fetching from a file

Let's optimize the split function and see how fast we can get:
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(:all);

# Set up an array of data to test
my $data = '0906928472847292INR~UTRIR8709990166~ 700000~INR~20080623~RC425484~IFSCSEND001 ~Remiter Details ~1000007 ~TEST RTGS TRF7 ~ ~ ~ ~RTGS~REVOSN OIL CORPORATION ~IOCL ~09065010889~0906501088900122INR~ 7~ 1~ 1';
my @data;
push @data, $data for 1 .. 1000;

cmpthese(1000, {
    'Full split'      => sub { my @refnos; for (@data) { my @splitted = split(/~/, $_);    push(@refnos, $splitted[1]); } },
    'Limited split'   => sub { my @refnos; for (@data) { my @splitted = split(/~/, $_, 3); push(@refnos, $splitted[1]); } },
    'Optimized split' => sub { my @refnos; for (@data) { push @refnos, (split(/~/, $_, 3))[1]; } },
});

Results:

                   Rate Full split Limited split Optimized split
Full split      23.1/s         --          -82%            -93%
Limited split    127/s       452%            --            -63%
Optimized split  343/s      1387%          169%              --

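Some things to remember:

- Don't split more than you need: give split a limit (here 3) so it stops as soon as the field you want has been isolated.
- Don't store the result in a temporary array when you need only one element: take a list slice of split's return value directly.

By using these simple rules, we get an almost 14-fold speed increase!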
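
For anyone unsure what the limit and the list slice in the benchmark actually do, here is a minimal sketch. The record is a shortened, made-up one (not the data above), and the variable names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical, shortened record; only the field layout matters here.
my $line = 'ACC001~REF123~700000~INR~20080623';

# With a limit of 3, split stops after the second '~'; the third
# element holds the unsplit remainder of the line.
my @parts = split /~/, $line, 3;
print scalar(@parts), " parts, rest = '$parts[2]'\n";

# The list slice picks field 1 directly, without a named temporary array.
my $refno = (split /~/, $line, 3)[1];
print "refno = $refno\n";
```

The limit has to be one more than the index of the field you want (index 1 needs a limit of 3), otherwise the wanted field ends up glued to the rest of the line.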

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re^2: Needed Performance improvement in reading and fetching from a file
by wfsp (Abbot) on Oct 08, 2008 at 07:49 UTC
    I wondered how a plain regex would fare. Probably not enough in it to be significant.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Benchmark qw(:all);

    # Set up an array of data to test
    my $data = '0906928472847292INR~UTRIR8709990166~ 700000~INR~20080623~RC425484~IFSCSEND001 ~Remiter Details ~1000007 ~TEST RTGS TRF7 ~ ~ ~ ~RTGS~REVOSN OIL CORPORATION ~IOCL ~09065010889~0906501088900122INR~ 7~ 1~ 1';
    my @data;
    push @data, $data for 1 .. 1000;

    cmpthese(1000, {
        'Full split'      => sub { my @refnos; for (@data) { my @splitted = split(/~/, $_);    push(@refnos, $splitted[1]); } },
        'Limited split'   => sub { my @refnos; for (@data) { my @splitted = split(/~/, $_, 3); push(@refnos, $splitted[1]); } },
        'Optimized split' => sub { my @refnos; for (@data) { push @refnos, (split(/~/, $_, 3))[1]; } },
        'regex'           => sub { my @refnos; for (@data) { push @refnos, /~([^~]+)~/; } },
    });
                       Rate Full split Limited split Optimized split regex
    Full split      23.3/s         --          -81%            -90%  -91%
    Limited split    120/s       413%            --            -51%  -54%
    Optimized split  244/s       946%          104%              --   -7%
    regex            262/s      1024%          119%              7%    --
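
    One caveat when swapping the split for the regex: the two are not equivalent when the second field is empty. The sketch below uses a made-up record to show the difference; the anchored pattern is just one possible variant that mirrors split, not something that was benchmarked above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record whose second field is empty.
my $line = 'ACC001~~700000~INR';

# split keeps the empty field in position 1.
my $by_split = (split /~/, $line, 3)[1];

# [^~]+ needs at least one character, so the match slides along
# to the next non-empty field between two tildes.
my ($by_regex) = $line =~ /~([^~]+)~/;

# Anchoring at the start and allowing an empty capture mirrors split.
my ($by_anchored) = $line =~ /^[^~]*~([^~]*)/;

print "split:    '$by_split'\n";     # ''
print "regex:    '$by_regex'\n";     # '700000'
print "anchored: '$by_anchored'\n";  # ''
```

    Also note that in list context a failed match returns an empty list, so the push in the benchmark would add nothing for a line without two tildes, whereas the split version would push undef.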