Re^2: Hash searching

Most of the difference between limit_undef and nolim_undef actually comes from the undef in the list, which makes perl split one additional time just to throw the extra field away. When the list on the left side has a fixed length, split is called with an implicit limit of length+1. So my ($id, undef) = split ' ', $string; is actually my ($id, undef) = split ' ', $string, 3;. The benchmark rewritten without the undef:

use strict;
use warnings;

use Benchmark qw{cmpthese};

my $string = ' J00153:42:HC5NCBBXX:6:1101:10896:14959   99  gnl|Btau_4.6.1|chr16    72729218    1   12M';

cmpthese 1e7 => {
    limit_undef => sub { my ($id,) = split ' ', $string, 2 },
    nolim_undef => sub { my ($id,) = split ' ', $string },
    limit_array => sub { my ($id, @rest) = split ' ', $string, 2 },
    nolim_array => sub { my ($id, @rest) = split ' ', $string },
};

gives the result:

                 Rate nolim_array limit_array limit_undef nolim_undef
nolim_array  573888/s          --        -59%        -67%        -67%
limit_array 1396453/s        143%          --        -20%        -21%
limit_undef 1746725/s        204%         25%          --         -1%
nolim_undef 1760873/s        207%         26%          1%          --

So you can omit the limit when splitting to a fixed-size list: perl supplies it implicitly, and limit_undef and nolim_undef end up within about 1% of each other.
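
To make the implicit limit visible without a benchmark, here is a small sketch. The sample string and variable names are made up for illustration; only the split behaviour itself comes from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen only to keep the fields easy to read.
my $string = 'a b c d';

# Explicit limit of 2: the second field keeps the unsplit remainder.
my ($first, $rest) = split ' ', $string, 2;
print "explicit 2: [$first] [$rest]\n";      # explicit 2: [a] [b c d]

# Two-element list, no limit: perl uses an implicit limit of 3, so the
# string is cut into 'a', 'b' and 'c d', and the third piece is discarded.
my ($id, $second) = split ' ', $string;
print "implicit 3: [$id] [$second]\n";       # implicit 3: [a] [b]

# The same three pieces, kept this time by passing the limit explicitly.
my @pieces = split ' ', $string, 3;
print "pieces: ", join('|', @pieces), "\n";  # pieces: a|b|c d

# One-element list, no limit: implicit limit of 2, i.e. a single cut.
my ($only) = split ' ', $string;
print "one element: [$only]\n";              # one element: [a]
```

The second case is the one that cost time in the original benchmark: replacing $second with undef still leaves a two-element list, so perl makes two cuts instead of one.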

Re^3: Hash searching by kcott (Archbishop) on Aug 31, 2016 at 04:56 UTC
++ Thanks for the additional information and benchmarks. -- Ken