Re-run your benchmark after making a few changes:
sub isSubsetHash { my $count = my @small = split ':', $_[0]; my %big; @big{ split( ':', $_[1] ) } = (); exists( $big{$_} ) && $count-- for @small; !$count; }
Removing say from the hash technique removes senseless IO operations that the other two techniques aren't testing.
The rewrite of the hash technique internalizes the loop used to build the hash, and takes advantage of the fact that an exists check is cheaper than a value check.
The increase of the data set size shows what the trend is as you try to upscale the different solutions. It's unlikely that the OP has a data set so small as only five elements in the haystack, and two in the needles set. In fact, it's unlikely that it's as small as 15 and 6. But it doesn't take much upscaling for the results of your benchmark to flip dramatically.
Dave
In reply to Re^2: Computing Subsets
by davido
in thread Computing Subsets
by grandpascorpion
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |