in reply to Re: Computing Subsets
in thread Computing Subsets
Re-run your benchmark after making a few changes:
sub isSubsetHash { my $count = my @small = split ':', $_[0]; my %big; @big{ split( ':', $_[1] ) } = (); exists( $big{$_} ) && $count-- for @small; !$count; }
Removing say from the hash technique removes senseless IO operations that the other two techniques aren't testing.
The rewrite of the hash technique internalizes the loop used to build the hash, and takes advantage of the fact that an exists check is cheaper than a value check.
The increase of the data set size shows what the trend is as you try to upscale the different solutions. It's unlikely that the OP has a data set so small as only five elements in the haystack, and two in the needles set. In fact, it's unlikely that it's as small as 15 and 6. But it doesn't take much upscaling for the results of your benchmark to flip dramatically.
Dave
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^3: Computing Subsets
by Kenosis (Priest) on May 28, 2012 at 06:41 UTC | |
by davido (Cardinal) on May 28, 2012 at 06:56 UTC |