in reply to Re: Re: Re: Perl's pearls
in thread Perl's pearls
it becomes almost 20% faster.

print map  { "@$_\n" }
      sort { $a->[0] cmp $b->[0] }
      grep { @$_ > 1 }
      values %words;
This script touches the hash three times directly and twice conditionally. The first access is the exists test. If that test is true, two more accesses follow, but only for the items that have anagrams or duplicates (in our case, about 15% of the items). Then we access the hash once to insert the word and once more, at the end, to get the results.

while (<>) {
    chomp;
    $_ = lc $_;
    $signature = pack "C*", sort unpack "C*", $_;
    if (exists $words{$signature}) {
        next if $words{$signature} =~ /\b$_\b/;
        $words{$signature} .= " ";
    }
    $words{$signature} .= $_;
}
print join "\n", sort grep {tr/ //} values %words;
print "\n";
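To make the signature step concrete, here is a minimal sketch that computes it for a few sample words. The helper name signature and the word list are mine, not part of the script above, and I sort numerically here while the loop uses the default string sort. Either works for grouping, as long as the same ordering is used for every word.

```perl
use strict;
use warnings;

# Hypothetical helper: the loop above computes this inline.
# The signature of a word is its characters in sorted order,
# so all anagrams of a word share one hash key.
sub signature {
    my ($word) = @_;
    return pack "C*", sort { $a <=> $b } unpack "C*", lc $word;
}

# Sample words (an assumption, chosen for illustration).
for my $word (qw(listen silent enlist google)) {
    printf "%-8s => %s\n", $word, signature($word);
}
```

The first three words all produce the signature "eilnst", so they would end up in the same hash value.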
The following code fragment shows the effects of string interpolation. An array interpolated into a string has its items separated by a space. This is standard Perl behavior. The operation is roughly the same as calling join on the array explicitly, and this should account for the slower performance.

my @array = qw(one two three);
print @array;     # output : 'onetwothree'
                  # it's the same as foreach (@array) { print $_ }
print "@array";   # output : 'one two three'
                  # it's the same as print join " ", @array;
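The space used by interpolation comes from the special variable $" (the list separator), which is why interpolation behaves like a join. A small sketch, with variable names of my own choosing, that changes the separator temporarily and compares the result with an explicit join:

```perl
use strict;
use warnings;

my @array = qw(one two three);

# Interpolation with the default separator (a single space).
my $default = "@array";

# Interpolation with a temporary separator; 'local' restores $"
# when the do block ends.
my $custom = do { local $" = "-"; "@array" };

# Explicit join for comparison.
my $joined = join "-", @array;

print "$default\n";    # one two three
print "$custom\n";     # one-two-three
print "same result\n" if $custom eq $joined;
```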
#!/usr/bin/perl -w
use strict;
use Benchmark;

my $iterations = 200_000;
my %with_str;        # hash containing strings
my %with_arr;        # hash containing arrays
my $strcount = 0;    # counter for hash of strings
my $arcount  = 0;    # counter for hash of arrays
my ($constant1, $constant2) = ("abcd", "dcba");   # strings used to fill the items

timethese ($iterations,   # inserts two elements per each hash value
{
    "insert string" => sub {
        $with_str{$strcount}   .= "$constant1$strcount";
        $with_str{$strcount++} .= " $constant2$strcount"
    },
    "push array" => sub {
        push @{$with_arr{$arcount}},   "$constant1$arcount";
        push @{$with_arr{$arcount++}}, "$constant2$arcount"
    }
});

my $count = 0;
$arcount  = 0;
$strcount = 0;
timethese ($iterations,   # counts items for each hash value
{
    "count string items" => sub {
        $count = $with_str{$strcount++} =~ tr/ //;
    },
    "count array items" => sub {
        $count = scalar @{$with_arr{$arcount++}}
    }
});

$arcount  = 0;
$strcount = 0;
my $output = "";
timethese ($iterations,   # string interpolation
{
    "fetch string" => sub {
        $output = "$with_str{$strcount++}"
    },
    "fetch array" => sub {
        $output = "@{$with_arr{$arcount++}}"
    }
});

$count    = 0;
$arcount  = 0;
$strcount = 0;
timethese ($iterations,   # access separate items
{
    "items from string" => sub {
        foreach (split / /, $with_str{$strcount}) {
            $output = $_;
        }
        $strcount++;
    },
    "items from array" => sub {
        foreach ( @{$with_arr{$arcount}}) {
            $output = $_;
        }
        $arcount++;
    }
});

=pod

Benchmark: timing 200000 iterations of insert string, push array...
insert string:  3 wallclock secs ( 1.92 usr +  0.14 sys =  2.06 CPU)
   push array:  3 wallclock secs ( 2.39 usr +  0.15 sys =  2.54 CPU)

timing 200000 iterations of count array items, count string items...
count string items:  2 wallclock secs ( 0.83 usr +  0.00 sys =  0.83 CPU)
 count array items:  0 wallclock secs ( 0.64 usr +  0.00 sys =  0.64 CPU)

timing 200000 iterations of fetch array, fetch string...
fetch string:  1 wallclock secs ( 0.59 usr +  0.00 sys =  0.59 CPU)
 fetch array:  2 wallclock secs ( 1.13 usr +  0.00 sys =  1.13 CPU)

timing 200000 iterations of items from array, items from string...
items from string:  2 wallclock secs ( 2.65 usr +  0.07 sys =  2.72 CPU)
 items from array:  1 wallclock secs ( 2.02 usr +  0.07 sys =  2.09 CPU)

totals (inserting, counting items in each hash value,
and fetching all the values at once)
string: 3.34
array : 4.16

totals (inserting, counting items, and fetching items
one by one from each hash value)
string: 5.40
array : 5.05

=cut
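If you would rather see the differences as ratios than add up the timings by hand, the Benchmark module also offers cmpthese. The following is only a sketch on a three-element array of my own, not a rerun of the tests above, and the numbers will vary from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Sample data (an assumption, not the hashes benchmarked above).
my @array = qw(one two three);
my $output;

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds instead of a fixed number of iterations.
cmpthese(-1, {
    interpolate => sub { $output = "@array" },
    join        => sub { $output = join " ", @array },
});
```

cmpthese prints a small table with the rate of each sub and the percentage difference between them.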
 _  _ _  _
(_|| | |(_|><
 _|