Re^4: Concerning hash operations (appending, concatenating)

Replies are listed 'Best First'.
Re^5: Concerning hash operations (appending, concatenating) by japhy (Canon) on Mar 24, 2005 at 18:24 UTC
Here's my benchmark: `use Benchmark 'cmpthese'; use strict; my %hash = 1 .. 10_000; my %add = 9_001 .. 11_000; cmpthese(-5, { BASE => sub { my %copy = %hash; }, JAPHY => sub { my %copy = %hash; my @new = grep !exists $copy{$_}, keys %add; @copy{@new} = @add{@new}; }, ZAXO => sub { my %copy = %hash; %copy = (%add, %copy); }, NULL => sub { my %copy = %hash; %copy = (%copy); }, });` [download] Here are the results I get: `Rate ZAXO NULL JAPHY BASE ZAXO 15.1/s -- -22% -61% -76% NULL 19.3/s 28% -- -50% -69% JAPHY 38.9/s 157% 101% -- -38% BASE 62.4/s 312% 223% 60% --` [download] The two cases 'BASE' and 'NULL' show that Perl does not optimize `%hash = %hash` at all, since 'NULL' is three times slower than 'BASE'. 'JAPHY' is 2.5 times faster than 'ZAXO' in this run, with 500 old keys and 500 new keys. When I change %add to be (5_001 .. 12_000), the run times are: `Rate ZAXO NULL JAPHY BASE ZAXO 11.4/s -- -39% -58% -81% NULL 18.6/s 63% -- -31% -69% JAPHY 26.9/s 136% 45% -- -56% BASE 60.5/s 431% 225% 125% --` [download] The point is that `%a = (%b, %a)` is slow because it is O(N+M) where N is the size of %a and M is the size of %b. My code is O(M) -- specifically, O(M(1+x)) where x is the fraction of keys in %b that aren't in %a -- because it doesn't iterate over %a at all, only %b. _____________________________________________________ Jeff `japhy` Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and `perl` hacker How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart	[reply] [d/l] [select]
Re^6: Concerning hash operations (appending, concatenating) by Tanktalus (Canon) on Mar 24, 2005 at 18:45 UTC
The grep is superfluous and, IMO, only distracts from the code. Here's the sub to add to the above benchmark: `TANKTALUS => sub { my %copy = %hash; @copy{keys %add} = values %add; },` [download] And the benchmarks. First, the original 500/500 overlap/new keys: `Rate ZAXO NULL TANKTALUS JAPHY BASE ZAXO 12.3/s -- -23% -59% -59% -76% NULL 16.1/s 30% -- -46% -47% -68% TANKTALUS 29.8/s 142% 85% -- -2% -41% JAPHY 30.4/s 147% 89% 2% -- -40% BASE 50.9/s 313% 216% 71% 67% --` [download] As you can see, the difference is pretty insignificant compared to the extra "line noise" that needs to be parsed by the human who maintains it. I didn't think it'd change much with a larger sample, but I tried it anyway (using your 5_001..12_000 hash): `Rate ZAXO NULL JAPHY TANKTALUS BASE ZAXO 9.48/s -- -39% -55% -55% -81% NULL 15.7/s 65% -- -26% -26% -69% JAPHY 21.1/s 122% 35% -- -0% -59% TANKTALUS 21.1/s 123% 35% 0% -- -59% BASE 51.1/s 439% 226% 142% 142% --` [download] Somewhere, deep down in the lost precision, the non-grep version comes out ahead. Which we'll attribute to statistical insigificance, just as I did for the 2% in the other direction. ;-) The other point is that your answer does not do the same thing as the rest of the examples here. In all the rest of the cases, the new hash always overrides the values in the old hash. That is, if the old hash had a => "b", while the new hash had a => "c", the behaviour of yours vs the rest is different. Which one is correct is dependant on the demands of the situation, although I usually lean towards new hashes winning over old hashes the way I write the code. In cases where I want "first come, first served" policies, then I will definitely keep your grep in mind - it's probably a lot more elegant than the way I have been doing it (if I have - I'd have to look through old code to be sure). Thanks ;-)	[reply] [d/l] [select]
Re^7: Concerning hash operations (appending, concatenating) by japhy (Canon) on Mar 24, 2005 at 18:50 UTC
You missed the point, Tanktalus. We're only adding NEW key-values pairs to the hash, thus the grep() is required. The node by Zaxo was showing how to have the original values override the new values if the keys matched. _____________________________________________________ Jeff `japhy` Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and `perl` hacker How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart	[reply]