Re^3: Optimization, side-effects and wrong code

keys builds the whole list prior to usage

I thought I remembered reading somewhere that keys is optimized in this case, so that the whole list is not actually built up front. I couldn't find any documentation, so I ran a few tests. Perhaps my test code is flawed in some way I don't see, but the results seem to agree with my premise.

Here are my snippets (COUNT stands for the number in the first column of the results table):

Snippet a: perl -le '%hash = (1 .. COUNT); print ">"; <STDIN>'
Snippet b: perl -le '%hash = (1 .. COUNT); @keys = keys %hash; print ">"; <STDIN>'
Snippet c: perl -le '%hash = (1 .. COUNT); for (keys %hash) { print ">"; <STDIN> }'

Here's a table with the results:

COUNT	a mem	b mem	c mem	a-b change	a-c change
100,000	11,612	13,008	12,212	1,396 (12.0%)	600 (5.17%)
1,000,000	97,320	111,288	103,324	13,968 (14.4%)	6,004 (6.17%)
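The change columns are the growth over snippet a, in absolute terms and as a percentage of a. A minimal sketch that recomputes them from the figures in the table; the helper name change is only illustrative:

```perl
use strict;
use warnings;

# Memory figures copied from the table above, keyed by COUNT.
my %vsz = (
    100_000   => { a => 11_612, b => 13_008,  c => 12_212 },
    1_000_000 => { a => 97_320, b => 111_288, c => 103_324 },
);

# Growth of $other over the baseline, returned as (absolute, percent).
sub change {
    my ($base, $other) = @_;
    my $delta = $other - $base;
    return ($delta, 100 * $delta / $base);
}

for my $count (sort { $a <=> $b } keys %vsz) {
    my $row = $vsz{$count};
    for my $snippet (qw(b c)) {
        my ($delta, $pct) = change($row->{a}, $row->{$snippet});
        printf "%9d  a-%s: %6d (%.2f%%)\n", $count, $snippet, $delta, $pct;
    }
}
```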

Update: I should have mentioned earlier that these are from perl 5.005_03 on FreeBSD 4.9-STABLE. I took the values from the VSZ column of ps aux output, so they are in kilobytes.

The change from snippet a to b is much larger than the change from a to c (more than twice as large in both rows). This seems consistent with the optimization I expected. There may be overhead in the array @keys, but I wouldn't expect it to cost so much more than building an equivalent list. Can someone explain this another way?
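For anyone who wants to repeat the measurement without pausing on STDIN and reading ps by hand, here is a rough sketch that samples the process's own VSZ at the same point in each snippet. The helper vsz_kb and the mode/count arguments are my own illustration, and it assumes a ps that accepts -o vsz= -p PID (I believe both FreeBSD and Linux do):

```perl
use strict;
use warnings;

# Hypothetical helper: virtual size of this process in KB as reported by ps.
# Returns undef if ps is unavailable or prints nothing usable.
sub vsz_kb {
    my $out = `ps -o vsz= -p $$`;
    return unless defined $out && $out =~ /(\d+)/;
    return $1;
}

# Arguments (illustrative): MODE is a, b or c; COUNT is the upper bound.
my $mode  = shift(@ARGV) || 'a';
my $count = shift(@ARGV) || 100_000;
my %hash  = (1 .. $count);
my $vsz;

if ($mode eq 'b') {
    my @keys = keys %hash;      # snippet b: copy the keys into an array
    $vsz = vsz_kb();
}
elsif ($mode eq 'c') {
    for (keys %hash) {          # snippet c: sample inside the loop, once
        $vsz = vsz_kb();
        last;
    }
}
else {
    $vsz = vsz_kb();            # snippet a: the hash alone
}

print "mode $mode, COUNT $count: ",
    (defined $vsz ? "$vsz KB" : "unknown"), "\n";
```

Each mode should be run in a fresh process, as with the original one-liners, since memory a process has already obtained is usually not handed back and would blur a second measurement taken in the same run.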


Re^4: Optimization, side-effects and wrong code
by leriksen (Curate) on Sep 30, 2004 at 02:28 UTC

Can you tell us the method you used to get the memory figures? And your platform/perl details too?
I ran your snippets on my RH9 box with perl v5.8.0 built for i386-linux-thread-multi, and I made one change to snippet c, adding an exit so the program ends after the first iteration:

 perl -le '%hash = (1 .. 1000000); for (keys %hash) { print ">"; <STDIN>; exit }'

I just used top to look at the Size column, and my results are:

COUNT	a mem	b mem	c mem	a-b change	a-c change
100,000	10,960	11,984	11,004	1,024 (9.3%)	44 (0.4%)
1,000,000	94,324	101,000	94,460	6,676 (7.1%)	136 (0.1%)

Note that top on RH9 truncates the snippet b figure for 1,000,000 items to 101M, so that row's a-b change is only approximate. I haven't figured out how to change the units for this field; perhaps I need to use something other than top...

So in conclusion, I agree: it does appear there is some optimisation going on. Is this an example of lazy evaluation?
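For comparison, and whatever keys turns out to do internally, each is the documented way to walk a hash one entry at a time, so it is the form that does not depend on any such optimisation. A small sketch with a tiny stand-in hash:

```perl
use strict;
use warnings;

my %hash = (1 .. 10);   # small stand-in for the 1 .. COUNT hash

# each hands back one key/value pair per call, so no list of
# all the keys has to exist in order to visit every entry.
my $seen = 0;
while (my ($key, $value) = each %hash) {
    $seen++;
}

print "visited $seen entries\n";
```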

use brain;
