in reply to Re^2: Optimization, side-effects and wrong code
in thread Optimization, side-effects and wrong code

keys builds the whole list prior to usage

I thought I remembered reading somewhere that keys has an optimization in this case (a foreach loop), so the whole list is not actually built up front. I couldn't find any documentation, so I decided to run a few tests. Perhaps my test code is flawed in some way I don't see, but the results seem to agree with my premise.

Here are my snippets:

Snippet a

perl -le '%hash = (1 .. COUNT); print ">"; <STDIN>'

Snippet b

perl -le '%hash = (1 .. COUNT); @keys = keys %hash; print ">"; <STDIN>'

Snippet c

perl -le '%hash = (1 .. COUNT); for (keys %hash) { print ">"; <STDIN> }'

Here's a table with the results:

COUNT       a mem     b mem     c mem     a-b change       a-c change
100,000     11,612    13,008    12,212    1,396 (12.0%)    600 (5.17%)
1,000,000   97,320    111,288   103,324   13,968 (14.4%)   6,004 (6.17%)

Update: I should have mentioned earlier, these are from perl 5.005_03 on FreeBSD 4.9-STABLE. I got the value from the VSZ column in ps aux output.

The change from snippet a to b is much larger than the change from a to c. This seems consistent with the optimization I thought should occur. There may be overhead in the array @keys, but I wouldn't think it would be so much more than building an equivalent list. Maybe someone can explain this another way?

Re^4: Optimization, side-effects and wrong code
by leriksen (Curate) on Sep 30, 2004 at 02:28 UTC
    Reverend,

    Can you tell us the method you used to get the memory figures? And your platform/perl details too?
    I ran your snippets on my RH9 box with perl v5.8.0 built for i386-linux-thread-multi, and I made one change, turning snippet c into

    perl -le '%hash = (1 .. 1000000); for (keys %hash) { print ">"; <STDIN>; exit }'

    to ease the snippet's death.

    I just used top to look at the Size column, and my results are

    COUNT       a mem     b mem     c mem     a-b change      a-c change
    100,000     10,960    11,984    11,004    1,024 (9.3%)    44 (0.4%)
    1,000,000   94,324    101,000   94,460    6,676 (7.1%)    136 (0.1%)

    Note that top on RH9 shows the snippet b figure for 1,000,000 items only as 101M, so that value is rounded, and I haven't figured out how to change the units for this field. Perhaps I need to use something other than top...

    So in conclusion, I agree: there does appear to be some optimisation going on. Is this an example of lazy evaluation?

    use brain;