Populating a hash

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Populating a hash by ww (Archbishop) on Mar 11, 2012 at 13:18 UTC
"most efficient implementation?" Most efficient by what measure? ...programmer effort? ...RAM usage? ...storage requirements (if any)? Please clarify your question.	[reply]
Re^2: Populating a hash by Anonymous Monk on Mar 11, 2012 at 15:13 UTC
what i had in mind was lower code lineage - being hopefully reflected in cpu utilization	[reply]
Re^3: Populating a hash by ww (Archbishop) on Mar 11, 2012 at 16:19 UTC
Anytime you have to preface a desired result with "hopefully," you have a wish, not a plan. Some suggestions for alternate coding are surely forthcoming ... but my suggestion is "benchmark;" don't just hope, know. Fewer lines does emphatically NOT `eq` "faster" code or "lower cpu utilization." There's so much going on behind the green curtain (cf Wizard of Oz) that benchmarking or intimate familiarity with perlguts (which labels itself only as an "introduction" to the Perl API) and the deeper reaches of Perl wisdom and `C` will you be able to confidently predict any particular case. And not just BTW, it's my suspicion that Perl will usually do a better job of optomizing-away bits of my code than I will.	[reply] [d/l] [select]
Re^2: Populating a hash by ansh batra (Friar) on Mar 11, 2012 at 13:57 UTC
ww: how can you change the stograge style of a data structure? and whenever it comes to efficiency, i dnt think that programmer effort is taken into picture. I have always taken efficiency as cpu and time consumption	[reply]
Re^3: Populating a hash by ww (Archbishop) on Mar 11, 2012 at 14:38 UTC
storage? any one of many compression techniques. Think '.zip' or RLE or bit vectors or .... efficiency? Your usual choices are usually valid, but would you consider them the best measure of the "efficiency" of producing a one-off or rarely used program? Being concerned solely with "cpu and time^Update1 consumption" and ignoring "programmer time" might be a serious case of premature optimization. Update 1: I assume from context that the "time" reference meant "run time."	[reply]
Re^3: Populating a hash by Anonymous Monk on Mar 11, 2012 at 14:36 UTC
how can you change the stograge style of a data structure? use Storable i dnt think that programmer effort is taken into picture. I have always taken efficiency as cpu and time consumption programmer effort consumes time	[reply]
Re: Populating a hash by RichardK (Parson) on Mar 11, 2012 at 13:05 UTC
I don't know. Have you measured it? You could use Benchmark to test it against other implementations and see.	[reply]
Re: Populating a hash by morgon (Priest) on Mar 11, 2012 at 15:39 UTC
I believe (not 100% sure though) that in your case (as you are putting one million keys into your hash and I assume you want to get them out again later) you should preallocate space by telling perl you need more buckets -i.e. `my %hash; keys(%hash)=1_000_000;` [download] The number you assign to keys is rounded up to the next power of two. My understanding is (I would be interested to learn if this is correct) that when you have more buckets you have less collisions (i.e. entries that fall into the same bucket) and so retrieval of an entry is more efficient.	[reply] [d/l]
Re^2: Populating a hash by BrowserUk (Patriarch) on Mar 11, 2012 at 15:43 UTC
Because of the way $a*$b works, he's creating somewhat less that 250,000 keys. Many are incremented multiple times. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. The start of some sanity?	[reply]
Re^3: Populating a hash by morgon (Priest) on Mar 11, 2012 at 15:50 UTC
You are right of course. And all the primes above 1000 don't get incremented at all... So would preallocating buckets be helpful or hurtful here (or would it not matter at all)?	[reply]
Re^4: Populating a hash by BrowserUk (Patriarch) on Mar 11, 2012 at 16:58 UTC