selecting random key from hash

davidj has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
•Re: selecting random key from hash by merlyn (Sage) on Jul 13, 2004 at 02:53 UTC
As you show it, that's the most efficient way to do it. So, you are saying that you want to do it less efficiently, but optimized for something else. What else is it that you're optimizing for? If memory space, then you can wander over the list of keys using `each` without ever extracting all the keys at once, selecting one at random: `my %hash = (populated somehow); my ($victim_key, $victim_value); for (0..int rand (scalar keys %hash)) { ($victim_key, $victim_value) = each %hash; } scalar keys %hash; # reset the iterator` [download] That'll get you the item without ever having built the entire list of keys in memory. But if that's not what you are optimizing for, you should clarify your question. If it's simply "don't use an array", then I call homework on the question (usually signalled by a bizarre "don't use X" where "X" is the most efficient way to do something). -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply.	[reply] [d/l]
Re^2: selecting random key from hash by Fletch (Bishop) on Jul 13, 2004 at 03:12 UTC
You could also use a variation on the random line from a file FAQ: `my( $key, $val ); my $ctr = 0; while( my( $k, $v ) = each %hash ) { rand( $ctr++ ) < 1 && do { $key = $k; $val = $v }; }` [download] Actually merlyn's is more efficient since it only reads up to the "victim" element, whereas this would hit all elements (which it has to for the original random line from a file version). Then again if this is a tied hash and FETCH had side-effects that might be what you want. At any rate see `perldoc -q "random line"` for more info.	[reply] [d/l] [select]
•Re^3: selecting random key from hash by merlyn (Sage) on Jul 13, 2004 at 03:20 UTC
Yeah, I actually thought of that first, and then discounted it for the reasons you mentioned. The big difference is that for the random-line solution, we have no cheap way of determining the entire set size, where with a hash, scalar-keys is cheap and tells us what we need to know. -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply.	[reply]
Re^2: selecting random key from hash by davidj (Priest) on Jul 13, 2004 at 04:12 UTC
Homework? I haven't had homework in almost 15 years. I didn't know it was the most efficient way. That's one of the reasons I asked the question. With the types of programs I typically write, I 1) use hashes store data in complex data structures and 2) have to randomly select data from the hash. I have always used the method I described in my post, but have been curious if there are other ways it could be done. (I started using Perl about a year ago, but not on a regular basis until about 5 months ago. So it is still pretty new to me). If the method I am using is the most efficient, then great. I will continue to use it. Thank you all for the input. davidj	[reply]
Re: selecting random key from hash by BrowserUk (Patriarch) on Jul 13, 2004 at 03:04 UTC
Essentially the same as Merlyn's, but I like this method of dealing with the undef produced by the each iterator wrapping. `#! perl -slw use strict; my %hash; @hash{ 'a' .. 'z' } = 'A' .. 'Z'; for( 1 .. 20 ){ my @four = map{ each %hash for 1 .. rand( keys %hash ); each %hash \|\| each %hash } 1 .. 4; print "@four"; } __END__ P:\test>373821 o a y u v i h i j i o x o g l g k a d d i n m s i v n l c d b n j p z e b n v h q r i u y m s r h s j k d u j v y j h i w l c w y v o o z m n k p w o h u v m a` [download] Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail "Memory, processor, disk in that order on the hardware side. Algorithm, algoritm, algorithm on the code side." - tachyon	[reply] [d/l]
Re: selecting random key from hash by duff (Parson) on Jul 13, 2004 at 03:45 UTC
Your criteria are curious. You could do something like this: `$random_element = (keys %hash)[rand keys %hash];` [download] That generates a list of all the keys each time you do it. (but it doesn't use an array! :-) duff	[reply] [d/l]
Re: selecting random key from hash by BrowserUk (Patriarch) on Jul 13, 2004 at 04:39 UTC
Another thought. If you are using 5.8.1 or later, then you shouldn't have to do use rand to get a random selection of hash keys. Using any of keys, values or each should give you a randomized sequence each time you call them and different for each run. However, my AS builds all seem to have this security feature explicitly disabled, and all my attempts to enable it on a self-built copy of 5.8.4 come to naught so far. I know what I'm supposed to set, but the deltas , POD and comments on the subject only refer to doing this with a Configure time option which is only applicable to *nix. Anyone know how to do this for win32? Becasue it certainly doesn't seem to enabled by default, which the docs suggest it should be. Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail "Memory, processor, disk in that order on the hardware side. Algorithm, algoritm, algorithm on the code side." - tachyon	[reply]
Re^2: selecting random key from hash by hv (Prior) on Jul 13, 2004 at 09:23 UTC
In latest perls, the hashing seed used is randomised to avoid certain types of DOS attacks. However, it is not randomised at each use - instead, each time a new key is added to the hash perl checks whether the hash is becoming heavily unbalanced (either due to an injection attack or due to simple bad luck), and re-hashes only if it is. In normal use, the rehashing should rarely happen. And that's just as well because it is relatively expensive. (Before you ask, I'd add that this is not vulnerable to a higher level DOS attack that forces constant rehashing unless the attacker can predict the new random hash seed that will be selected.) Hugo	[reply]
Re^3: selecting random key from hash by BrowserUk (Patriarch) on Jul 13, 2004 at 09:37 UTC
Thanks Hugo. I remember the discussion from when it came up originally. I even came up with a suggestion for fixing it that wasn't a million miles away from what got implemented, though I doubt my half-cocked idea was any influence :) Given the OP was picking 4 at a time from a "very large hash", the inherent randomness (assuming he has it enabled (or can work out how to enable it!! Argggh!)), might be sufficiently random for his purpose. It would have an advantage over the other solutions presented, in that it would avoid duplicates whilst ensuring full coverage. If that is a requirement. Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail "Memory, processor, disk in that order on the hardware side. Algorithm, algoritm, algorithm on the code side." - tachyon	[reply]
Re: selecting random key from hash by spurperl (Priest) on Jul 13, 2004 at 07:44 UTC
Monk conscience speaking: Don't forget to 'use warnings'	[reply]
Re: selecting random key from hash by rrwo (Friar) on Jul 13, 2004 at 21:45 UTC
Why do you need a random key from the hash? How many do you need? The order of hash keys is for all intents and purposes "random" (unordered, with a security randomization added in Perl 5.8.1 I think), so for some purposes you can just take the first n hash keys... For some purposes, this is good enough.	[reply]
Re: selecting random key from hash by jimrobertsiii (Scribe) on Jul 15, 2004 at 03:33 UTC
Long time listener, first time poster. Not sure about effiency - I'm just always looking for an excuse to use a psuedo :-) Consider the following: `$rnd_phash = [ {black=>1, green=>2, blue=>3, yellow=>4, red=>5, purple=>6}, 'black door', 'green grass', 'blue devil', 'yellow belly', 'red ro +ses', 'purple grape' ]; # array in psuedo hash is 1 relative not 0 relative $i = int rand $#{$rnd_phash} +1; print "Random hash element: $$rnd_phash[$i]\n"; print "The value of \$\$rnd_phash{red} is $$rnd_phash{red}\n"; print "The value of \$\$rnd_phash{purple} is $$rnd_phash{purple}\n";` [download] -Jim	[reply] [d/l]