Pick any key from a very large hash

FloydATC has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Pick any key from a very large hash by mzedeler (Pilgrim) on Jul 11, 2009 at 21:55 UTC
Did you try the `seq` method in DB_File?	[reply] [d/l]
Re^2: Pick any key from a very large hash by FloydATC (Deacon) on Jul 11, 2009 at 22:30 UTC
No, to be honest I did't understand what it was for until you suggested it. The syntax had me assuming I had to pass a valid key as a starting point, but examining the example code closer revealed the truth. I gave it a whack and it works perfectly. Thank you :-) -- Time flies when you don't know what you're doing	[reply]
Re: Pick any key from a very large hash by jwkrahn (Abbot) on Jul 11, 2009 at 22:23 UTC
The documentation for each explains how to reset it (hint: third paragraph).	[reply]
Re^2: Pick any key from a very large hash by FloydATC (Deacon) on Jul 11, 2009 at 22:25 UTC
Yes, by using `keys` or `values`... exactly what I don't want :-) -- Time flies when you don't know what you're doing	[reply] [d/l] [select]
Re^3: Pick any key from a very large hash by hobbs (Monk) on Jul 12, 2009 at 11:25 UTC
"In particular, calling keys() in void context resets the iterator with no other overhead."	[reply]
Re^4: Pick any key from a very large hash by QM (Parson) on Jul 15, 2009 at 21:36 UTC
Re^3: Pick any key from a very large hash by jwkrahn (Abbot) on Jul 11, 2009 at 22:27 UTC
Or: "it can be reset by reading all the elements from the hash"	[reply]
Re^2: Pick any key from a very large hash by psini (Deacon) on Jul 11, 2009 at 22:27 UTC
I don't think your suggestion is a viable option when the hash contains millions of keys... Rule One: "Do not act incautiously when confronting a little bald wrinkly smiling man."	[reply]
Re^3: Pick any key from a very large hash by ikegami (Patriarch) on Jul 13, 2009 at 19:18 UTC
Calling `keys` in void context resets the iterator and does nothing else. The number of keys in the hash is of no import.	[reply] [d/l]
Re^4: Pick any key from a very large hash by psini (Deacon) on Jul 13, 2009 at 21:50 UTC
Re: Pick any key from a very large hash by Bloodnok (Vicar) on Jul 11, 2009 at 22:26 UTC
...`each` is meant for looping through the entire hash... - you're dead right - the hash is meant to be invariant over `each` - from each '...If you add or delete elements of a hash while you're iterating over it, you may get entries skipped or duplicated, so don't...'. In answer to your question, again from each - '..it can be reset by reading all the elements from the hash, or by evaluating `keys HASH` or `values HASH`...'. The moral of the story/posting is ... please RTFM before you post. A user level that continues to overstate my experience :-))	[reply] [d/l] [select]
Re^2: Pick any key from a very large hash by FloydATC (Deacon) on Jul 11, 2009 at 22:36 UTC
Hehe I did RTFM but you know what happens when you get stuck... you start trying creative stuff even if you know it might break things in funny ways :-) My theory was that maybe there was some obscure way to manually whack it back to zero. Guess not. Anyway, the problem is solved :-) -- Time flies when you don't know what you're doing	[reply]
Re^3: Pick any key from a very large hash by GrandFather (Saint) on Jul 11, 2009 at 23:21 UTC
I realise it's not an issue now, but it seems likely that calling keys in scalar context will reset the each iterator in (very small) constant time. Indeed a small test confirms this: `use strict; use warnings; use Benchmark qw(cmpthese); my %bigHash = map {$_ => undef} 1 .. 1e3; my @resetEach = keys %bigHash; my ($key) = each %bigHash; my $count = keys %bigHash; my ($key2) = each %bigHash; print "First key '$key', second '$key2'\n"; cmpthese ( -1, { scalarKeys => sub {scalar keys %bigHash}, listKeys => sub {() = keys %bigHash}, } );` [download] Prints: `First key '559', second '559' Rate listKeys scalarKeys listKeys 5184/s -- -100% scalarKeys 15969899/s 307964% --` [download] True laziness is hard work	[reply] [d/l] [select]
Re^4: Pick any key from a very large hash by FloydATC (Deacon) on Jul 12, 2009 at 06:58 UTC
Re: Pick any key from a very large hash by ashish.kvarma (Monk) on Jul 12, 2009 at 04:28 UTC
If you don�t want to use "keys", here is a fancy way to do it. my %hash = ( a => 1, b => 2, c => 3, d => 4, e => 5, f => 6, g => 7, h => 8, i => 9, j => 10, ); my $hash_length = %hash; # To get actual length you would need 'scalar keys %hash' # But we are not using 'keys' here so lets get the length other way $hash_length =~ s{/.*}{}; foreach (1...10) { my $rand_num = rand $hash_length; # Make sure number is even. $rand_num = ($rand_num %2) ? ($rand_num - 1) : ($rand_num); my $key = (%hash)[$rand_num]; print "$_\t: $key\n"; } # output # 1 : e # 2 : d # 3 : c # 4 : j # 5 : a # 6 : e # 7 : e # 8 : c # 9 : j # 10 : e [download] Regards, Ashish	[reply] [d/l]
Re^2: Pick any key from a very large hash by GrandFather (Saint) on Jul 12, 2009 at 05:11 UTC
OP's concern was not the use of keys, but the potential cost of using it due to a perceived 'temporary array'. However various solutions to that problem have been suggested. Let's see how yours stacks up: `use strict; use warnings; use Benchmark qw(cmpthese); my %bigHash = map {$_ => undef} 1 .. 1e3; my ($key) = each %bigHash; my $count = keys %bigHash; my ($key2) = (%bigHash)[0]; print "First key '$key', second '$key2'\n"; cmpthese ( -1, { byEach => sub {scalar keys %bigHash; my ($key) = each %bigHas +h;}, byArray => sub {my $key = (%bigHash)[0]}, } );` [download] Prints: `First key '559', second '559' Rate byArray byEach byArray 5228/s -- -100% byEach 2414187/s 46082% --` [download] It's that 'temporary array' OP was worried about. A fancy solution no doubt but, for saving time, not a good one. True laziness is hard work	[reply] [d/l] [select]
Re^3: Pick any key from a very large hash by ashish.kvarma (Monk) on Jul 12, 2009 at 10:53 UTC
I jumped in without giving it a thought and missed the real problem. I do agree it wouldn�t be efficient therefore won�t solve the purpose. I guess I gotta give some time before jumping in. Regards, Ashish	[reply]
Re^2: Pick any key from a very large hash by FloydATC (Deacon) on Jul 12, 2009 at 06:50 UTC
Still, it's a creative solution to the problem, and within the original domain. I'm sure this technique can be helpful when dealing with a big native hash and not a DB_File. -- Time flies when you don't know what you're doing	[reply]
Re: Pick any key from a very large hash by psini (Deacon) on Jul 11, 2009 at 22:01 UTC
Can't you simply `exit` from the `each` when you are done, and maybe include it in an external loop of some kind? Or have I completely missed the point? Rule One: "Do not act incautiously when confronting a little bald wrinkly smiling man."	[reply] [d/l] [select]
Re^2: Pick any key from a very large hash by FloydATC (Deacon) on Jul 11, 2009 at 22:24 UTC
I tried using `each` without looping at all, which caused weird results when arbitrary pairs were added or deleted between each `each`. `exit` would be taking things a bit too far, perhaps you meant `last`? :-) Alternatively, if you mean use a separate process, this would defeat the DB_File built-in cache and generally kill performance. OR maybe I'm the one missing the point... -- Time flies when you don't know what you're doing	[reply] [d/l] [select]


Just another Perl shrine
	PerlMonks