in reply to Re: Efficient way to handle huge number of records?
in thread Efficient way to handle huge number of records?

Only 5 million records, because I have an image analysis that has been running for 2 days now using over half my memory, and I don't want to interrupt it.

I retrieved the first 1000 records from the hash -- as near to random as makes no difference -- because 10 wasn't representative. Retrieval time is 0.4 milliseconds per record.

The code:

#! perl -slw
use strict;
use Time::HiRes qw[ time ];

our $N //= 1000;                  # -N=... on the command line; defaults to 1000 retrievals

my $start = time;
my %dna;
keys %dna = 5e6;                  # pre-size the hash for 5 million records
$dna{ <> } = <> until eof();      # ID line becomes the key, sequence line the value

printf "Loading %d records took %d seconds\n",
    scalar keys %dna, time() - $start;

$start = time();
my( $id, $seq ) = each %dna for 1 .. $N;   # fetch $N key/value pairs

printf "Took %.6f seconds per retrieval\n",
    ( time() - $start ) / $N;

__END__
[22:14:18.06] C:\test>junk48 -N=1e3 junk.dat
Loading 5000000 records took 37 seconds
Took 0.000414 seconds per retrieval
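For the record-by-ID access the thread is actually about, the number that matters is a direct keyed lookup rather than pulling pairs off the each() iterator. Below is a minimal sketch of that variation; it assumes the same two-line record format (ID line followed by sequence line) and the same junk.dat input, and sampling $N existing IDs first is my addition, not part of the run above.

#! perl -slw
use strict;
use Time::HiRes qw[ time ];

our $N //= 1000;

# Load exactly as above: ID line is the key, sequence line the value.
my %dna;
keys %dna = 5e6;
$dna{ <> } = <> until eof();

# Grab $N existing IDs (via each), then time direct hash lookups by key.
my @ids;
push @ids, scalar each %dna for 1 .. $N;

my $start = time();
my $seq;
$seq = $dna{ $_ } for @ids;

printf "Took %.6f seconds per keyed lookup\n", ( time() - $start ) / $N;

That is the access pattern you'd actually use to retrieve a sequence by its ID; I'd expect it to land in the same ballpark as the figures above, but I haven't timed this variant.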

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?