in reply to Re: Efficient way to handle huge number of records?
in thread Efficient way to handle huge number of records?

Only 5 million records, because I have an image analysis that has been running for 2 days now using over half my memory, and I don't want to interrupt it.

I retrieved the first 1000 records from the hash -- as near to random as makes no difference -- because 10 wasn't representative. Retrieval time is 0.4 milliseconds per record.

The code:

#! perl -slw
use strict;
use Time::HiRes qw[ time ];

our $N //= 1000;                  # -N=... on the command line; defaults to 1000 retrievals

my $start = time;
my %dna;
keys %dna = 5e6;                  # pre-size the hash for 5 million records
$dna{ <> } = <> until eof();      # ID line becomes the key, sequence line the value

printf "Loading %d records took %d seconds\n",
    scalar keys %dna, time() - $start;

$start = time();
my( $id, $seq ) = each %dna for 1 .. $N;   # fetch $N key/value pairs

printf "Took %.6f seconds per retrieval\n",
    ( time() - $start ) / $N;

__END__
[22:14:18.06] C:\test>junk48 -N=1e3 junk.dat
Loading 5000000 records took 37 seconds
Took 0.000414 seconds per retrieval
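For the record-by-ID access the thread is actually about, the number that matters is a direct keyed lookup rather than pulling pairs off the each() iterator. Below is a minimal sketch of that variation; it assumes the same two-line record format (ID line followed by sequence line) and the same junk.dat input, and sampling $N existing IDs first is my addition, not part of the run above.

#! perl -slw
use strict;
use Time::HiRes qw[ time ];

our $N //= 1000;

# Load exactly as above: ID line is the key, sequence line the value.
my %dna;
keys %dna = 5e6;
$dna{ <> } = <> until eof();

# Grab $N existing IDs (via each), then time direct hash lookups by key.
my @ids;
push @ids, scalar each %dna for 1 .. $N;

my $start = time();
my $seq;
$seq = $dna{ $_ } for @ids;

printf "Took %.6f seconds per keyed lookup\n", ( time() - $start ) / $N;

That is the access pattern you'd actually use to retrieve a sequence by its ID; I'd expect it to land in the same ballpark as the figures above, but I haven't timed this variant.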

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?