Re^2: How do I measure my bottle ?

Thanks for your rapid reply. My test scripts are listed below:

script 1:

while(my $l=<IN>){
}

script 2:

while(my $l=<IN>){
   my $id=substr($l,0, 33);
   $hash{$id}=1;
}

Script 1 takes around 15 seconds, whereas script 2 takes around 20 seconds.


Re^3: How do I measure my bottle ? by RazorbladeBidet (Friar) on Mar 25, 2005 at 13:50 UTC
Then your hash insert is only taking about 5 seconds (all other things being equal). There is also the memory consideration (as stated below). Is this 20M records totalling 1 GB or 1 terabyte? (You mention 1,000 GB in your original post.)

Is there a reason you are using a hash? In your example it looks like you could use an array, but I understand it is merely a "test".

If you have many files (and it sounds like you do), you could slurp in each entire file, one file at a time, and do the inserts. That will increase your memory usage but decrease CPU time. See File::Slurp.

--------------
"But what of all those sweet words you spoke in private?"
"Oh that's just what we call pillow talk, baby, that's all."
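A minimal sketch of the slurp-then-insert idea, using only core Perl (reading the handle in list context) rather than File::Slurp itself. The helper name `index_file` and the file names taken from `@ARGV` are assumptions for illustration; whether this actually beats the line-by-line loop on your data is something to measure, not assume.

```perl
use strict;
use warnings;

# Hypothetical helper: read one whole file into memory, then insert the
# first 33 characters of every line into the hash passed by reference.
sub index_file {
    my ($path, $hash) = @_;
    open my $in, '<', $path or die "cannot open $path: $!";
    my @lines = <$in>;    # list context reads every line in one go
    close $in;
    $hash->{ substr($_, 0, 33) } = 1 for @lines;
    return scalar @lines;    # number of lines read from this file
}

my %hash;
for my $path (@ARGV) {    # one file at a time, as suggested above
    my $n = index_file($path, \%hash);
    print "$path: $n lines\n";
}
print scalar(keys %hash), " distinct ids\n";
```

Memory use peaks at roughly the size of the largest single file plus the hash, so this only makes sense if each file fits comfortably in RAM.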
Re^3: How do I measure my bottle ? by tlm (Prior) on Mar 25, 2005 at 14:57 UTC
If you want a handle on the cost of hash inserts, you may as well make the comparison more precise by having script 1 do everything except the insert, something like: `while(my $l=<IN>){ my $id=substr($l,0,33); $hash=1; }` (Assuming, of course, that the compiler doesn't optimize any of this away.)

the lowliest monk
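A rough sketch of that three-way comparison, timed with the core Time::HiRes module. It assumes synthetic in-memory data (200,000 generated lines with a 33-character prefix) in place of the real input file, and the names `time_loop`, `$dummy`, and the `$t_*` variables are hypothetical. Each variant pays the same per-line sub-call overhead, so only the differences between the timings are meaningful.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Assumed sample data: each line starts with a 33-character zero-padded id.
my $data = join '', map { sprintf("%033d\tpayload\n", $_) } 1 .. 200_000;

# Read $data line by line through an in-memory filehandle, call $body on
# each line, and return the elapsed wall-clock seconds.
sub time_loop {
    my ($body) = @_;
    open my $in, '<', \$data or die "cannot open in-memory handle: $!";
    my $start = time;
    while (my $l = <$in>) { $body->($l) }
    my $elapsed = time - $start;
    close $in;
    return $elapsed;
}

my (%hash, $dummy);
my $t_read   = time_loop(sub { });                                               # read only
my $t_substr = time_loop(sub { my $id = substr($_[0], 0, 33); $dummy = 1 });     # substr, no insert
my $t_hash   = time_loop(sub { my $id = substr($_[0], 0, 33); $hash{$id} = 1 }); # substr + insert

printf "read only:       %.3fs\n", $t_read;
printf "read + substr:   %.3fs\n", $t_substr;
printf "read + hash:     %.3fs\n", $t_hash;
printf "insert estimate: %.3fs\n", $t_hash - $t_substr;
```

The last figure isolates the hash insert from the `substr` call, which the original 15 s versus 20 s comparison lumps together.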