in reply to Re^2: Optimize my code with Hashes
in thread Optimize my code with Hashes

You are asking the wrong questions. You already blame the hashes without really knowing what wastes all this time. Everyone above told you to do some profiling first and that is a really good idea. Often it is the algorithm used that kills the time.

For example, if your program frequently copies the hash, that will thrash memory and cost time. Switching to an array won't make it any faster, because the copying is the problem, not the data structure.
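To make the copying point concrete, here is a minimal sketch that compares passing a hash by value with passing a reference. The data and both sub names (`count_by_copy`, `count_by_ref`) are made up for illustration and are not taken from the original program.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: 50_000 bogus entries, roughly shaped like the records in this thread.
my %pfinfo = map { $_ => "$_|Henion,David|A|Active" } 1 .. 50_000;

# Receives a flattened key/value list and rebuilds the hash: a full copy on every call.
sub count_by_copy {
    my %copy = @_;
    return scalar keys %copy;
}

# Receives a reference: nothing is copied, the caller's hash is read in place.
sub count_by_ref {
    my ($href) = @_;
    return scalar keys %$href;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    by_copy => sub { count_by_copy(%pfinfo) },
    by_ref  => sub { count_by_ref(\%pfinfo) },
});
```

If the real program shows a pattern like `count_by_copy` in a hot path, passing a reference is usually the cheaper fix, and it needs no change of data structure.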

Also, you didn't show us what the code does with the hash you blame. How are we supposed to tell you whether an array is better?

So either post some code here or do some profiling or both.
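If a full profiler is not at hand, a rough first pass is to time each phase of the program yourself. The sketch below assumes two made-up phases (`build` and `parse`) standing in for the real ones; the helper `timed` is hypothetical as well. A real profiler such as Devel::NYTProf will give far more detail.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my %elapsed;    # label => accumulated wall-clock seconds

# Run a code reference and add its wall-clock time to the given label.
sub timed {
    my ($label, $code) = @_;
    my $start  = time();
    my @result = $code->();
    $elapsed{$label} += time() - $start;
    return @result;
}

# Hypothetical stand-ins for the real phases of the program.
my %data = timed(build => sub { map { $_ => "$_|a|b|c" } 1 .. 50_000 });

timed(parse => sub {
    for my $key (keys %data) {
        my @fields = split /\|/, $data{$key};
    }
    return;
});

# Print one line per phase so the slow one stands out.
printf "%-8s %.3f s\n", $_, $elapsed{$_} for sort keys %elapsed;
```

Whichever label collects most of the time is the place to look at more closely.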

Re^4: Optimize my code with Hashes
by sukhicool (Initiate) on Aug 27, 2008 at 12:13 UTC
    Sorry if anyone is hurt by my wrong questions.

    Let me show some code:
    The following hash is generated by a subroutine:

        %pfinfo = (
            '069836' => '069836|Henion,David|A|Active|010474|HAWKEY,Michael G|SC3798|...',
            '025939' => '025939|Picard, Stephane|A|Active|010101|LEPINE,Thibault|SG8778|...',
            ...
        );

        my $timee0 = Benchmark->new;
        foreach my $en (keys %pfinfo) {
            logAndSkip(\*LOG, "Considering the entry from PeopleFirst extract: $en...") if ($log);

            # Get the PF information
            my @pfi = ();    # reset the array
            @pfi = split /\|/, $pfinfo{$en};

            # If the employee number does not exist in ED, it looks like a creation
            if (!exists $ed_en{$en}) {
                createEDentry(\*LOG, \@pfi, \%used_dn, \%en2dn);
            }
            # Looks like an ED entry update
            else {
                updateEDentry(\*LOG, $en2dn{$en}, \@pfi, \%en2dn);
            }
        }    # End Foreach

      Sorry if my answer sounded too harsh; I'm not a native English speaker myself.

      The part of the code you show seems to be quite efficient, and from what I can see the author of this code knows how to program in Perl. I even tried the bit you posted on my machine: it needed just over 1 second (3 seconds on a Sun Blade 100) to initialize the hash with 50000 bogus entries and run the loop over it. That was with empty subroutines, of course, so no surprise really.

      What you don't show is what createEDentry and updateEDentry do. Probably that is where most of the work is done.
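      One way to check that suspicion without touching the bodies of the two subs is to wrap them and record call counts and cumulative time. This is only a sketch: the helper `instrument` is hypothetical, and the two stubs below stand in for the real createEDentry and updateEDentry, whose contents were not posted.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my (%calls, %seconds);

# Replace a named sub in package main with a wrapper that records the
# call count and cumulative wall-clock time, then calls the original.
# The original is always called in list context; good enough for a sketch.
sub instrument {
    my ($name) = @_;
    no strict 'refs';
    no warnings 'redefine';
    my $orig = \&{"main::$name"};
    *{"main::$name"} = sub {
        my $start = time();
        my @ret   = $orig->(@_);
        $calls{$name}++;
        $seconds{$name} += time() - $start;
        return @ret;
    };
}

# Hypothetical stand-ins; in the real program these subs already exist.
sub createEDentry { my ($LOG, $pfi, $used_dn, $en2dn) = @_; return }
sub updateEDentry { my ($LOG, $dn, $pfi, $en2dn) = @_; return }

instrument($_) for qw(createEDentry updateEDentry);

# Simulated workload in place of the real foreach loop.
createEDentry(undef, [], {}, {}) for 1 .. 1000;
updateEDentry(undef, 'dn', [], {}) for 1 .. 10;

printf "%-14s %6d calls %.4f s\n", $_, $calls{$_}, $seconds{$_}
    for sort keys %calls;
```

      If nearly all of the run time ends up in these two lines of output, the loop and the hash are not the bottleneck.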

      If you want to try it yourself, here is my test code. Run it on your machine; if it takes less than 20 seconds, the problem is not in the code you have shown us.

          use strict;
          use warnings;

          my %used_dn = ();
          my %en2dn   = ();
          my %ed_en;
          my %pfinfo  = ();

          # Fill the hash with 50000 bogus entries
          my $i = 50000;
          while ($i > 0) {
              $pfinfo{$i} = "$i|Henion,David|A|Active|010474|HAWKEY,Michael G|SC3789";
              $i--;
          }

          my $log = 0;
          foreach my $en (keys %pfinfo) {
              logAndSkip(\*LOG, "Considering the entry from PeopleFirst extract: $en...") if ($log);

              # Get the PF information
              my @pfi = ();    # reset the array
              @pfi = split /\|/, $pfinfo{$en};

              # If the employee number does not exist in ED, it looks like a creation
              if (!exists $ed_en{$en}) {
                  createEDentry(\*LOG, \@pfi, \%used_dn, \%en2dn);
              }
              # Looks like an ED entry update
              else {
                  updateEDentry(\*LOG, $en2dn{$en}, \@pfi, \%en2dn);
              }
          }    # End Foreach

          # Empty stubs, so only the loop and the hash access are measured
          sub logAndSkip    {}
          sub createEDentry { my ($LOG, $pfi, $used, $en) = @_; }
          sub updateEDentry { my ($LOG, $dn, $pfi, $en) = @_; }