in reply to Efficiency of a Hash with 1 Million Entries

Thanks again for the help. By turning friendoffriend.pl into a module, the execution time is now about 2 minutes! I had no idea that running another script through a system call would cause such a slowdown. If anyone is looking for a really simple, easy-to-understand tutorial on making a Perl module, here is the site I used: http://www.webreference.com/programming/perl/modules/ The only catch now is that my results differ from what I was getting before (about 224,000 clusters previously, now only 124,000 clusters). So I will have to go back through my code and see what could have caused this change.

In regard to the AllJoinRecip table, I must have misremembered how long it takes to create. It actually only takes 10 seconds. I have no idea which table I remember taking 5 minutes to create. I am pretty sure that the AllJoinRecip table follows the suggestions you made, dineed, but I will double-check to make sure. Thanks again!

Replies are listed 'Best First'.
Re^2: Efficiency of a Hash with 1 Million Entries
by Corion (Patriarch) on Jul 02, 2010 at 22:15 UTC

    Note that now that friendoffriend.pl is a module loaded into your other program rather than a separate external program, its global variables retain their values between runs. You might want to reinitialize them at the start of each run.
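
A minimal sketch of that reinitialization, in case it helps. The package name FriendOfFriend, the %clusters hash, and the run() sub are all made up for illustration; the real friendoffriend.pl was not posted, so its globals will have different names.

```perl
use strict;
use warnings;

# Hypothetical stand-in for friendoffriend.pl after its conversion to a module.
package FriendOfFriend;

# Package-level state: it lives as long as the calling program does,
# not just for the duration of one call.
our %clusters;

sub run {
    my (@pairs) = @_;

    # Reset the global on every run. Without this line, entries from
    # earlier calls would still be in %clusters and inflate the count.
    %clusters = ();

    for my $pair (@pairs) {
        my ($from, $to) = @$pair;
        push @{ $clusters{$from} }, $to;
    }
    return scalar keys %clusters;
}

package main;

my $first  = FriendOfFriend::run([ 'ann', 'bob' ], [ 'cat', 'dan' ]);
my $second = FriendOfFriend::run([ 'eve', 'fay' ]);

print "first=$first second=$second\n";    # prints "first=2 second=1"
```

When the script ran through system(), every invocation started a fresh perl process, so this reset happened for free. Inside one long-lived process it has to be done explicitly.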

      Thanks for pointing this out.

      For those keeping score: It turns out that my original results file had some duplicate entries somehow. I must have messed up somewhere in the script I guess. The new results do not have any duplicates, so it is a much more accurate representation of my data.
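
For anyone wanting to check their own output the same way, here is a rough sketch of counting duplicate lines with a %seen hash. The sample lines are invented; a real results file would be read line by line instead.

```perl
use strict;
use warnings;

# Made-up sample lines standing in for the contents of a results file.
my @results = ('1,2,3', '4,5', '1,2,3', '6', '4,5');

# %seen counts how often each line occurs; grep keeps only the first occurrence.
my %seen;
my @unique = grep { !$seen{$_}++ } @results;

# Any line counted more than once was a duplicate.
my @dupes = grep { $seen{$_} > 1 } sort keys %seen;

printf "%d lines, %d unique, %d duplicated\n",
    scalar @results, scalar @unique, scalar @dupes;
# prints "5 lines, 3 unique, 2 duplicated"
```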