Re: Efficiency of a Hash with 1 Million Entries

Thanks again for the help. By turning friendoffriend.pl into a module, I got the execution time down to about 2 minutes! I had no idea that running another script through a system call would cause such a slowdown. If anyone is looking for a really simple, easy-to-understand tutorial on making a Perl module, here is the site I used: http://www.webreference.com/programming/perl/modules/ The only issue now is that my results differ from what I was getting before (about 224,000 clusters previously, now only 124,000). So I will have to go back through my code and see what could have caused this change.
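For anyone curious where the time goes, here is a rough sketch of the difference between launching a new perl process per call and calling a subroutine in the same process. The sub `friend_of_friend`, the one-liner passed to `-e`, and the call count are made-up stand-ins, not the real friendoffriend.pl code, and the timings will vary by machine.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical stand-in for the work friendoffriend.pl does per call.
sub friend_of_friend {
    my ($n) = @_;
    return $n * 2;
}

my $calls = 20;    # kept small here; assumption: the real job made far more calls

# Variant 1: start a fresh perl interpreter for every call.
my $t0          = time;
my $ok_external = 0;
for my $n (1 .. $calls) {
    # List-form system() bypasses the shell; $^X is the running perl binary.
    my $status = system($^X, '-e', 'exit(($ARGV[0] * 2) > 0 ? 0 : 1)', $n);
    $ok_external++ if $status == 0;
}
my $external_secs = time - $t0;

# Variant 2: call the same logic as a subroutine, no new process.
$t0 = time;
my $ok_internal = 0;
for my $n (1 .. $calls) {
    $ok_internal++ if friend_of_friend($n) > 0;
}
my $internal_secs = time - $t0;

printf "external: %.4fs  in-process: %.6fs\n", $external_secs, $internal_secs;
```

Each system() call pays for process creation and interpreter startup before any real work happens, which is the cost that disappears once the code lives in a module.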

In regard to the AllJoinRecip table, I must have incorrectly remembered the amount of time it takes to make that table. It actually only takes 10 seconds. I have no idea which table I remember taking 5 minutes to create. I am pretty sure that the AllJoinRecip table follows the suggestions you made, dineed, but I will double check to make sure. Thanks again!

Re^2: Efficiency of a Hash with 1 Million Entries by Corion (Patriarch) on Jul 02, 2010 at 22:15 UTC
Note that now that `friendoffriend.pl` is a module loaded into your other program rather than an external program, its global variables retain their values between runs. You might want to reinitialize them at the start of each run.
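A minimal sketch of that effect, with invented names: the package `FriendOfFriend`, the global `%seen`, and both subs are hypothetical, not taken from the actual script. As a separate program, every run started with empty globals; as a module, the second call sees whatever the first call left behind unless the state is reset.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the module made from friendoffriend.pl.
package FriendOfFriend;

our %seen;    # package global: lives as long as the program, not one call

# Counts ids not seen before. %seen is never cleared here, so it still
# holds the keys from earlier calls.
sub count_new_stale {
    my @ids = @_;
    my $new = 0;
    for my $id (@ids) {
        $new++ unless $seen{$id}++;
    }
    return $new;
}

# Same logic, but the global is reset at the start of each run.
sub count_new_fresh {
    %seen = ();
    return count_new_stale(@_);
}

package main;

my $stale1 = FriendOfFriend::count_new_stale(qw(a b c));    # 3
my $stale2 = FriendOfFriend::count_new_stale(qw(a b c));    # 0, state leaked
my $fresh1 = FriendOfFriend::count_new_fresh(qw(a b c));    # 3
my $fresh2 = FriendOfFriend::count_new_fresh(qw(a b c));    # 3

print "stale: $stale1, $stale2  fresh: $fresh1, $fresh2\n";
```

Declaring such variables with `my` inside the subroutine, where that is possible, avoids the problem entirely because they are recreated on every call.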
Re^3: Efficiency of a Hash with 1 Million Entries by gunr (Novice) on Jul 06, 2010 at 14:51 UTC
Thanks for pointing this out. For those keeping score: it turns out that my original results file somehow had some duplicate entries. I must have messed up somewhere in the script, I guess. The new results do not have any duplicates, so they are a much more accurate representation of my data.
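In case it helps anyone checking their own output for the same problem, this is roughly how duplicates can be spotted and dropped with a hash. The sample lines in `@results` are made up for illustration; the real results file has a different format and would be read from disk.

```perl
use strict;
use warnings;

# Hypothetical result lines, one cluster per entry.
my @results = ("1,2,3", "4,5", "1,2,3", "6", "4,5");

# Count how often each entry occurs.
my %count;
$count{$_}++ for @results;

# Entries that appear more than once, sorted for stable output.
my @duplicates = sort grep { $count{$_} > 1 } keys %count;

# Keep only the first occurrence of each entry, preserving order.
my %seen;
my @unique = grep { !$seen{$_}++ } @results;

printf "%d entries, %d unique, %d duplicated values\n",
    scalar @results, scalar @unique, scalar @duplicates;
```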