in reply to Mysterious slow down with large data set

Not related to speed, but if you had warnings enabled you would get this message:

Name "main::startWord" used only once: possible typo at yourprogram line 42.

Which means that this line:

     42     $thisWord = $now - $startWord;

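Is actually processed as this, because $startWord is never assigned anywhere, and an undefined value counts as 0 in numeric context: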

     42     $thisWord = $now;
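
If you want to convince yourself, here is a minimal sketch. The variable names are borrowed from your code, but the timestamp is made up, and I declare $startWord with "our" purely to stand in for the never-assigned variable. Under "use strict" an undeclared $startWord would be a compile-time error instead, which is the easier way to catch this kind of typo.

```perl
use warnings;

our $startWord;            # never assigned, so it is undef (stand-in for the mistyped variable)
my $now = 1234.5;          # made-up timestamp, for illustration only

my $thisWord;
{
    no warnings 'uninitialized';     # silence the runtime warning for this demo only
    $thisWord = $now - $startWord;   # undef is treated as 0 in numeric context
}

print "now      = $now\n";
print "thisWord = $thisWord\n";      # same value as $now
```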

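Also, Perl is not C: printf treats its first argument as a format string, so any % that turns up in the interpolated data would be read as a format directive. With no format directives of your own there is nothing for printf to do, so this line: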

     38   printf("$at\t$w1\t$totalsim\t$maxsim\t$topXtotal\n");

Should be:

     38   print "$at\t$w1\t$totalsim\t$maxsim\t$topXtotal\n";
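
A small sketch of what can go wrong, assuming one of your words happens to contain a percent sign (the value of $w1 here is hypothetical). I use sprintf rather than printf so the result can be compared, but both handle the format string the same way.

```perl
use strict;
use warnings;

my $w1 = "100%sure";   # hypothetical word containing a percent sign

my $formatted;
{
    no warnings;                     # sprintf would otherwise warn about a missing argument
    $formatted = sprintf("$w1\n");   # "%s" in the data is read as a directive with no argument
}
my $plain = "$w1\n";                 # what print would output, unchanged

print "sprintf gives: $formatted";
print "print gives:   $plain";
```

The two lines differ because the "%s" inside the data gets consumed as a format directive.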

And these lines:

     45     printf(STDERR "#$at\t$w1\t$totalsim\t$maxsim\t$topXtotal\t".
     46        "ELAPSED %.3f THISWORD %.3f PERWORD %.3f HOURStoGO %.3f\n",
     47        $elapsed, $thisWord, $perWord, $hoursRemaining);

Should be:

     45     print STDERR "#$at\t$w1\t$totalsim\t$maxsim\t$topXtotal\t";
     46     printf STDERR "ELAPSED %.3f THISWORD %.3f PERWORD %.3f HOURStoGO %.3f\n",
     47        $elapsed, $thisWord, $perWord, $hoursRemaining;
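
If you would rather keep it as a single statement, another option (just a sketch, with made-up sample values) is to keep the format string constant and pass the interpolated fields as %s arguments. I build the line with sprintf first only so it is easy to inspect; printf STDERR with the same arguments would do the same job.

```perl
use strict;
use warnings;

# Made-up sample values; the real ones come from your loop.
my ($at, $w1, $totalsim, $maxsim, $topXtotal) = (7, "50%off", 3.2, 0.9, 2.5);
my ($elapsed, $thisWord, $perWord, $hoursRemaining) = (12.5, 0.25, 1.786, 4.1);

# The format string is a constant; all data is passed as arguments,
# so a % inside $w1 is printed literally.
my $line = sprintf
    "#%s\t%s\t%s\t%s\t%s\tELAPSED %.3f THISWORD %.3f PERWORD %.3f HOURStoGO %.3f\n",
    $at, $w1, $totalsim, $maxsim, $topXtotal,
    $elapsed, $thisWord, $perWord, $hoursRemaining;

print STDERR $line;
```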