Does anyone here have suggestions for scrabblehooks, especially for making it faster? I'm very much a Perl n00b. When a friend of mine presented me with this text problem, I thought of Perl and struggled through writing it, but it runs much slower than I had hoped. Any suggestions?

#!/usr/bin/perl
$op = shift @ARGV;

# load a list of valid words
unless ($op =~ /^S$/) {
    open WORDLIST, shift @ARGV;
    while (<WORDLIST>) { chomp; push @words, $_; }
    close WORDLIST;
}
else {
    while (<>) { push @correct, $_; }
}

# find a list of possible words from the list of valid words
if ($op =~ /^P(.{0,2})(.)(.{1,})$/) {
    ($otherop, $direction, $letter) = ($1, $2, $3);
    for (@words) {
        if ($direction eq "F") {
            if (/\b($letter)/) { push @possible, $'; }
        }
        elsif ($direction eq "B") {
            if (/($letter)\b/) { push @possible, $`; }
        }
    }
    if (!$otherop) { for (@possible) { print "$_\n"; } }
}

# check the list of possible words against the list of valid words
if ($op =~ /^(.?)C(S*)/) {
    $otherop = $2;
    if (!$1) { while (<>) { push @possible, $_; } }
    for $word (@words) {
        for (@possible) {
            if ($word eq $_) { push @correct, $_; }
        }
    }
    if (!$otherop) { for (@correct) { print "$_\n"; } }
}

# sort the list of correct words by size (small to large) and then abc
if ($op =~ /^(.*)S/) {
    if ($op =~ /S$/) { while (<>) { push @correct, $_; } }
    @sorted = sort { length $a <=> length $b or $a cmp $b } @correct;
    for (@sorted) { print "$_\n"; }
}


The problem is this: the input is a list of valid words, all letters in the same case, one word per line. The output consists of the words that remain valid after a given letter is removed from the beginning. For example, searching for a front hook of A in a list containing AAH and AH returns AH, because AAH minus the leading A is AH, which is in the list. Expand this to 26 letters and 54 lists, and it took a significant amount of time to process a 1.6M list of valid words.
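To make the lookup described above concrete, here is a minimal sketch that checks each candidate against a hash of valid words rather than scanning the whole list. It is an illustration under assumptions, not the script above: `find_hooks` is a hypothetical helper name, the hook is assumed to be a single letter, and the word list is assumed to fit in memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): given a reference
# to a list of valid words, a single hook letter, and a direction ("F" for
# front, "B" for back), return the shorter words that are left after the
# hook letter is removed and that are themselves in the list.
sub find_hooks {
    my ($words, $letter, $direction) = @_;

    # Build the hash once, so each membership test is a single lookup
    # instead of a pass over the whole word list.
    my %valid = map { $_ => 1 } @$words;

    my @found;
    for my $word (@$words) {
        next if length($word) < 2;
        my $rest;
        if ($direction eq 'F') {
            next unless substr($word, 0, 1) eq $letter;
            $rest = substr($word, 1);        # drop the first letter
        }
        else {
            next unless substr($word, -1) eq $letter;
            $rest = substr($word, 0, -1);    # drop the last letter
        }
        push @found, $rest if $valid{$rest};
    }

    # Same ordering as the sort step in the script: length, then alphabetical.
    my @sorted = sort { length($a) <=> length($b) or $a cmp $b } @found;
    return @sorted;
}

# Tiny made-up word list, for illustration only.
my @words = qw(AAH AH AB ABA BA);
print "$_\n" for find_hooks(\@words, 'A', 'F');
```

The nested loop in the checking step compares every valid word against every possible word, so its cost grows with the product of the two list sizes. With a hash, each check is roughly constant time. Using `substr` instead of `$'` and `` $` `` may also help, since those match variables are documented as slowing down regex matching in older versions of Perl.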

Any speed tips? Or is that much processing just going to take a long time? (I split the job among 4x2GHz computers, and it took about 8 hours to generate all of the lists.)

Source, inputs and outputs

In reply to speedier text matching by lemnar
