Does anyone here have suggestions for scrabblehooks, especially for speeding it up? I'm very much a Perl n00b. When a friend of mine presented me with this text problem I thought of Perl and struggled through writing it, but it turned out much slower than I had hoped. Any suggestions?
#!/usr/bin/perl
$op = shift @ARGV;
# load a list of valid words
unless ($op =~ /^S$/) {
open WORDLIST, shift @ARGV;
while (<WORDLIST>) {chomp;push @words, $_;}
close WORDLIST;
} else {
while(<>){chomp;push @correct, $_;}
}
# find a list of possible words from the list of valid words
if ($op =~ /^P(.{0,2})(.)(.{1,})$/) {($otherop,$direction,$letter)=($1,$2,$3);
for (@words) {
if ($direction eq "F") {
if (/\b($letter)/) {push @possible, $';}
} elsif ($direction eq "B") {
if (/($letter)\b/) {push @possible, $`;}
}
}
if (!$otherop) {for(@possible){print "$_\n";}}
}
# check the list of possible words against the list of valid words
if ($op =~ /^(.?)C(S*)/) {$otherop = $2;
if (!$1) {while(<>){chomp;push @possible, $_;}}
for $word (@words) {
for(@possible) {
if ($word eq $_) {push @correct, $_;}
}
}
if (!$otherop) {for(@correct){print "$_\n";}}
}
# sort the list of correct words by size (small to large) and then abc
if ($op =~ /^(.*)S/) {
if ($op =~ /S$/) {while(<>){chomp;push @correct, $_;}}
@sorted = sort { length $a <=> length $b or $a cmp $b } @correct;
for(@sorted) {print "$_\n";}
}
The problem is this: the input is a list of valid words, all letters in the same case, one word per line. The output is the set of words that are still valid after a given letter is removed from the beginning. For example, searching for a front hook of A in a list containing AAH and AH returns AH, because AAH minus the leading A is AH, which is in the list. Expanded to 26 letters and 54 lists, this took a significant amount of time on a 1.6M list of valid words.
Any speed tips? Or is that much processing just going to take a long time? (I split the job across four 2GHz computers and it took about 8 hours to generate all of the lists.)
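One likely culprit is the nested loop in the check step, which compares every valid word against every candidate. A minimal sketch of a hash-based alternative follows. It assumes the hook letter is a literal string rather than a regex, and `find_hooks` is a hypothetical helper name, not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the words that
# remain valid after removing $letter from the front ('F') or back ('B').
# Assumption: $letter is matched as a literal string, not as a regex.
sub find_hooks {
    my ($words, $letter, $direction) = @_;

    # Build the lookup table once, so each membership test is a single
    # hash probe instead of a scan over the whole word list.
    my %valid = map { $_ => 1 } @$words;

    my $n = length $letter;
    my @found;
    for my $word (@$words) {
        next if length($word) <= $n;    # nothing would be left after removal
        my $rest;
        if ($direction eq 'F') {
            next unless substr($word, 0, $n) eq $letter;
            $rest = substr($word, $n);
        }
        else {
            next unless substr($word, -$n) eq $letter;
            $rest = substr($word, 0, -$n);
        }
        push @found, $rest if $valid{$rest};
    }

    # Same ordering as the sort step: by length, then alphabetically.
    return sort { length($a) <=> length($b) or $a cmp $b } @found;
}

# Usage with the example from the post: AAH minus the front A is AH.
my @hooks = find_hooks([qw(AAH AH HA)], 'A', 'F');
print "$_\n" for @hooks;
```

With the hash in place, the work per letter grows roughly with the number of words instead of with the product of the two list sizes, which is where I would expect most of the time to go.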
Source, inputs and outputs