If speed is your concern, I would recommend using a pre-built regexp for fast filtering of the candidate words, and a slower exact test against a canonical (sorted letters) form.
Tested, working code:
use strict;
use warnings;
my $dict = '/usr/share/dict/words';
# Do not space-separate the word in this version.
my $scrambled_word = shift
or die "Must specify a word\n";
my $scrambled_length = length $scrambled_word;
my $scrambled_sorted = join '', sort split '', $scrambled_word;
my $pattern = qr{
\A
(?:
[$scrambled_word]{$scrambled_length}
)
\n
\z
}x;
open DICT, '<', $dict
or die "Cannot open '$dict': $!";
while (<DICT>) {
next unless /$pattern/o;
chomp;
my $sorted = join '', sort split '', $_;
next unless $sorted eq $scrambled_sorted;
print "$_\n";
}
close DICT or warn;
| [reply] [d/l] |
I did something similar a few years back. I found that the biggest speed boost came from partitioning the dictionary in advance - making a sepearte file for each word length (ie: 1.letter.words, 2.letter.words, 3.letter.words, etc...) so you don't have to loop over words that obviously don't match.
It's obviously not worth it if you're just going to run it once, but if you this is somehting you plan on doing again and again...
Another approach I've had good luck with on a different type of word problem was to divide the dictionary up based on the first character in each word. That wouldn't help in your case, but a varient would...
Assuming your dictionary looks like this...
at
bat
cat
car
tab
...reformat it so it looks like this...
abt bat
abt tab
acr car
act cat
at at
...then when you get a scrabbled word, sort the letters, and look it up (straight substring match, no regex needed) to find all the matches. To decrease the search space, you can partition the dictionary by first letter of the "word" (first letter when the letters are sorted that is) or by the length of the words, or by both.
| [reply] |
Heh, I wrote something like this for the Neopets word scrambler game, just for the heck of it. I'm on a quest to see how many Neopets games I can achieve Grand Master on using Perl scripts. The interesting thing is that you have to produce all words that can be made from the letters, so it's not as simple as just sorting the letters in order and finding a match. Instead, I make sure that the letter counts for the dictionary word are equal to or smaller than the letter counts for the letter set (plus blanks), then sort the result set by size and alphabetically. I get my results quite fast (a fraction of a second usually), and they'd probably be faster if I went to the bother of eliminating the words larger than 6 letters from the dictionary, since the Neopets game always gives you 6. Also, my algorithm is set up to support blanks, which isn't necessary either unless you want to use this for Scrabble.
use strict;
use warnings;
my $letters = 'jboiac';
my ($lcount, $blanks, %lhash, $handle, @matches);
$lcount = length($letters);
$blanks = $letters =~ s/([^a-z])//g;
$lhash{$_}++ for split //, $letters;
open ($handle, 'dict1.dat');
while (<$handle>) {
chomp; push @matches, $_ if scrabble($_);
}
close ($handle);
print join "\n", sort { length($b) <=> length($a) || $a cmp $b } @matc
+hes;
sub scrabble {
return 0 if length($_[0]) > $lcount;
my ($nf, %wlhash);
$wlhash{$_}++ for split //, $_[0];
for (keys %wlhash) {
no warnings;
return 0 if $lhash{$_} < $wlhash{$_} &&
($nf += $wlhash{$_} - $lhash{$_}) > $blanks;
}
return 1;
}
Returns:
jacobi
ciao
abc
bio
boa
cab
cob
jab
jib
job
(abc wouldn't be allowed in Scrabble, but the Neopets game has some odd words in it, including names of Neopets. I had to add these manually as I went along using a second script)
| [reply] [d/l] [select] |
This is a late comment, but I hope not out of place.
My daughter and I were working on a Jumble puzzle with a word we just couldn't crack. So, of course, I decided to throw the Perl wrench at it. As I thought about the algorithm, regular expressions just seemed like a natural choice.
I have to say this solution amazed me (I admit it, I'm not an experienced or professionally educatued programmer). I mean look at: 12 lines! And the solution -- how clever! Rather than looking for a match, it eliminates non-matches. I've learned something here.
OK, so I needed to solve a jumble, not find all possibilities. Two quick changes to one line, and I'm off:
next unless ( length $word == $#ARGV + 2 );
Great post, great solution. Thanks jonnybe | [reply] [d/l] |