The original word is read off the command line and its letters are used to build a character-class regex. The words in the dictionary files are processed one by one and matched against the regex. This acts as a filter that excludes any word not made up of letters from the original word, with an optional 'extra letter'.
Words that pass the first filter are then frequency checked, to make sure no letter is overused, before they are output. If the extra letter was not needed, the frequency check allows one of the original letters to be repeated in its place as the wildcard.
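This worked 'pretty quickly' (TM) against a 200,000 line dictionary.

    use strict;
    use warnings;

    # The scrambled letters are the first command-line argument.
    my $data = lc shift @ARGV;

    # Filter: letters from $data only, with at most one 'extra' letter.
    my $regex = qr/^([$data]*)([^$data]?)([$data]*)$/;

    # How often each letter occurs in the original word.
    my %letterfrequency;
    $letterfrequency{$_}++ foreach split //, $data;

    OUTER:
    while (my $word = <>) {
        chomp $word;    # chomp separately so a last line without a newline is not skipped
        next unless lc($word) =~ $regex;

        # If no extra letter was matched, one original letter may repeat instead.
        my %frequency;
        my $repeat = length($2) ? 0 : 1;
        foreach (split //, $1 . $3) {
            if (++$frequency{$_} > $letterfrequency{$_}) {
                next OUTER unless $repeat--;
            }
        }
        print "$word\n";
    }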
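To see the two stages on concrete input, here is a minimal sketch that wraps the same checks in a function. The name `can_make` and the sample words are made up for illustration and are not part of the script above; it also assumes the letters are plain a-z, since anything else would need escaping inside the character class.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns 1 if $word can be
# built from the letters in $letters plus at most one wildcard letter.
sub can_make {
    my ($letters, $word) = @_;
    ($letters, $word) = (lc $letters, lc $word);

    # Stage 1: character-class filter allowing one letter outside the set.
    return 0 unless $word =~ /^([$letters]*)([^$letters]?)([$letters]*)$/;
    my ($head, $extra, $tail) = ($1, $2, $3);

    # Stage 2: frequency check. If no outside letter was used, the wildcard
    # is still free and may cover one repeated original letter.
    my %available;
    $available{$_}++ for split //, $letters;
    my $wildcard = length($extra) ? 0 : 1;

    my %used;
    for my $letter (split //, $head . $tail) {
        next if ++$used{$letter} <= $available{$letter};
        return 0 unless $wildcard--;
    }
    return 1;
}

for my $word (qw(cat cart coat tact catty)) {
    printf "%-6s %s\n", $word, can_make('cat', $word) ? 'ok' : 'rejected';
}
```

With the letters 'cat', the words cat, cart, coat and tact are accepted (tact uses the wildcard for its second 't'), while catty is rejected because it needs both an extra 'y' and a second 't'.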
In reply to Re: literati cheat / finding words from scrambled letters by inman, in thread literati cheat / finding words from scrambled letters by sulfericacid