Scrabble Word Regex Challenge

I'm always tinkering with Scrabble-like puzzles, and writing scripts to help me solve them.

One basic need is to filter my dictionary file to remove impossible words (or find impossible words), based on the number of tiles for each letter. Assume whatever dictionary structure you like, as well as the tile count list or structure of your choice. Assume blank tiles are a non-alpha character, such as '#'.

Challenge 1: Ignoring blank tiles, write a code snippet to efficiently match possible (or impossible) words in the dictionary.

Challenge 2: Modify #1 for blank tiles.

Bonus Challenge 3: Design a dictionary structure that enables efficient searches for puzzles, while minimizing space. For example, I have 5 tiles in @my_tiles, a given board partial row @this_row, which will have at least 1 tile placed, and possibly more, along with at least 1 blank space. The dictionary structure should make it easy to find all words that will fit in this partial row, using tiles from @my_tiles.

-QM
--
Quantum Mechanics: The dreams stuff is made of

Comment on Scrabble Word Regex Challenge Select or Download Code

Replies are listed 'Best First'.
Re: Scrabble Word Regex Challenge by kyle (Abbot) on Sep 12, 2007 at 16:35 UTC
Here's a start, not heavily tested... Here's how I build my dictionary: `sub word_to_struct { my ( $word ) = @_; my %out; $out{$_}++ for split //, $word; return \%out; } sub read_dict { my ( $dict_file ) = @_; my %out; $dict_file \|\|= '/usr/share/dict/words'; open my $dict_fh, '<', $dict_file or die "Can't read '$dict_file': $!\n"; while ( my $word = <$dict_fh> ) { chomp $word; if ( $word =~ m{ \A [a-z]+ \z }xmsi ) { $out{$word} = word_to_struct($word); } } close $dict_fh or die "close failed: $!\n"; return \%out; }` [download] That produces a data structure like this: `... 'expletive' => { 'l' => 1, 'e' => 3, 'p' => 1, 'v' => 1, 'x' => 1, 'i' => 1, 't' => 1 }, 'measles' => { 'l' => 1, 'e' => 2, 'a' => 1, 'm' => 1, 's' => 2 }, ...` [download] So then I can build a list of possible words this way: sub possible { my ( $tiles_ref, $dict_ref ) = @_; my @out; my %tile_count; $tile_count{$_}++ for @{$tiles_ref}; my $longest_word = scalar @{$tiles_ref}; WORD: foreach my $word ( keys %{$dict_ref} ) { next WORD if length $word > $longest_word; foreach my $letter ( keys %{$dict_ref->{$word}} ) { next WORD if ! defined $tile_count{$letter}; next WORD if $dict_ref->{$word}{$letter} > $tile_count{$le +tter}; } push @out, $word; } return @out; } # example: use Data::Dumper; print Dumper( possible( [ qw( a b c d e ) ], read_dict() ) ); [download] It seems to me that the big `foreach` could be some kind of unholy nested grep, but I haven't done that. Note also that I've rejected the idea of building a regex to match words, and I've made no attempt to make the dictionary compact. Now that I've written this much, I'm wondering if this is a homework problem.	[reply] [d/l] [select]
Re^2: Scrabble Word Regex Challenge by QM (Parson) on Sep 18, 2007 at 14:34 UTC
Nope, it's not homework, it's a hobby. -QM -- Quantum Mechanics: The dreams stuff is made of	[reply]
Re: Scrabble Word Regex Challenge by goibhniu (Hermit) on Sep 12, 2007 at 17:31 UTC
I think the problem for Challenge 1 is discussed in Simple regex wordlist question, with my solution in Re^3: Simple regex wordlist question after a head start by ikegami. I humbly seek wisdom.	[reply]
Re: Scrabble Word Regex Challenge by Anonymous Monk on Sep 24, 2007 at 15:06 UTC
I think that using a regexp to solve this is using a sledgehammer to crack a nut. It can be done easily for problem (1) as long as the regexp packge you use allows for numbered repetition counts, eg "a(7)" meaning 0 to 7 letter a's. You'ld need to presort the word before testing it. The obvious implementation using an array of letter counts will be faster and more elegant. (A regexp solution may well mean shorter source code, but that equates with elegance only if you ignore what it's doing behind the scenes...) I'm reasonably sure that a regexp solution allowing for two blanks will explode in size. By the way on (3) there are two classic solutions that will be hard to beat: a) use a DAWG (Directed Acyclic Word Graph) or depending on space/speed requirements, one of the tweaked versions of same; and b) use an 'anagram dictionary' where your wordlist is keyed by an alpha sort of the letters, eg 'elpr => perl' in combination with a permutation generator. If you can come up with anything new I'll be happy to list the code on the wordgame-programmers web site :-) regards Graham Toal (www.gtoal.com/wordgames/scrabble.html)	[reply]