smgfc has asked for the wisdom of the Perl Monks concerning the following question:

Well, in an attempt to kill my friends in word racer I set out to write a boggle solver. Before I got into the complex shapes used as word racer boards, I decided to figure out how to solve the simple 4X4 boggle board. I initally started by importing a dictionary and doing text preprocessing, but that seemed backward, and decided to concentrate on extracting words from the board first.

The first thing I did was create a sample program with a built in board and no UI or any superflous code. I decided to use a Depth First Search to find the possible strings of characters, and recursion combined with an array of moves to weed out all the possibilites. Essentially all that happens is the dfs sub starts with one character on the board and iterates through array moves, apply each transformation to the current x,y coordinate and branching a new dfs sub with the new coordinates.

With the current implementation there are a number of issues. The most pressing is a error I just don't know how to solve. When the dfs sub assigns whatever was passed to it I get this error: Odd number of elements in hash assignment at dfsboggle.txt line 30.. Since the only hash in the program is visited, and visited hasn't even been touched yet I just don't get it.

Another, more long term issue, has to do with memory. Every time I call dfs after the initial call I create a new visited hash to deal with the different branches and their respective visited nodes/characters. The same thing happens with the word scalar. I suppose, you could just pass a reference and at the end of each branch clear both visited and word, but if the program ever gets updated to do process the different branchs simultaneously then this will break. I dont know how much multi-threading would speed this up, if it is possible, or how to do it so the point is moot

Beyond that, obviously array board isn't passed by reference, but this is a sample program. I plan on keeping the moves in a closure with the dfs sub to avoid having to pass it. If that isnt a good idea I would love to here why, since I havent seen it done before. I haven't been able to test this because of the aforementioned problem, so the dfs sub could have problems that would output wrong answers. Any help or different approaches would be greatly appreciated. Code Below

#!/usr/bin/perl -w use strict; my (%visited); my ($word) = ""; my (@board) = ( ['a','b','c','d'], ['e','f','g','h'], ['i','j','k','l'], ['m','n','o','p'] ); my (@moves) = ( [0, -1], #x + 0, y + -1 [1, -1], [1, 0], [1, 1], [0, 1], [-1, 1], [-1, 0], [-1, 1] ); dfs(0, 0, %visited, $word); sub dfs { my ($x, $y, %visited, $word) = @_; my ($end) = 1; # end is used to determine if there are anymore cha +racters in the string before printing $word $visited{$x . $y}++; $word .= $board[$y][$x]; foreach (@moves) { next if $x + $$_[0] < 0 or $x + $$_[0] > 3 or $y + $$_[1] < 0 or $y + $$_[1] > 3 or exists $visited{ ($x + $$_[0]) . ($y + $$_[1]) }; #make sure the transformed coordinates are valid and ne +w $end = 0; dfs($x + $$_[0], $y + $$_[1], %visited, $word); } print $word if $end == 1; }

Replies are listed 'Best First'.
Re: Depth First Search and Boggle
by BrowserUk (Patriarch) on Aug 15, 2002 at 03:48 UTC

    In line 30 my ($x, $y, %visited, $word) = @_ you are assigning to a hash. If @_ contains 3|5|7...elements, the first two will go into $x, $y and the rest (odd number) will be assigned to %visited and $word will always be undefined.

    You should either put $word before %visited or pass in a hashref.


    What's this about a "crooked mitre"? I'm good at woodwork!
      I understand that, but i never pass 3|5|7... elements to dfs. I either call it

      dfs(0, 0, %visited, $word)

      or

      dfs($x + $$_[0], $y + $$_[1], %visited, $word)

      so there shouldnt be any problem with odd numbers of elements getting passed to dfs. Beyond that passing by reference would create alot more work to maintain seperate visited and work variables for the different branches in the search.

        The point is that all your parameters, including %visited get flattened and passed as a single list:

        ($x, $y, key1,val1,key2,val2,...,$word)

        Inside dfs(), Perl has no way to tell that the last value ($word) is not an element of the list that must be assigned to %visited, so it attempts to assign that too. Hence the odd number of elements.

        $word will never be assigned a value.

        Swap the order of %visited and $word inside dfs() and when you call dfs() and your code will (probably) work!


        What's this about a "crooked mitre"? I'm good at woodwork!
        Here's a hint.
        print "'$_'\n" foreach @_ at the start of your sub (before the assignment).
        See what is happening here? You will get a printout like:
        '0'
        '0'
        ''
        hmm.. not quite what you wanted I think. You are only passing 3 defined values into the sub. Two 0's and one "". Because your hash is undefined, it doesn't show up in the @_ list.

        Just looking at this bit of code, it looks like you are trying to do the "right" thing by use'ing strict and passing your variables around to your subroutine. The thing is, if your variables have no value, then there is no reason to pass them. In fact, looking at it,there is absolutely no reason to define %visited or $word in the "main" code at all

        In this bit of code, I would personally work more like
        ... dfs(0,0); ... sub dfs { my ($x,$y,$word,%visited) = @_; #no error here! ... process code ... dfs($x,$y,$word,%visited); #recursion call }
        See the differences? First, because we only pass 2 defined values in the first call to dfs, we only assign to $x and $y. This is fine, because we don't use $word or %visited before we assign to them. Hence, we aren't using an undef'd variable. Nor are we assigning "" to the hash. Second, in subsequent calls, we pass the hash last

        BrowserUk is completely right in this. Using an assignment like in your code, you will almost always end up with an odd value assignment. You could almost call an assignment to a hash (or an array for that matter) greedy, because it will always take up the "rest" of @_, if you need to pass variables and a hash or an array, always put the hash/array last in the assignment list. If you need to pass two lists (eg 2 arrays, or a hash and an array) you MUST pass by reference.
Re: Depth First Search and Boggle
by Abigail-II (Bishop) on Aug 15, 2002 at 09:26 UTC
    I think your algorithm has some problems. First, all words have to start with the same letter, as you only call dfs with 0, 0 initially, the rest just after adding a character. Second, you only print out a word if you cannot do a move - but of course prefixes of such a string are words too. But the big issue has to do with combinatorial explosion. The number of possible words found this way grows exponentially in the size of the board. It's doable for a 4x4 board, the number of possible words will be less than 2^16, but it will grow rapidly when you increase the size. You should take your dictionary and preprocess it to get all possible prefixes of valid words. And while you create your words while walking the board, stop as soon as your constructed word so far is not a valid prefix. No point in continueing if the first two letters are "QX". ;-)

    Abigail

      Maybe I should have included this in my post. Right now I have a hash of a hash of hash... where each character maps to a key. ex.

      hello equals $hash{h}{e}{l}{l}{o} = {}.

      I planned on walking through this hash as I went through the dfs sub, but wanted to get the sub correct before I made it more complicated. Same thing with the singular call to dfs. I figured it would easier to examine the output if I started from one point. I could have wrapped the initial dfs call inside two for loops, but in the intrest of getting dfs working I decided to start from one point only.

      Update: Here is my preprocess code, if you have any suggestions.

      my ($dict) = "/usr/share/dict/words"; open (FILE, $dict) or die ("Can't open $dict: $!\n"); while (<FILE>) { chomp; next if length() < 3; @char = split //, $_; $pointer = \%{$word_tree{$char[0]}}; for (1 .. $#char) { $pointer = \%{$$pointer{$char[$_]}}; } } close (FILE);
      Another suggestion to improve your algorithm (especially if there is any kind of time limit ...such as your life expectancy :) is to look for shorter words first (breadth first search). No use chugging through permutations of letters trying to find words that start with arteriolosclero when you could move on and find a, as, ass, assai, assail, and assailant. Hope this helps. --Phil
Re: Depth First Search and Boggle
by Abigail-II (Bishop) on Aug 15, 2002 at 16:38 UTC
    It's really simple to write a quick algorithm. I whipped up the program below in about 15 minutes. It's doing the preprocessing of the dictionary for valid prefixes each time - it would be better if you do the preprocessing for prefixes (and valid words) once, using a tied hash and a dbmfile.

    Anyway, here's the code:

    #!/usr/bin/perl use strict; use warnings 'all'; my $dict = "/usr/share/dict/words"; my %prefix; my %valid; my %found; my $board; sub preprocess; sub init_board; sub find_words; preprocess $dict; init_board; foreach my $x (0 .. @$board - 1) { foreach my $y (0 .. @{$board -> [$x]}) { find_words $x, $y, "", []; } } sub preprocess { my $file = shift; open my $fh => $file or die "Failed to open $file: $!\n"; local $_; LINE: while (<$fh>) { chomp; $_ = lc; next if /[^a-z]/; $valid {$_} = 1; while (length) { next LINE if $prefix {$_} ++; chop; } } } sub init_board { local $_; push @$board => [lc =~ /[a-z]/g] while <DATA>; } sub find_words { my ($x, $y, $word, $seen) = @_; return unless $x >= 0 && $y >= 0 && exists $board -> [$x] && exists $board -> [$x] [$y] +&& !$seen -> [$x] [$y]; $word .= $board -> [$x] [$y]; return unless $prefix {$word}; print "$word\n" if $valid {$word} && !$found {$word} ++; $seen -> [$x] [$y] = 1; foreach my $dx (-1 .. 1) { foreach my $dy (-1 .. 1) { find_words $x + $dx, $y + $dy, $word, $seen if $dx || $dy; } } } __DATA__ a b c d e f g h i j k l m n o p q r s t u v w x y
    Running this gives:
    bag
    chide
    fag
    gab
    hi
    hid
    hide
    him
    hint
    in
    mid
    no
    to
    ton
    tons
    up
    
    It finds the words in just over a second and a half. And a small test shows it will work on non rectangular boards as well. And it's real easy to make it work on a torus.

    Abigail

      It would be even better if there was a way to make it do plurals.
        Either use a dictionary that already has plurals in it, or use the following preprocess function:
        use Lingua::EN::Inflect qw /PL/; sub preprocess { my $file = shift; open my $fh => $file or die "Failed to open $file: $!\n"; local $_; LINE: while (<$fh>) { chomp; $_ = lc; next if /[^a-z]/; WORD: foreach ($_, PL ($_)) { $valid {$_} = 1; while (length) { next WORD if $prefix {$_} ++; chop; } } } }
        Abigail

      Note to anyone testing this algorithm out, it works well except for a small bug.

      Because the $seen matrix is being passed into sub-calls by reference, any changes to it will still be there when the recursive find_words call returns. That messes up further calls to find_words because the seen-matrix is lying, in a sense.

      To fix it, just add this line as the last line of the find_words sub (outside of the two for-loops), so that the seen-matrix is properly reset for a given call's ancestors:

      $seen->[ $x ][ $y ] = 0;

Re: Depth First Search and Boggle
by cLive ;-) (Prior) on Aug 15, 2002 at 04:38 UTC
    You've been hanging out on worldwinner.com, haven't you!

    LoL

    cLive ;-)

    --
    seek(JOB,$$LA,0);