Boggle again

A while ago I posted the beginnings of a boggle solver to SoPW. It used depth first search and recursion to go through the board finding all the possible strings. After some discussion, abigail-II posted a working boggle solver that used the depth first search and some preprocessing to find the words in the board. After a brief perl sabbatical, I went back through my old posts and decided to redo some of them. This is part of that effort.

As opposed to dfs, this use an iterative approached based on the location of characters, and saves time by finding words while reading from the dictionary. First it splits $input into @data and pushes those values into @board, which is really a copy of one of the boards in %layout. While it does this it is creating %letter_position, where the key is the character and the value is an array holding the positions where that character can be found. Then the dictionary is opened, the word is validated and split into @word, and the search for the word begins.

First you start a for loop that goes through the index of @word; inside this loop you foreach through the co-ordinates of the character $word[$index] (foreach (@{ $letter_position{$word[$index]} }) {). At the beginning of that loop you add that position ($x_one, $y_one) to %path to prevent the reuse of a character later . Inside that loop, you begin another foreach through the co-ordinates of the next character $word[$index+1]. If the second position ($x_two, $y_two) isnt touching the first ($x_one, $y_one) it is added to %invalid for that particular index i.e. the character in that position cant possibly be used to represent $word[$index+1]. If that position is touching the first, you jump to the next index.

Wow that is a bad explanation. Running this with the debugger will give you a better idea of how it works. Here is an example of how it finds the word fink in a board:


board:
    0 1 2 3
    -------
  0|a b c i
  1|e f i h
  2|i j k l
  3|m n o p

word:
  fink

index: 1
  letter_one: f
  x_one: 1
  y_one: 1
    letter_two: i
    x_two: 3
    y_two: 0
      letter_two not touching letter_one
      3 . 0 added to %invalid for index 2
      next letter_two
    letter_two: i
    x_two: 2
    y_two: 1
      letter_two touching letter_one
      index++
      next index

index: 2
  letter_one: i
  x_one: 3
  y_one: 0
    3 . 0 invalid for index 2
  x_one: 2
  y_one: 1
    letter_two: n
    x_two: 1
    y_two: 3
      letter_two not touching letter_one
      1 . 3 added to %invalid for index 3
      next letter_two
    no more n's
    2 . 1 added to invalid for index 2
    index--
    next index

index: 1
  letter_one: f
  x_one: 1
  y_one: 1
  letter_two: i
    x_two: 3
    y_two: 0
      3 . 0 invalid for index 2
    x_two: 2
    y_two: 1
      2 . 1 invalid for inde 2
    x_two: 0
    y_two: 2
      letter_two touching letter_one
      index++
      next index

index: 2
  letter_one: i
  x_one: 3
  y_one: 0
    3 . 0 invalid at index 2
  x_one: 2
  y_one: 1
    2 . 1 invalid at index 2
  x_one: 0
  y_one: 2
    letter_two: n
    x_two: 1
    y_two: 3
      letter_two touching letter_one
      index++
      next index

index: 4
  letter_one: n
  x_one: 1
  y_one: 3
    letter_two: k
    x_two: 2
    y_two: 2
      letter_two touching letter_one
      index++
      next index

This example skips over some of the little details, and doesnt explain exactly how it is implemented, but it should be enough to understand the code below. There are some places for performance improvement. When you validate the the word from the dictionary you could make sure that $input contains the number of letters in the word i.e. skip the word if the input has only one a and the word is arab. Also, you could create a hash of invalid prefixs as you find invalid words and skip those words i.e. if you cant find child, there is no reason to search for childhood. This isn't yet finished code (no way to chose layout or provide input from the command line), but the core of the program is there. Anyway... here is the code:

#!/usr/bin/perl -w

use strict;

#####
# Layouts of boggle/word racer boards
#####

my (%layout) = (
    "word_racer_1" => [
                        [1,1,1,1],
                        [1,1,1,1],
                        [1,1,1,1],
                        [1,1,1,1]
                      ],
    "word_racer_2" => [
                        [0,0,1,1,0,0],
                        [0,1,1,1,1,0],
                        [1,1,1,1,1,1],
                        [1,1,1,1,1,1],
                        [0,1,1,1,1,0],
                        [0,0,1,1,0,0]
                      ],
    "word_racer_3" => [
                        [1,1,1,1,0,0],
                        [1,1,1,1,0,0],
                        [1,1,1,1,1,1],
                        [1,1,1,1,1,1],
                        [0,0,1,1,1,1],
                        [0,0,1,1,1,1]
                      ],
    "word_racer_4" => [ 
                        [1,1,1,1,1,1],
                        [1,1,1,1,1,1],
                        [1,1,0,0,1,1],
                        [1,1,0,0,1,1],
                        [1,1,1,1,1,1],
                        [1,1,1,1,1,1]
                      ]
);

#####
# get/check configuration
#####

my ($sqrt_2) = sqrt(2);

my ($min_length) = 3;

my ($which_layout) = "word_racer_1";

die "No such layout $which_layout\n" unless exists $layout{ $which_lay
+out };

my (@board) = @{$layout{ $which_layout }};

my ($input) = "tnlehrayvsrtjesn";

die "Malformed input\n" if $input =~ /[^a-z]/;

my (@data) = split //, $input;

#####
# Initialize board
#####
my ($x, $y, %letter_position);

for ($y = 0; $y <= $#board; $y++) {
    for ($x = 0; $x <= $#{$board[$y]}; $x++) {
        die "Data smaller then layout\n" if $#data == -1;
        next if $board[$y][$x] == 0;

        $board[$y][$x] = shift @data;

        push @{ $letter_position{ $board[$y][$x] } }, [$x, $y];
    }
}
die "Data longer then layout\n" unless $#data == -1;

#####
# Open the dictionary
#####

my ($file_dict) = "/usr/share/dict/words_processed";

open (DICT, "<", $file_dict)
    or die "Can't open $file_dict: $!\n";

#####
# Check wether the word is in the board
#####

my (@word, $index, %invalid, %notouch, %path);


while (<DICT>) {
    chomp;
    $_ = lc;
    next if length() < $min_length or /[^$input]/;
    @word = split //;

    %invalid = ();
    %path = ();

    INDEX: for ($index = 0; $index <= $#word; $index++) {

        if ($index == $#word) {
            print join '', @word;
            print "\n";
            last;
        }

        foreach (@{ $letter_position{$word[$index]} }) {
            my ($x_one, $y_one) = @{$_};
            last if exists $invalid{$index}{$x_one . $y_one};
            $path{$x_one . $y_one} = 1;

            foreach (@{ $letter_position{$word[$index+1]} }) {
               my ($x_two, $y_two) = @{$_};

               next if exists $notouch{$x_one . $y_one . $x_two . $y_t
+wo};
               next if exists $invalid{$index+1}{$x_two . $y_two};
               next if exists $path{$x_two . $y_two};

               my ($dist) = sqrt( (($x_one - $x_two)**2) + (($y_one - 
+$y_two)**2) );

               unless ($dist == 1 or $dist == $sqrt_2) {
                   $notouch{$x_one . $y_one . $x_two . $y_two} = 1;
                   next;
               }
               next INDEX;
            }

            $invalid{$index}{$x_one . $y_one} = 1;
        }
        last;
    }
}

#####
# Cleanup
#####
close (DICT);
[download]

Update 2/24/02 21:22: escaped html entities and changed mechanism used to tell if letter_one and letter_two are touching from exists $moves{($x_one - $x_two) . ($y_one - $y_two)} to calculating the distance between the points.
Update 2/24/02 22:50: There is a problem that is either caused by the last update or the program in general. It is missing a couple words on some boards, and adding a ton of words that arent there. Hopefully it will be fixed soon.
Update 2/25/02 01:25: I fixed part of the program, and cleaned up the code, but there are still some kinks to work out, and I havent updated the example above.

Comment on Boggle again Download Code