This post continues with the earlier post in which I determined which cities were right next to a particular city. Concretely, we determined that palo alto and atherton were adjacent to menlo park.

This post extends that post in 2 ways. First, we create a slightly harder problem so that we can see some more XML::Twig code. Then I engage in a long-winded tangent about Perl and problem-solving engines.

Now we want to find which cities are 2-away from menlo park. Or put another way, those which are adjacent to palo alto and atherton. In our case, we want to omit backtracking on our path. So even though menlo park is in fact 2-away from menlo park (just go to atherton and come back), we will omit these cases.

Here is a program which solves this problem:

use XML::Twig;

my $t = XML::Twig->new(PrettyPrint => 'record');

$t->parsefile('adj.xml');
my $root = $t->root;

my @pair = $root->children;

my $city = 'menlo park';

sub candidate_generator {
    my ($search_text, @data) = @_;
    grep { grep { $_->text eq $search_text } $_->children  } @data;
}

my @adj = candidate_generator($city,@pair);

my @next_city = map { 
    $_->first_child->text ne $city 
    ? $_->first_child->text : $_->last_child->text } @adj;

print "cities next to $city == @next_city";

# -------- no changes up to here. Here is the new stuff:

# now we use a slightly different map and grep
my @next_to_next = 
    grep { $_->first_child->text ne $city and 
           $_->last_child->text ne $city }
    map { candidate_generator($_, @pair) } @next_city;

map { $_->print } @next_to_next;
[download]

If you followed the first program, then this one should really require no explanation other than an analysis of the efficiency of our means of calculating 2-away cities:

my @next_to_next = 
    grep { $_->first_child->text ne $city and 
           $_->last_child->text ne $city }
    map { candidate_generator($_, @pair) } @next_city;
[download]

Because we map() then grep(), we are doing what is known as generate-and-test: we create a bunch of candidates and then later filter them out based on a criteria.

a new Twig API function: delete()

But since generate and test is wasteful, let's take at another more efficient way of writing the source code starting from the comment Here is the new stuff:

$_->delete for (@adj);

my @next_to_next = 
    map { candidate_generator($_, @pair) } @next_city;

map { $_->print } @next_to_next;
[download]

So here we prune the XML-tree of the records which were 1-away from our target city and then our candidate generator can only generate the correct candidates.

But actually this change in the program is more than just efficient. We can simply run this loop N times when we want to find the cities which are N+1 hops away.

But this is the beauty of using this approach, we can cut away the search space with a single call to delete(). We could also do all sorts of other funky things with our tree just as easily...

Expert systems: A Completely Different Idea on Solving this Problem

note: there is some more Perl from this point out, but no more on XML::Twig so you can quit reading now if long-winded philosophy on problem solving and software engineering is not your bag.

So far, I have tackled this problem the Perl way - I dove right in and didn't quit coding until I was done. However, I would like to propose another way to handle this problem, and that way is what is known as forward-chaining. A very old, very stable forward-chaining rules-based system is CLIPS.

Forward chaining is where you start with a set of facts and then you build a bunch of rules which fire if their preconditions meet some of the facts on the stack. Typically when these rules fire, they assert new facts and retract some old ones. And then some other rules notice these new asserted facts on the stack and then go through the same assert/retract cycle until finally some rule notices that you have reached your goal and announces it and terminates the program.

This is opposite of a backward-chaining system like Prolog, where you start with your goal and attempt to find rules which match the goal and search for facts with satisfy these rules.

Architecture of CLIPS

It seems that for any problem domain, decoupling this from that seems to provide great benefits at some point. For CLIPS, time has shown that decoupling a problem solver into the following things works well:

fact database. Facts are statements about the world. In this sample Perl program program the facts are spread all over the place. The program started with all facts in an XML database. Then I ran the candidate generator and then had a set of facts about cities 1-away from menlo park. But these were not in an XML database, but instead in an array called @adj then I ran a couple of more maps and greps and had some more facts stored in another array called @next_to_next In a forward chaining rule system such as CLIPS there is a centralized and limited way to change and access the state of world and that is through (assert), (retract) and (facts)
the rule database - a rule is something which fires when it notices that certain facts are true. For example in a weather system, you would have rules to recognize humidity, air pressure, and so forth. In a human resource expert system you would have rules to recognize credit ratings, grade point average and so forth. This might be better known as a reactive system. Again, CLIPS has a very fixed way to creating rules, using a (defrule) construct. In my program my rules were never explicity mentioned, they were in candidate_generator() and the two map and grep statements.
And oftentimes in Perl programs, the guiding logic is not necessarily extracted out and placed somewhere. Instead you might see a policy for a system in an if-then somewhere:
```
if ($file !~ /p(\d{6,7})_(\w+)_(\w+).zip/) {
  then die "$file violates the companies syntax for patch filenames";
}
[download]
```
However, as mentioned in the book "Data Munging with Perl" by Dave Cross, it is much better to place your business logic in a module to facilitate re-use and also so that all system logic can be edited in a single place:
```
System::Logic::valid_patchfile(File => $file, DieOnErr => 1);
[download]
```
The rule selection engine. The rule selector is quite complex in advanced systems such as CLIPS because after awhile, in any complex rules system, there will be conflicts: two rules both match the current fact base but only one can fire on the next step. CLIPS has about 6 or 7 conflict resolution strategies. Some will choose a rule based on whether the facts it used are older than another. Others allow rules to be assigned a saliency and the selector will choose the rule with the highest saliency.

The big win with this type of program is that you don't have to run map() or grep() as I have been doing but instead simply code up a bunch of program-state-recognizers. Also, you never have to write the rule selector. But all in all, until something like that shows its necessity I will just keep solving problems in the most intuitive way possible.

Conclusion

I think it is high-time for me to quit trying to integrate rules-based programming into Perl. Rules systems have means of making statements about objects. Perl has means of creating objects which can be used in a call-by-contract fashion for the same result but simply a different means of expressing it.

Edit: chipmunk 2001-08-20

Comment on Adjacency List Processing Continued + Expert System Discussion Select or Download Code

Replies are listed 'Best First'.
Re: (don't stop) Adjacency List Processing Continued + Expert System Discussion by mrmick (Curate) on Aug 20, 2001 at 18:02 UTC
I think it is high-time for me to quit trying to integrate rules-based programming into Perl. Please don't stop! It's always good to see a different, although according to Perl it may be unconventional, way of solving problems. Besides, Perl programmers have typically been unconventional. :-) ...waiting for part 3 ... Mick	[reply]