in reply to finite automata

This will get you started:

#!/usr/bin/perl -w
use strict;

my $string = 'just another perl hacker';

# generate a lookup hash containing the words in our language
# as the keys, we set the values to 1 with the ++ syntax
# so as to define the keys which are all we use
my %lang;
do{ chomp; $lang{$_}++ }for <DATA>;

# split the test string on whitespace to give us an
# array that will contain all the 'words' where a
# word is a character sequence
my @bits = split /\s/, $string;

# iterate over our word array seeing if they are
# defined in our language specification
for (@bits) {
    die "Word '$_' not in language!\n" unless defined $lang{$_};
}

# if we have not died then all the words are OK
print "Success, \$string only contains words in language!\n";

__DATA__
just
another
finite
automaton
perl
hack
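
A variation on the above, offered as a sketch rather than a drop-in replacement: collect every unknown word instead of dying on the first one. The sub name unknown_words and the inlined word list are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the words in $string that are not
# keys of the lookup hash referenced by $lang.
sub unknown_words {
    my ($lang, $string) = @_;
    # split ' ' skips leading whitespace and splits on runs of whitespace
    return grep { !exists $lang->{$_} } split ' ', $string;
}

# Same word list as the __DATA__ section above, inlined for brevity
my %lang = map { $_ => 1 } qw(just another finite automaton perl hack);

my @bad = unknown_words(\%lang, 'just another perl hacker');
if (@bad) {
    print "Not in language: @bad\n";
}
else {
    print "Success, string only contains words in language!\n";
}
```

With this word list the test string fails on 'hacker', since only 'hack' is in the language.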

doc

print(s<>ecode?scalar reverse :p)

Re: Re: finite automata
by davorg (Chancellor) on Oct 02, 2001 at 16:10 UTC

    I may, of course, be misunderstanding the problem, but it sounds to me like you're not given a complete listing of the language dictionary - simply a set of rules that valid words must obey. In that case pjf's regex-based solution is far more efficient (assuming, of course, that you can represent each of the rules as a regex).
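
    To illustrate the rule-based case, here is a rough sketch. The two rules are invented for the example and are not taken from the original question, and this is not pjf's actual code; it only assumes that each rule can be written as an anchored regex.

    ```perl
    use strict;
    use warnings;

    # Hypothetical rules: a word is valid if it matches any one of them.
    my @rules = (
        qr/^a[a-z]$/,     # 'a' followed by one lowercase letter
        qr/^(?:ab)+$/,    # one or more repetitions of 'ab'
    );

    # Return 1 if every whitespace-separated word matches some rule.
    sub in_language {
        my ($string) = @_;
        WORD: for my $word (split ' ', $string) {
            for my $rule (@rules) {
                next WORD if $word =~ $rule;
            }
            return 0;    # no rule accepted this word
        }
        return 1;
    }

    for my $input ('aa az abab', 'aa ff') {
        my $verdict = in_language($input) ? 'valid' : 'invalid';
        print "'$input' is $verdict\n";
    }
    ```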

    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you don't talk about Perl club."

      As you note, it does rather depend on what the problem is. Depending on the situation and the complexity of the language, a pre-generated hash lookup table can be much faster than a regex solution.

      Consider a simple language whose only valid words are of the form aa, ab, ac, ad ... az. Even including the overhead of generating the hash lookup table, the hash method in the benchmark below is roughly twice as fast as a comparable regex method (about 20 CPU seconds against 45), as well as being far more flexible.

      use Benchmark;

      $string = 'aa ab ac ad ae af ' x 10000 . ' ff';
      @string = split /\s/, $string;

      $regex = <<'CODE';
      &regex;
      sub regex {
          do{return 0 unless /^a[a-z]$/} for @string;
          return 1;
      }
      CODE

      $hash = <<'CODE';
      $hash{$_}++ for aa..az;
      &hash;
      sub hash {
          do{return 0 unless defined $hash{$_}} for @string;
          return 1;
      }
      CODE

      timethese ( 100, { 'regex' => $regex, 'hash' => $hash } );

      __END__
      Benchmark: timing 100 iterations of hash, regex...
            hash: 21 wallclock secs (20.37 usr +  0.00 sys = 20.37 CPU) @  4.91/s (n=100)
           regex: 45 wallclock secs (45.10 usr +  0.00 sys = 45.10 CPU) @  2.22/s (n=100)
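
      Before reading too much into the timings, it may be worth confirming that the two checks agree on the same input. A minimal sketch, assuming the same two-letter language as above; the sub names regex_ok and hash_ok and the sample inputs are made up for this example.

      ```perl
      use strict;
      use warnings;

      # Lookup table for the language aa .. az (26 words)
      my %hash = map { $_ => 1 } 'aa' .. 'az';

      # Each check returns 1 only if every word passed in is in the language.
      sub regex_ok {
          for (@_) { return 0 unless /^a[a-z]$/ }
          return 1;
      }

      sub hash_ok {
          for (@_) { return 0 unless exists $hash{$_} }
          return 1;
      }

      # Print both verdicts side by side for a few sample inputs
      for my $input ('aa ab az', 'aa ff', 'aa abc') {
          my @words = split ' ', $input;
          printf "%-10s regex=%d hash=%d\n",
              "'$input'", regex_ok(@words), hash_ok(@words);
      }
      ```

      One difference to keep in mind: `$` can match before a trailing newline, so /^a[a-z]$/ would accept "ab\n" while the hash lookup would not. Using \z in place of $ makes the regex equally strict.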

      doc

      print(s??cod??scalar reverse :p)