Improving Evolutionary Algorithm

I was reading A Beginning Guide To Evolutionary Algorithms the other day, and thought it would make a nice method to solve a oft-quoted puzzle problem.

The Problem:

Imagine a sentence that goes like this: "This sentence containes 1 "0", 3 "1"s, 1 "2", and 3 "3"s." After counting it up, lo and behold! The sentence actually does contain one zero, three ones, one two, and three threes.

The problem then is: can you solve the analogous problem for all 10 numbers? (e.g. This sentence contains _ "0"'s, _ "1"'s ... _ "9"'s). Evolutionary arguments seemed to work very well for this sort of thing, as breeding 2 solutions seems to work as real breeding does, that is, it is somewhere in the middle of the 2 parents on the "fitness" scale.

Most of my code is modeled after the sample code in the node, but I find it to be rather ineffecient. I'm wondering if my fellow monks could critique it on efficiency and general programming style/structure. Thanks!

The Code:

use strict;
use List::Util qw(shuffle);
use Memoize;

$|++;
memoize('fitness');

my $pop_size = 100;
my $string_size = @ARGV ? pop : 9;
my $gen_size = (2 * int ($pop_size / 10)) || 2;
my $generations = 1000;
my $current_generation;
my @parents;
my @children;
my @population = sort { fitness($b) <=> fitness($a) }
                 map  { randlist() } 1 .. $pop_size;

while ( $current_generation++ < $generations 
        and fitness($population[-1]) ) {
    @parents = @population[ select_parents() ];
    
    push @children, crossover( splice(@parents, 0, 2) ) 
          while @parents;
    @population[ select_children() ] = map { mutate($_) } @children;

    @population = sort { fitness($b) <=> fitness($a) } @population;

    cataclysm(\@population) if (rand() < .005);
    printf "\r%*s/%d : %3s" => 
    length $generations, $current_generation, 
    $generations, fitness($population[-1]); 
}

print "\n$population[-1]";
print "\nEND\n";

############

sub randlist {
    my $gene;
    for (0 .. $string_size) {
        $gene .= int rand ($string_size + 1);
    }
    return $gene;
}

sub fitness {
    local $_ = shift;
    my $diff;
    my $count;
    for my $n (0 .. $string_size) {
       $count  = eval "tr/$n/$n/";
       $diff += abs(substr($_,$n,1) - $count - 1);
    }
    return $diff;
}

sub select_children {
    my $total = $pop_size * ($pop_size + 1) / 2;
    my $rand;
    my @children;
    for (1 .. $gen_size) {
        $rand = int rand $total;
        for my $n (reverse(1 .. $pop_size)) {
             if ( ($rand -= $n) < 0 ) {
                 push @children, -$n;
                 last;
             }
        }
    }
    return @children;
}

 sub select_parents {
     my @parents;
     my @children = select_children();
     $parents[$_] = -$children[$_] - 1 for (0 .. $#children);
     return @parents;
}

sub crossover {
    my ($g1, $g2) = @_;
    my ($start, $end) = sort { $a <=> $b } 
                        map { int rand ($string_size + 1) } 
                        1 .. 2;
    (substr( $g1, $start, $end - $start), 
       substr( $g2, $start, $end - $start )) = 
    (substr( $g2, $start, $end - $start), 
       substr ( $g1, $start, $end - $start )); 
    return ($g1, $g2);
}

sub mutate {
    my $gene = shift;
    for my $n ( 0 .. $string_size ) {
        substr( $gene, $n, 1 ) += (int rand 3) - 1 
          if (substr( $gene, $n, 1 ) < $string_size 
          and rand() < .3);
        substr( $gene, $n, 1 ) = 1 if (substr( $gene, $n, 1) < 0);
    }
    return $gene;
}
 
sub cataclysm {
    my $population = shift;
    my @children;
    my @random;
    print "\rCATACLYSM!";
    for (1 .. 3 * ($pop_size / $gen_size)) {
        push @random, randlist() for (1 .. $gen_size);
        @$population[ select_children() ] = @random;
        @$population = sort { fitness($b) <=> fitness($a) } 
                           @$population
    }
}
[download]

Code is (almost) always untested.
http://www.justicepoetic.net/

Comment on Improving Evolutionary Algorithm Download Code

Replies are listed 'Best First'.
Re: Improving Evolutionary Algorithm (pangram) by grinder (Bishop) on Feb 19, 2004 at 19:30 UTC
These puzzles are called pangrams. Douglas Hofstadter wrote about them in his Metamagical Themas column in Scientific American in the early 80s. Someone came up with an algorithm to solve pangrams, and it involves getting the values right letter by letter. First you plug in any old value for evey letter ("This sentence contains one 'a', one 'b'... and one 'z'). The you adjust it so that the 'a' count is correct. Then you adjust it so that the 'a' and 'b' are correct, and so on, until you reach 'z'. Maybe you could adjust your measure of fitness to take this into account. Other than that I don't have any remarks on your code, which looks pretty fine to me. The only thing I'd suggest is that .3 for mutation is very high.	[reply]
Re: Re: Improving Evolutionary Algorithm (pangram) by jweed (Chaplain) on Feb 19, 2004 at 19:47 UTC
That's the weirdest thing - I was just reading Metamagical Themas and I didn't even see them. Ah well, I think he might have put them in GEB as well. In any case, techincally the chance of mutation is .2, since once you "muatate," the number has either -1, 0, or 1 added to it. .3 * 2/3 = .2 Code is (almost) always untested. http://www.justicepoetic.net/	[reply]
Re: Improving Evolutionary Algorithm by delirium (Chaplain) on Feb 19, 2004 at 19:59 UTC
Woo-hoo! More Hofstadter fans! Strangely, the 9 10 digit problem is solvable without using genetic algorithms or any randomization functions. Here's a solution that just starts with each count at 0 and feeds the last counts into the next sentence, evaluates, and repeats until the sentence is truthful: `#!/usr/bin/perl use strict; use warnings; my @list = 0..9; my %hash; $hash{$_} = 0 for @list; my $valid = 0; my $test = ''; while (!$valid) { $valid = 1; $test = 'This sentence contains '; $test .= "$hash{$_} ${_}s, " for @list; $test .= '. - '; print $test; my %test_hash; for (@list) { $test_hash{$_} = eval("\$test =~ tr/$_//"); $valid = 0 if ($hash{$_} != $test_hash{$_}); $hash{$_} = $test_hash{$_}; } ($valid == 1) ? print "yes\n" : print "no\n"; }` [download] I won't spoil this by printing the answer, but it only took 4 iterations to achieve.	[reply] [d/l]
Re: Improving Evolutionary Algorithm by blokhead (Monsignor) on Feb 19, 2004 at 20:09 UTC
I'm glad you liked my EA writeup. This sounds like an interesting problem for EAs, and what you've got looks good. I see you've introduced a cataclysm to reset part of the population. You have it triggered at random, but usually such an event is triggered when fitness is levelling off and becoming very uniform. You seem to have a fine grasp of things, I don't have any specific code comments, but one thing I can suggest is to use some ready-made EA framework instead of implementing all the nitty-gritty population-munging stuff yourself. (shameless plug) A module I wrote called Algorithm::Evolve does this and works pretty well for simple EA fun like this (though I haven't done much development on it lately). Here's what this problem would look like using an ArrayEvolver base class from the examples/ directory: I did have to make a few modifications to your problem like maximizing the fitness measure, since the framework needs to maximize fitness. Read more... (2 kB) The benefit of this is that when the population-munging stuff is factored out, you can now easily modify the parameters of the EA: number of parents, population size, selection & replacement methods. It makes it easier to find combinations that work. This also seems to run a lot faster than yours, though I don't know exactly where the slowdown is in your code. You've at least memoized the fitness function, which is a common bottleneck. As for this specific problem, I don't know as much about it as others who've responded. It might not be stable enough for an EA, or there might be a more direct and efficient way to find solutions -- I wasn't able to reach a solution for N=10 after many attempts. Maybe a huge population and very small fitness bias would help. Or just adding +1/-1 for mutations, instead of just picking a brand new number at a position. Plus, are you even sure that a solution exists for the N=10 case? That would also be worth exploring, before you get discouraged with EAs ;) Finally, to balance out my shameless plug, I try to keep a list of Perl EA resources in my homenode, so if you don't like Algorithm::Evolve, there are several other choices worth looking into. blokhead	[reply] [d/l]
Or AI::GP by jmerelo (Sexton) on Nov 23, 2008 at 18:33 UTC
Which is available from sourceforge	[reply]
Re^2: Improving Evolutionary Algorithm by jmerelo (Sexton) on Nov 23, 2008 at 18:36 UTC
Like Algorithm::Evolutionary, with new version (and paper) coming soon.	[reply]
Re: Improving Evolutionary Algorithm by tilly (Archbishop) on Feb 20, 2004 at 19:27 UTC
See also Solving Meta Sentences. (Coincidentally seen in Selected Best Nodes.)	[reply]
Re: Improving Evolutionary Algorithm (no code?(spoiler)) by BrowserUk (Patriarch) on Feb 20, 2004 at 21:55 UTC
I sat down to try and write a program to solve this and ended up with this: `Starting with ? "0" ? "1" ? "2" ? "3" ? "4" ? "5" ? "6" ? "7" ? "8" ? "9" All the counts must be at least 1 1 "0" 1 "1" 1 "2" 1 "3" 1 "4" 1 "5" 1 "6" 1 "7" 1 "8" 1 "9" which makes the 1-count 11 1 "0" 11 "1" 1 "2" 1 "3" 1 "4" 1 "5" 1 "6" 1 "7" 1 "8" 1 "9" but that means an extra 1 so 1-count becomes 12, which makes the 2-count becomes 2 (which would then become 3 except that we have lost a 1) and the 1-count drops back to 11 1 "0" 11 "1" 2 "2" 1 "3" 1 "4" 1 "5" 1 "6" 1 "7" 1 "8" 1 "9"` [download] And I have a solution. Is it the only one? Examine what is said, not who speaks. "Efficiency is intelligent laziness." -David Dunham "Think for yourself!" - Abigail Timing (and a little luck) are everything!	[reply] [d/l]
Re: Re: Improving Evolutionary Algorithm (no code?(spoiler)) by jweed (Chaplain) on Feb 21, 2004 at 04:08 UTC
Hofstadter's Metamagical Themas contends that there are 2 solutions. Oddly enough, I came across a similar problem in a Lewis Carrol puzzle book. It was phrased as follows: `_____________________ \|0\|1\|2\|3\|4\|5\|6\|7\|8\|9\| --------------------- \|_\|_\|_\|_\|_\|_\|_\|_\|_\|_\|` [download] The trick was to self-document the bottom row with one digit in each box. That is, the top row wasn't included in the count. The reason I bring this up is that your use of 12 seemed to rub me the wrong way (I like the idea of just single digits in each case). Whether such a solution exists for the Hofstadter variant (that is, where the top row is counted as well) is conditional on one of our programs finding the other solution. Code is (almost) always untested. http://www.justicepoetic.net/	[reply] [d/l]