Alright, this is my first attempt at a genetic algorithm written in Perl. Basically what this does is uses a genetic approach at finding a given string ($thestring).

The program accomplishes this by creating an initial population of strings filled with random characters. It then goes through a series of mutation cycles. Right now the only mutation is the char_replace function, that changes random characters that don't match the target string. I'm in the process of writing other mutation functions, like combining two parent strings to form a child.

The code is pretty rough around the edges and there is still a lot of work to be done. I thought I'd throw it out to the monks for some input. Go easy on me, since this is one of my first Perl programs. =]

#! /usr/bin/perl use strict; use Benchmark; my ($thestring, $popsize, $str_len, @chr, $i, $str, @init_pop, @init_p +op_fitness, $max_fit); # GA To match the string: $thestring = "the quick brown dog jumped over the lazy fox"; # step one, create an inital population of strings the same size $popsize = 1000; $str_len = length($thestring); # put list of valid characters into @chr @chr = qw(a b c d e f g h i j k l m n o p q r s t u v w x y z); @chr = (@chr, " "); my $best_match = 0; # fill each string with random character data for ($i = 0; $i < $popsize; $i++) { $str = join '', map $chr[rand @chr], 1..$str_len; # store the string and it's fitness $init_pop[$i][0] = $str; $init_pop[$i][1] = fitness($str); # keep track of the best match for later if ($init_pop[$i][1] > $max_fit) { $best_match = $str; $max_fit = $init_pop[$i][1]; } } # now, sort the population by fitness (desc) @init_pop = sort { $b->[1] <=> $a->[1] } @init_pop; # STEP 2 - MUTATION my $keep_percent = .25; # percent of population to keep $popsize = $popsize * $keep_percent; my $num_cycles = $str_len * 4; # number of mutation cycles to r +un my $j; my $t0 = new Benchmark; for ($i = 0; $i < $num_cycles; $i++) { for ($j = 0; $j < $popsize; $j++) { $init_pop[$j][0] = char_replace($init_pop[$j][0]); $init_pop[$j][1] = fitness($init_pop[$j][0]); } } my $t1 = new Benchmark; my $td = timediff($t1, $t0); @init_pop = sort { $b->[1] <=> $a->[1] } @init_pop; print("\nCompleted ", $num_cycles, " mutation cycles.\n"); print("Elapsed mutation time: ", timestr($td), "\n"); # count exact matches my $total_matches = 0; for ($i = 0; $i < $popsize; $i++) { if ($init_pop[$i][1] == $str_len) { $total_matches++; } } print("Total Matches: ", $total_matches, "\n"); # function to calculate the fitness of a given string, # that is how many characters match $thestring sub fitness { my $test_str = $_[0]; my ($j, $fit); $fit = 0; for ($j = 0; $j < $str_len; $j++) { if (substr($test_str, $j, 1) eq substr($thestring, $j, 1)) { + $fit++; } } return $fit; } # function that replaces a random number of bad characters with new on +es sub char_replace { my $in = $_[0]; my $fit = fitness($in); my $percent_to_change = rand(); my $i; for ($i = 0; $i < $str_len; $i++) { if ((rand() < $percent_to_change) && (substr($in, $i, 1) ne su +bstr($thestring, $i, 1))) { substr($in, $i, 1) = join '', map $chr[rand @chr], 1; } } return $in; }

Replies are listed 'Best First'.
Re: Genetic Algorithms
by Chmrr (Vicar) on Feb 25, 2002 at 06:31 UTC

    From the GA standpoint, I should point out that you're "cheating" -- both cheating yourself, and the underlying algorithm. There are two problems: first-off, you're not allowing your best individuals to mutate, which is cheating yourself; that is, your better solutions don't get the benefit of genetic mutation. Secondly, you're using the knowledge that none of your "genes" (letters) are related in your mutation criteria, and just mutating those which are "wrong." This makes the process far less random, nearly to the point of eliminating any and all "genetics" from the process. Said another way, your above algorithm will probably go faster with a population size of one, and running for a thousand generations, than with 1000 individuals for one generation. On a similar note, you also only seem to be running one generation.

    Those are my nitpicks from my point of view as a meddler in genetic programming. From a perl point of view, you've good solid code. Some of your lines are vaguely C-ish (use of parens in print("foo")), as are your loops, but that's a stylistic thing. You're not using warnings, which might be less forgivable, though. :) In your char_replace function, you're using map when you only need to generate one character -- it looks strange. You also re-evaluate the fitness there, but don't use it.

    Update: I now see where you're re-running the generations. I'll point out that the way you're currently going about it is strange -- it means that only the first generation gets the percent that passes through unmodified -- a strange thing to do. I suppose what it boils down to is that I simply don't understand the meta-code that you're using to run your GA. It doesn't agree with the model that I understand, that's all.

    perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

      Thanks much for you input, I will take all of that into consideration. As far as GA correctness, your very right in stating that this program is questionable, since this is it's first generation (pun intended ;).

      I still have a long way to go in understanding the GA algorithm, and even further as far as implementation.

      Thanks again, and let me know if you have other suggestions.

Re: Genetic Algorithms
by Anonymous Monk on Mar 08, 2002 at 02:14 UTC
    If you're interested in this sort of stuff, check out this.

    It says it's OO Perl, and there looks like there's something that does this. I haven't looked at the Benchmarks, but the tutorials look quite interesting...