zli034 has asked for the wisdom of the Perl Monks concerning the following question:

Guys:

This what I got so far for a assignment. I think I used too much worng referencing, this code functions oddly. The fitness of a gene is a percentage, it should maintain highest gene of percentage, but it doesn't do that. Because the gene is an array of 2 arrays. genotype()subroutine returns gene as a reference \@seq; then it evaluate fitness, mutate and crossover by the reference.

But the best fit gene $population\49\ is always replaced by new mutants. This code is partly based on a post of this forum from another monk, I have to acknowledge that.

blokhead: http://www.perlmonks.org/?node_id=298877

Thank you for all your valuable time to read this code. Thank you guys.

#ea Version3 use strict; use List::Util qw/shuffle/; my $ca_length = 8;#size of my genotype, . my $pop_size = 50; my @data_table; # To store inputted data, my @indices = ( 76,46,21,15,33,20,69,65,40,11,79,22, 70,43,32,30,57,19,39,37,62,3,4,44,51, 58,25,72,64,59,48,80,42,38,61,75,49,1, 18,71,41,2,67,13,45,60,9,7,28,6,8,54, 50,73,56,53,77,27,16,29,52,10,24,5,26, 36,23,34,31,47,14,35,12,55,17,0,68,74, 66,78,63); #number of the columns. my $mutation_rate = 0.2; input(); open(FILEOUT, ">>output.txt"); #numbers table which only has 0 and 1. my @population = sort { fitness($a) <=> fitness($b) } map { genotype() } 1 .. $pop_size; my $generations; while ( $generations++ < 1000 and fitness($population[49]) < 100 ) { my @parents = @population[40 .. 49]; shuffle @parents; my @children; push @children, crossover( pop @parents, pop @parents ) while @parents; @population[0 .. 4] = map { mutate($_) } @children; @population = sort { fitness($a) <=> fitness($b) } @population; for my $i (0..$#population){ print fitness($population[$i])." : $i\n"; } printf "The best fit after %d generations is: %g\n", $generations, + fitness($population[49]); print "The least fit : ".fitness($population[0]); #printf "Average fitness after %d generations is: %g\n", #$generations, (sum map { fitness($_) } @population)/@populati +on; } #foreach my $element (@data_index){ #@population # print $element."\n"; #} print FILEOUT $population[49]."\n"; close FILEOUT; ##### sub fitness { my $ref = shift; my @ca_rule = @{${$ref}[1]}; my @data_indices = @{${$ref}[0]}; my $wrong_count=0; my $right_count=0; my $predict=''; for my $i ( 1 .. $#data_table ) { #$aref = $data_table[$i]; #print $aref; for my $j ( 1 .. $#data_indices-1 ) {#row index my $predict=$ca_rule[bin2dec($data_table[$i-1][$data_indic +es[$j-1]].$data_table[$i-1][$data_indices[$j]].$data_table[$i-1][$dat +a_indices[$j+1]])]; if ($predict ne $data_table[$i][$data_indices[$j]]){ $wrong_count++; }else{ $right_count++; } #print "elt $i $j is $data_table[$i][$j]\n"; } } #my $result = ($right_count)/($right_count+$wrong_count -1)*100; #print "Right: $right_count \n"; #print "Wrong: $wrong_count \n"; #print $result."%\n"; return ($right_count)/($right_count+$wrong_count -1)*100; } sub crossover{ my $temp = ${$_[0]}[0]; ${$_[0]}[0] = ${$_[1]}[0]; ${$_[1]}[0] = $temp; return ($_[0],$_[1]); } sub mutate{ my @string = @{$_[0]}; for my $i (0.. $#string){ my $aref = $string[$i]; #print "\n"; if (rand() < $mutation_rate){ my $position1 = int(rand($#{$aref})); my $position2 = int(rand($#{$aref})); if ($position1 ne $position2){ my $temp = $string[$i][$position1]; $string[$i][$position1] = $string[$i][$position2]; $string[$i][$position2] = $temp; } } } return \@string; } sub genotype { my @seq; my @gene; my @string; push @gene, shuffle @indices ; for (1..$ca_length){ $string[$_] = rand() > 0.5 ? 0 : 1 ; } push @seq, [@gene]; push @seq, [@string]; return \@seq;#return a reference of the array. } sub input{ ###### read data table into arrays################## open(FILEIN, "table.txt"); #numbers table which only has 0 and 1. while(<FILEIN>){#reads line by line from FILE which is the filehan +dle for data.txt chomp; my @data_line = split //, $_; push @data_table, [@data_line]; } close FILEIN; } sub bin2dec{ return oct"0b$_[0]"} sub dec2bin {return sprintf "%b",shift }

Replies are listed 'Best First'.
Re: Problem of referencing into subroutines.
by jwkrahn (Abbot) on Feb 18, 2008 at 23:46 UTC
    sub mutate{ my @string = @{$_[0]}; for my $i (0.. $#string){ my $aref = $string[$i]; #print "\n"; if (rand() < $mutation_rate){ my $position1 = int(rand($#{$aref})); my $position2 = int(rand($#{$aref})); if ($position1 ne $position2){ my $temp = $string[$i][$position1]; $string[$i][$position1] = $string[$i][$position2]; $string[$i][$position2] = $temp; } } } return \@string; }

    You have an off-by-one error. If @$aref contains 10 elements it will have the indices 0 to 9 and int(rand($#{$aref})) will only pick a number in the range 0 to 8.

    Also, perl has a more efficient idiom for swapping values:

    sub mutate{ my @string = @{ $_[ 0 ] }; for my $i ( 0 .. $#string ){ my $aref = $string[ $i ]; #print "\n"; if ( rand() < $mutation_rate ){ my $position1 = int rand @$aref; my $position2 = int rand @$aref; if ( $position1 != $position2 ){ @{ $string[ $i ] }[ $position1, $position2 ] = @{ $str +ing[ $i ] }[ $position2, $position1 ]; } } } return \@string; }

    Update: Also you used the wrong comparison operator. Changed ne to !=.

Re: Problem of referencing into subroutines.
by kyle (Abbot) on Feb 18, 2008 at 22:30 UTC

    I'll start by saying that the reference dancing here makes my eyes cross, so take my remarks with some salt. Also, since I don't have your input file or a good idea of what this is to produce, I can't really test. That said, look here...

    my @parents = @population[40 .. 49]; shuffle @parents; my @children; push @children, crossover( pop @parents, pop @parents ) while @parents; @population[0 .. 4] = map { mutate($_) } @children;

    The first thing I notice is that crossover returns two things, so @children will have the same length as @parents, but you're discarding half of them when you put them back into @population. I can't tell if that's deliberate or not, but it easily could be.

    What I don't think is deliberate is that mutate is operating directly on the items it gets.

    You copy into @string, which is good, and then pass out a reference to it, which is also good, but in the meantime you do this:

    my $temp = $string[$i][$position1]; $string[$i][$position1] = $string[$i][$position2]; $string[$i][$position2] = $temp;

    I think $string[$i] is a copy and so safe to modify, but $string[$i][$position1] is a reference that also exists in the original array. What you pass in are references from @children, but those also come directly from @parents.

    To fix this, it might help to make a deep copy of the data structures involved before you operate on them. These might help with that:

    Again, I'm not sure that's the problem, but it might be worth looking into.

Re: Problem of referencing into subroutines.
by hipowls (Curate) on Feb 18, 2008 at 22:27 UTC

    This is not an answer but I am puzzled by your use of @indices. It contains the integers from 0 to 80 and is used in only one function

    my @indices = ( 76, 46, 21, 15, 33, 20, 69, 65, 40, 11, 79, 22, 70, 43, 32, 30, 57 +, 19, 39, 37, 62, 3, 4, 44, 51, 58, 25, 72, 64, 59, 48, 80, 42, 38, 61 +, 75, 49, 1, 18, 71, 41, 2, 67, 13, 45, 60, 9, 7, 28, 6, 8, 54, 50 +, 73, 56, 53, 77, 27, 16, 29, 52, 10, 24, 5, 26, 36, 23, 34, 31, 47, 14 +, 35, 12, 55, 17, 0, 68, 74, 66, 78, 63 ); #number of the columns. ... sub genotype { my @seq; my @gene; my @string; push @gene, shuffle @indices; ... }
    this can be rewritten as
    my $max_index = 80; ... sub genotype { my @seq; my @gene = shuffle 0 .. $max_index; my @string; ... }
    which makes your intent clearer and avoids the possiblity of missing an integer.