Problem of referencing into subroutines.

zli034 has asked for the wisdom of the Perl Monks concerning the following question:

Guys:

This what I got so far for a assignment. I think I used too much worng referencing, this code functions oddly. The fitness of a gene is a percentage, it should maintain highest gene of percentage, but it doesn't do that. Because the gene is an array of 2 arrays. genotype()subroutine returns gene as a reference \@seq; then it evaluate fitness, mutate and crossover by the reference.

But the best fit gene $population\49\ is always replaced by new mutants. This code is partly based on a post of this forum from another monk, I have to acknowledge that.

blokhead: http://www.perlmonks.org/?node_id=298877

Thank you for all your valuable time to read this code. Thank you guys.

#ea Version3

use strict;
use List::Util qw/shuffle/;

my $ca_length = 8;#size of my genotype, . 
my $pop_size   = 50;
my @data_table; # To store inputted data, 
my @indices = (    76,46,21,15,33,20,69,65,40,11,79,22,
                70,43,32,30,57,19,39,37,62,3,4,44,51,
                58,25,72,64,59,48,80,42,38,61,75,49,1,
                18,71,41,2,67,13,45,60,9,7,28,6,8,54,
                50,73,56,53,77,27,16,29,52,10,24,5,26,
                36,23,34,31,47,14,35,12,55,17,0,68,74,
                66,78,63); #number of the columns.
my $mutation_rate = 0.2;

input();
open(FILEOUT, ">>output.txt"); #numbers table which only has 0 and 1.


my @population = sort { fitness($a) <=> fitness($b) }
                 map  { genotype() } 1 .. $pop_size;

my $generations;
while ( $generations++ < 1000 and fitness($population[49]) < 100 ) {
    my @parents = @population[40 .. 49];
    shuffle @parents;

    my @children;
    push @children, crossover( pop @parents, pop @parents )
        while @parents;

    @population[0 .. 4] = map { mutate($_) } @children;

    @population = sort { fitness($a) <=> fitness($b) } @population;
    for my $i (0..$#population){
        print fitness($population[$i])." : $i\n";
    }
    printf "The best fit after %d generations is: %g\n", $generations,
+ fitness($population[49]);
    print "The least fit : ".fitness($population[0]);
    #printf "Average fitness after %d generations is: %g\n",
        #$generations, (sum map { fitness($_) } @population)/@populati
+on;
}

#foreach my $element (@data_index){ #@population
#    print $element."\n";
#}

print FILEOUT $population[49]."\n";
close FILEOUT;
#####

sub fitness {
    my $ref = shift;


    my @ca_rule = @{${$ref}[1]};
    my @data_indices = @{${$ref}[0]};
    my $wrong_count=0;
    my $right_count=0;
    my $predict='';        

    for my $i ( 1 .. $#data_table ) {
        #$aref = $data_table[$i];
        #print $aref;
        for my $j ( 1 .. $#data_indices-1 ) {#row index
        
            my $predict=$ca_rule[bin2dec($data_table[$i-1][$data_indic
+es[$j-1]].$data_table[$i-1][$data_indices[$j]].$data_table[$i-1][$dat
+a_indices[$j+1]])];

            if ($predict ne $data_table[$i][$data_indices[$j]]){
                $wrong_count++;
            }else{
                $right_count++;
            }        
            #print "elt $i $j is $data_table[$i][$j]\n";
        }
    }
    
    #my $result = ($right_count)/($right_count+$wrong_count -1)*100;
    #print "Right: $right_count \n";
    #print "Wrong: $wrong_count \n";
    #print $result."%\n";
    return ($right_count)/($right_count+$wrong_count -1)*100;
}

sub crossover{
        my $temp = ${$_[0]}[0];
        ${$_[0]}[0] = ${$_[1]}[0];
        ${$_[1]}[0] = $temp;              
        return ($_[0],$_[1]);    
}

sub mutate{
    my @string = @{$_[0]};
    for my $i (0.. $#string){
        my $aref = $string[$i];
        #print "\n";
        if (rand() < $mutation_rate){
            my $position1 = int(rand($#{$aref}));
            my $position2 = int(rand($#{$aref}));
            if ($position1 ne $position2){
                my $temp = $string[$i][$position1];
                $string[$i][$position1] = $string[$i][$position2];
                $string[$i][$position2] = $temp;
            }        
        }
    }
    return \@string;
}    

sub genotype {
    my @seq;
    my @gene;
    my @string;
    push @gene, shuffle @indices ;
    for (1..$ca_length){
        $string[$_] = rand() > 0.5 ? 0 : 1 ;
    }
    push @seq, [@gene];
    push @seq, [@string];
    return \@seq;#return a reference of the array.
}

sub input{
    ###### read data table into arrays##################
    open(FILEIN, "table.txt"); #numbers table which only has 0 and 1.
    while(<FILEIN>){#reads line by line from FILE which is the filehan
+dle for data.txt
        chomp;
           my @data_line = split //, $_;
           push @data_table, [@data_line];
    }
    close FILEIN;
}

sub bin2dec{ return oct"0b$_[0]"}
sub dec2bin {return sprintf "%b",shift }
[download]

Comment on Problem of referencing into subroutines. Download Code

Replies are listed 'Best First'.
Re: Problem of referencing into subroutines. by jwkrahn (Abbot) on Feb 18, 2008 at 23:46 UTC
`sub mutate{ my @string = @{$_[0]}; for my $i (0.. $#string){ my $aref = $string[$i]; #print "\n"; if (rand() < $mutation_rate){ my $position1 = int(rand($#{$aref})); my $position2 = int(rand($#{$aref})); if ($position1 ne $position2){ my $temp = $string[$i][$position1]; $string[$i][$position1] = $string[$i][$position2]; $string[$i][$position2] = $temp; } } } return \@string; }` [download] You have an off-by-one error. If `@$aref` contains 10 elements it will have the indices 0 to 9 and `int(rand($#{$aref}))` will only pick a number in the range 0 to 8. Also, perl has a more efficient idiom for swapping values: `sub mutate{ my @string = @{ $_[ 0 ] }; for my $i ( 0 .. $#string ){ my $aref = $string[ $i ]; #print "\n"; if ( rand() < $mutation_rate ){ my $position1 = int rand @$aref; my $position2 = int rand @$aref; if ( $position1 != $position2 ){ @{ $string[ $i ] }[ $position1, $position2 ] = @{ $str +ing[ $i ] }[ $position2, $position1 ]; } } } return \@string; }` [download] Update: Also you used the wrong comparison operator. Changed `ne` to `!=`.	[reply] [d/l] [select]
Re: Problem of referencing into subroutines. by kyle (Abbot) on Feb 18, 2008 at 22:30 UTC
I'll start by saying that the reference dancing here makes my eyes cross, so take my remarks with some salt. Also, since I don't have your input file or a good idea of what this is to produce, I can't really test. That said, look here... `my @parents = @population[40 .. 49]; shuffle @parents; my @children; push @children, crossover( pop @parents, pop @parents ) while @parents; @population[0 .. 4] = map { mutate($_) } @children;` [download] The first thing I notice is that `crossover` returns two things, so `@children` will have the same length as `@parents`, but you're discarding half of them when you put them back into `@population`. I can't tell if that's deliberate or not, but it easily could be. What I don't think is deliberate is that `mutate` is operating directly on the items it gets. Read more... (831 Bytes) You copy into `@string`, which is good, and then pass out a reference to it, which is also good, but in the meantime you do this: `my $temp = $string[$i][$position1]; $string[$i][$position1] = $string[$i][$position2]; $string[$i][$position2] = $temp;` [download] I think `$string[$i]` is a copy and so safe to modify, but `$string[$i][$position1]` is a reference that also exists in the original array. What you pass in are references from `@children`, but those also come directly from `@parents`. To fix this, it might help to make a deep copy of the data structures involved before you operate on them. These might help with that: deep copy of nested data structure deepcopy perlfaq4: How do I print out or copy a recursive data structure? Again, I'm not sure that's the problem, but it might be worth looking into.	[reply] [d/l] [select]
Re: Problem of referencing into subroutines. by hipowls (Curate) on Feb 18, 2008 at 22:27 UTC
This is not an answer but I am puzzled by your use of @indices. It contains the integers from 0 to 80 and is used in only one function `my @indices = ( 76, 46, 21, 15, 33, 20, 69, 65, 40, 11, 79, 22, 70, 43, 32, 30, 57 +, 19, 39, 37, 62, 3, 4, 44, 51, 58, 25, 72, 64, 59, 48, 80, 42, 38, 61 +, 75, 49, 1, 18, 71, 41, 2, 67, 13, 45, 60, 9, 7, 28, 6, 8, 54, 50 +, 73, 56, 53, 77, 27, 16, 29, 52, 10, 24, 5, 26, 36, 23, 34, 31, 47, 14 +, 35, 12, 55, 17, 0, 68, 74, 66, 78, 63 ); #number of the columns. ... sub genotype { my @seq; my @gene; my @string; push @gene, shuffle @indices; ... }` [download] this can be rewritten as `my $max_index = 80; ... sub genotype { my @seq; my @gene = shuffle 0 .. $max_index; my @string; ... }` [download] which makes your intent clearer and avoids the possiblity of missing an integer.	[reply] [d/l] [select]