in reply to optimization help

#1: I really doubt this is your problem. Perl's regex engine contains very smart optimizations for doing this kind of match.

#2: How big is "the whole code"? I'm curious about what eval() and evalTree() do. When you say this sub has 95% of the runtime is that inclusive of these sub-calls or exclusive? If it's inclusive then your problem could very well actually be due to what happens in them...

UPDATE: Looking closer, I'm really curious about that eval() call. As it's written it doesn't make sense - Perl won't let you call a builtin with that syntax:

$ perl -e 'eval->(print "foo\n")' foo Undefined subroutine &main:: called at -e line 1.

That made me think you must not be using the real Perl eval() builtin. But now I wonder if you made a typo and your code actually is using the eval() builtin. If that's the case then you can stop searching, I know exactly why your code is so slow! String eval() is very slow in Perl - it's actually compiling your code each time it's run!

UPDATE 2: My mistake, Perl is apparently ok with that syntax for calling string eval():

$ perl -e 'eval->(qq{print "foo\n"})' foo

-sam

Replies are listed 'Best First'.
Re^2: optimization help
by GSF (Acolyte) on Oct 15, 2007 at 22:46 UTC
    It is about this much longer:
    #!/usr/local/bin/perl -w # Based on ideas and code by bjm@paralynx.com # (Brad Murray), posted to perl-ai list June 3 1999. use strict; use Time::HiRes 'time'; require 'dumpvar.pl'; no strict "refs"; my @functions = ( 'sub { # sin(x) my $in = shift; return sin($in); }', 'sub { # add(x,y) my ($left,$right) = @_; return ($left + $right); }', 'sub { # mul(x,y) my ($left,$right) = @_; return ($left * $right); }', 'sub { # log(x) my $in = shift; return ($in == 0) ? 0 : log(abs($in)); }', ); my @terminals = ( \&identity, \&constant ); my $x; my %tree; my $maxdepth = 5; my $pop = 200; my $mates = 10; my $gen = 3; # syntax tree is a set of hashes: # # %node{left} = $nodereference or undef # %node{right} = $nodereference or undef # %node{contents} = $funcref or 'x' or constant # # sin(0.231 * log($x)) # # sin(multiply(0.231,log($x,0)),0) # # undef # sin---multiply----0.231< # | | undef undef # 0 +--------log----------$x< # | undef # 0 # # # get target data set my $inline = <DATA>; my @target = split /,/, $inline; # build a starting population of $pop my @world = map {{organism => new Organism}} 0..$pop; my $count = 0; for (1..$gen) { print "\nGENERATION ", ++$count, "\n"; foreach (@world) { $_->{fitness} = &fitness($_->{organism}{tree},@target); $_->{as_string} = $_->{organism}->as_string; } @world = sort { $a->{fitness} <=> $b->{fitness} or length $a->{as_string} <=> length $b->{as_string} } @world; foreach (@world) { print "=== Fitness: $_->{fitness} ===\n"; print $_->{as_string}, "\n"; } # Kill off some organisms, mate some others to take their spots for (0..($mates-1)) { pop @world; my ($org1, $org2) = ($world[$_]{organism}, $world[rand @world]{organism}); my $child = $org1->mate($org2); unshift @world, {organism => $child}; print "Created '@{[$child->as_string]}' from " . "'@{[$org1->as_string]}' and '@{[$org2->as_string]}'\n"; } # mutate one organism at random my $to_mutate = $world[rand @world]{organism}; print "Mutated '@{[$to_mutate->as_string]}' into "; $to_mutate->mutate; print "'@{[$to_mutate->as_string]}'\n"; } # ------------ subs sub fitness { my ($org, @target) = @_; my $sumdiff = 0; my %p; my $starttime; my $endtime; foreach (0..$#target) { $starttime = time(); $p{'x'} = $_; my $value = &evalTree($org,\%p); $sumdiff += abs($value - $target[$_]); $endtime = time(); $sumdiff += ($endtime - $starttime); } return $sumdiff/(scalar @target); } sub evalTree { my $intree = shift; my $param = shift; local $_ = $intree->{contents}; # A shortcut for the current node return 0 unless defined; if (/^sub /) { return eval->( evalTree($intree->{left},$param), evalTree($intree->{right},$param) ); } elsif (/^[A-Za-z]$/) { return $param->{$_}; } else { return $_; } } # ------------ terminal subs sub identity { return 'x'; } sub constant { return int(rand(20) - 10); } ###################################################### package Organism; use Storable qw(dclone); sub new { my $package = shift; my $self = {tree => $package->_buildTree() }; return bless $self, $package; } sub _buildTree { my $self = shift; my $depth = shift || 0; my %tree; if ($depth > $maxdepth) { $tree{contents} = &{$terminals[rand @terminals]}; } else { $tree{contents} = (int rand(3) ? $functions[rand @functions] : &{$terminals[rand @terminals]} ); } if ($tree{contents} =~ /^sub /) { $tree{left} = $self->_buildTree($depth + 1); $tree{right} = $self->_buildTree($depth + 1); } return \%tree; } sub mate { my ($self, $partner) = @_; my $self_clone = dclone($self); my $partner_clone = dclone($partner); # Get a node from $partner and stick it somewhere in $self_clone my @clone_index = $self_clone->_treeIndex; my @partner_index = $partner_clone->_treeIndex; %{ $clone_index[rand @clone_index] } = %{ $partner_index[rand @partner_index]}; return $self_clone; } sub mutate { my $self = shift; my @index = $self->_treeIndex; %{ $index[rand @index] } = %{ $self->_buildTree($maxdepth-1) }; } sub _treeIndex { # Generates a list of all the nodes in this tree. These are # references to the stuff in the object itself, so changes to the # elements of this list will change the object. my $self = shift; my $tree = shift || $self->{tree}; my @sofar = @_; # Dump the content nodes into a list if ($tree->{contents} =~ /^sub /) { return(@sofar, $tree, $self->_treeIndex($tree->{left}, @sofar), $self->_treeIndex($tree->{right}, @sofar) ); } else { return(@sofar, $tree); } } sub as_string { my $self = shift; my $tree = shift || $self->{tree}; if ($tree->{contents} =~ /^sub \{ \# (\w+)/) { return "$1(" . $self->as_string($tree->{left}) . ',' . $self->as_string($tree->{right}) . ')'; } else { return $tree->{contents}; } } #Total Elapsed Time = 5.147911 Seconds # User+System Time = 4.886911 Seconds #Exclusive Times #%Time ExclSec CumulS #Calls sec/call Csec/c Name # 97.4 4.760 4.973 153323 0.0000 0.0000 main::evalTree # 11.4 0.560 5.580 603 0.0009 0.0093 main::fitness # 4.36 0.213 0.213 71536 0.0000 0.0000 main::__ANON__ # 1.64 0.080 0.080 9733 0.0000 0.0000 Organism::as_string # 0.96 0.047 0.047 20502 0.0000 0.0000 Time::HiRes::time # 0.63 0.031 0.031 7 0.0044 0.0044 IO::File::BEGIN # 0.33 0.016 0.016 6 0.0027 0.0026 ActiveState::Path::BEGIN # 0.33 0.016 0.047 4 0.0040 0.0116 main::BEGIN # 0.31 0.015 0.015 60 0.0002 0.0002 Storable::dclone # 0.31 0.015 0.015 843 0.0000 0.0000 main::constant # 0.29 0.014 0.014 858 0.0000 0.0000 main::identity # 0.08 0.004 0.032 3198 0.0000 0.0000 Organism::_buildTree # 0.00 0.000 0.000 1 0.0000 0.0000 Config::launcher # 0.00 0.000 0.000 1 0.0000 0.0000 Config::fetch_string # 0.00 0.000 0.000 1 0.0000 0.0000 Exporter::Heavy::heavy_e +xport_to_l # evel __END__ -1407,-931,-577,-327,-163,-67,-21,-7,-7,-3,23,89,213,413,707,1113,16 +49

      Applying what I said in Re: optimization help,

      my @functions = ( sub { # sin(x) my $in = shift; return sin($in); }, sub { # add(x,y) my ($left,$right) = @_; return ($left + $right); }, sub { # mul(x,y) my ($left,$right) = @_; return ($left * $right); }, sub { # log(x) my $in = shift; return ($in == 0) ? 0 : log(abs($in)); }, ); ... sub evalTree { my $intree = shift; my $param = shift; # A shortcut for the current node. for my $contents ($intree->{contents}) { return 0 unless defined $contents; if (ref($contents) eq 'CODE') { return $contents->( evalTree($intree->{left}, $param), evalTree($intree->{right}, $param), ); } if ($contents =~ /^[A-Za-z]$/) { return $param->{$_}; } return $contents; } }

      If you need both the string and the compiled sub, save both instead of recompiling the strings over and over again.

        And how will you code the mutate step? With the B:: modules perhaps?

        UPDATE: Whoops, never mind me. I assumed it was doing some kind of textual code generation for mutate(). You're absolutely right, there's no reason not to just use sub-refs here.

        -sam

        More questions for those who are patient without limits: Refrencing a scalar as in ref($contents) when evaluated as a string, will return 'CODE' if it points to a subroutine?

        Part of my problem is my reasoning tells me I can't precompile all code because this is evolutionary code - there is no way for the compiler to know what it is going to do before it does it.

        However, we are no longer evaling, we are refrencing subs recursively. This may complicate structure parsing for reporting but maybe . . . . maybe I'm just resisting the shortest path. Thanks for the example, I'll take a crack at the rest of this....

        Thanks again.

Re^2: optimization help
by GSF (Acolyte) on Oct 15, 2007 at 22:59 UTC
    Thanks for looking closer. If you want to see what kinda amazing stuff this does turn generations up to 100, or 1000 even.

    I thought Eval was computational complexity 0 - which is a unique feature of Perl. At least that what I remember from a Perl calls in Boulder where I met Tom.

    It thinks. :)

      Perl's eval() definitely is not a constant-time operation. It definitely varies with the length and complexity of its input. I'd imagine it's at least O(n) with n being the length of the input string, but it's probably O(n*m*j*k*l) with the definition of m, j, k and l left as an exercise for the reader.

      -sam

        Alright I must have misunderstood something, which is par for the course when I'm involved.

        Thanks for the info Sam.

      Oh yeah, finally and perhaps most importantly, its not my code.