optimization help

GSF has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: optimization help by ikegami (Patriarch) on Oct 15, 2007 at 23:00 UTC
`substr`, and `eq` (lexical comparison) instead of `=` (assignment). `if (substr($_, 0, 4) eq "sub ")` [download] `local $_ = $intree->{contents};` will copy the string. Creating an alias would be better. `for ($intree->{contents})` [download] However, I don't really see the use of `$_`. You're simply obfuscating your code. Perhaps you could alias the expression to something shorter yet meaninful. `for my $contents ($intree->{contents})` [download] As for the question, `eval EXPR` parses and generates an opcode tree from the supplied source code. That's inherently slow. If the subs already exist, you'd be better off passing a reference to the sub rather than the source of the sub. `if (ref($contents) eq 'CODE') { return $contents->( evalTree($intree->{left}, $param), evalTree($intree->{right}, $param), ); }` [download] You can get a reference to an existing sub as follows: `sub func { ... } my $ref = \&func;` [download] Or you can create a reference to an anonymous sub as follows: `my $ref = sub { ... };` [download]	[reply] [d/l] [select]
Re^2: optimization help by GSF (Acolyte) on Oct 16, 2007 at 00:30 UTC
Thank you very much. It makes my head hurt reading it but... I will keep cracking at at. I don't feel so directionless anymore :)	[reply]
Re: optimization help by samtregar (Abbot) on Oct 15, 2007 at 22:42 UTC
#1: I really doubt this is your problem. Perl's regex engine contains very smart optimizations for doing this kind of match. #2: How big is "the whole code"? I'm curious about what eval() and evalTree() do. When you say this sub has 95% of the runtime is that inclusive of these sub-calls or exclusive? If it's inclusive then your problem could very well actually be due to what happens in them... UPDATE: Looking closer, I'm really curious about that eval() call. As it's written it doesn't make sense - Perl won't let you call a builtin with that syntax: `$ perl -e 'eval->(print "foo\n")' foo Undefined subroutine &main:: called at -e line 1.` [download] That made me think you must not be using the real Perl eval() builtin. But now I wonder if you made a typo and your code actually is using the eval() builtin. If that's the case then you can stop searching, I know exactly why your code is so slow! String eval() is very slow in Perl - it's actually compiling your code each time it's run! UPDATE 2: My mistake, Perl is apparently ok with that syntax for calling string eval(): `$ perl -e 'eval->(qq{print "foo\n"})' foo` [download] -sam	[reply] [d/l] [select]
Re^2: optimization help by GSF (Acolyte) on Oct 15, 2007 at 22:46 UTC
It is about this much longer: #!/usr/local/bin/perl -w # Based on ideas and code by bjm@paralynx.com # (Brad Murray), posted to perl-ai list June 3 1999. use strict; use Time::HiRes 'time'; require 'dumpvar.pl'; no strict "refs"; my @functions = ( 'sub { # sin(x) my $in = shift; return sin($in); }', 'sub { # add(x,y) my ($left,$right) = @_; return ($left + $right); }', 'sub { # mul(x,y) my ($left,$right) = @_; return ($left * $right); }', 'sub { # log(x) my $in = shift; return ($in == 0) ? 0 : log(abs($in)); }', ); my @terminals = ( \&identity, \&constant ); my $x; my %tree; my $maxdepth = 5; my $pop = 200; my $mates = 10; my $gen = 3; # syntax tree is a set of hashes: # # %node{left} = $nodereference or undef # %node{right} = $nodereference or undef # %node{contents} = $funcref or 'x' or constant # # sin(0.231 * log($x)) # # sin(multiply(0.231,log($x,0)),0) # # undef # sin---multiply----0.231< # \| \| undef undef # 0 +--------log----------$x< # \| undef # 0 # # # get target data set my $inline = <DATA>; my @target = split /,/, $inline; # build a starting population of $pop my @world = map {{organism => new Organism}} 0..$pop; my $count = 0; for (1..$gen) { print "\nGENERATION ", ++$count, "\n"; foreach (@world) { $_->{fitness} = &fitness($_->{organism}{tree},@target); $_->{as_string} = $_->{organism}->as_string; } @world = sort { $a->{fitness} <=> $b->{fitness} or length $a->{as_string} <=> length $b->{as_string} } @world; foreach (@world) { print "=== Fitness: $_->{fitness} ===\n"; print $_->{as_string}, "\n"; } # Kill off some organisms, mate some others to take their spots for (0..($mates-1)) { pop @world; my ($org1, $org2) = ($world[$_]{organism}, $world[rand @world]{organism}); my $child = $org1->mate($org2); unshift @world, {organism => $child}; print "Created '@{[$child->as_string]}' from " . "'@{[$org1->as_string]}' and '@{[$org2->as_string]}'\n"; } # mutate one organism at random my $to_mutate = $world[rand @world]{organism}; print "Mutated '@{[$to_mutate->as_string]}' into "; $to_mutate->mutate; print "'@{[$to_mutate->as_string]}'\n"; } # ------------ subs sub fitness { my ($org, @target) = @_; my $sumdiff = 0; my %p; my $starttime; my $endtime; foreach (0..$#target) { $starttime = time(); $p{'x'} = $_; my $value = &evalTree($org,\%p); $sumdiff += abs($value - $target[$_]); $endtime = time(); $sumdiff += ($endtime - $starttime); } return $sumdiff/(scalar @target); } sub evalTree { my $intree = shift; my $param = shift; local $_ = $intree->{contents}; # A shortcut for the current node return 0 unless defined; if (/^sub /) { return eval->( evalTree($intree->{left},$param), evalTree($intree->{right},$param) ); } elsif (/^[A-Za-z]$/) { return $param->{$_}; } else { return $_; } } # ------------ terminal subs sub identity { return 'x'; } sub constant { return int(rand(20) - 10); } ###################################################### package Organism; use Storable qw(dclone); sub new { my $package = shift; my $self = {tree => $package->_buildTree() }; return bless $self, $package; } sub _buildTree { my $self = shift; my $depth = shift \|\| 0; my %tree; if ($depth > $maxdepth) { $tree{contents} = &{$terminals[rand @terminals]}; } else { $tree{contents} = (int rand(3) ? $functions[rand @functions] : &{$terminals[rand @terminals]} ); } if ($tree{contents} =~ /^sub /) { $tree{left} = $self->_buildTree($depth + 1); $tree{right} = $self->_buildTree($depth + 1); } return \%tree; } sub mate { my ($self, $partner) = @_; my $self_clone = dclone($self); my $partner_clone = dclone($partner); # Get a node from $partner and stick it somewhere in $self_clone my @clone_index = $self_clone->_treeIndex; my @partner_index = $partner_clone->_treeIndex; %{ $clone_index[rand @clone_index] } = %{ $partner_index[rand @partner_index]}; return $self_clone; } sub mutate { my $self = shift; my @index = $self->_treeIndex; %{ $index[rand @index] } = %{ $self->_buildTree($maxdepth-1) }; } sub _treeIndex { # Generates a list of all the nodes in this tree. These are # references to the stuff in the object itself, so changes to the # elements of this list will change the object. my $self = shift; my $tree = shift \|\| $self->{tree}; my @sofar = @_; # Dump the content nodes into a list if ($tree->{contents} =~ /^sub /) { return(@sofar, $tree, $self->_treeIndex($tree->{left}, @sofar), $self->_treeIndex($tree->{right}, @sofar) ); } else { return(@sofar, $tree); } } sub as_string { my $self = shift; my $tree = shift \|\| $self->{tree}; if ($tree->{contents} =~ /^sub \{ \# (\w+)/) { return "$1(" . $self->as_string($tree->{left}) . ',' . $self->as_string($tree->{right}) . ')'; } else { return $tree->{contents}; } } #Total Elapsed Time = 5.147911 Seconds # User+System Time = 4.886911 Seconds #Exclusive Times #%Time ExclSec CumulS #Calls sec/call Csec/c Name # 97.4 4.760 4.973 153323 0.0000 0.0000 main::evalTree # 11.4 0.560 5.580 603 0.0009 0.0093 main::fitness # 4.36 0.213 0.213 71536 0.0000 0.0000 main::__ANON__ # 1.64 0.080 0.080 9733 0.0000 0.0000 Organism::as_string # 0.96 0.047 0.047 20502 0.0000 0.0000 Time::HiRes::time # 0.63 0.031 0.031 7 0.0044 0.0044 IO::File::BEGIN # 0.33 0.016 0.016 6 0.0027 0.0026 ActiveState::Path::BEGIN # 0.33 0.016 0.047 4 0.0040 0.0116 main::BEGIN # 0.31 0.015 0.015 60 0.0002 0.0002 Storable::dclone # 0.31 0.015 0.015 843 0.0000 0.0000 main::constant # 0.29 0.014 0.014 858 0.0000 0.0000 main::identity # 0.08 0.004 0.032 3198 0.0000 0.0000 Organism::_buildTree # 0.00 0.000 0.000 1 0.0000 0.0000 Config::launcher # 0.00 0.000 0.000 1 0.0000 0.0000 Config::fetch_string # 0.00 0.000 0.000 1 0.0000 0.0000 Exporter::Heavy::heavy_e +xport_to_l # evel __END__ -1407,-931,-577,-327,-163,-67,-21,-7,-7,-3,23,89,213,413,707,1113,16 +49 [download]	[reply] [d/l]
Re^3: optimization help by ikegami (Patriarch) on Oct 15, 2007 at 23:05 UTC
Applying what I said in Re: optimization help, my @functions = ( sub { # sin(x) my $in = shift; return sin($in); }, sub { # add(x,y) my ($left,$right) = @_; return ($left + $right); }, sub { # mul(x,y) my ($left,$right) = @_; return ($left * $right); }, sub { # log(x) my $in = shift; return ($in == 0) ? 0 : log(abs($in)); }, ); ... sub evalTree { my $intree = shift; my $param = shift; # A shortcut for the current node. for my $contents ($intree->{contents}) { return 0 unless defined $contents; if (ref($contents) eq 'CODE') { return $contents->( evalTree($intree->{left}, $param), evalTree($intree->{right}, $param), ); } if ($contents =~ /^[A-Za-z]$/) { return $param->{$_}; } return $contents; } } [download] If you need both the string and the compiled sub, save both instead of recompiling the strings over and over again.	[reply] [d/l]
Re^4: optimization help by samtregar (Abbot) on Oct 15, 2007 at 23:06 UTC
Re^4: optimization help by GSF (Acolyte) on Oct 16, 2007 at 16:24 UTC
Re^5: optimization help by ikegami (Patriarch) on Oct 16, 2007 at 16:29 UTC
Some notes below your chosen depth have not been shown here
Re^2: optimization help by GSF (Acolyte) on Oct 15, 2007 at 22:59 UTC
Thanks for looking closer. If you want to see what kinda amazing stuff this does turn generations up to 100, or 1000 even. I thought Eval was computational complexity 0 - which is a unique feature of Perl. At least that what I remember from a Perl calls in Boulder where I met Tom. It thinks. :)	[reply]
Re^3: optimization help by samtregar (Abbot) on Oct 15, 2007 at 23:04 UTC
Perl's eval() definitely is not a constant-time operation. It definitely varies with the length and complexity of its input. I'd imagine it's at least O(n) with n being the length of the input string, but it's probably O(nmjkl) with the definition of m, j, k and l left as an exercise for the reader. -sam	[reply]
Re^4: optimization help by GSF (Acolyte) on Oct 16, 2007 at 00:32 UTC
Re^3: optimization help by GSF (Acolyte) on Oct 15, 2007 at 23:01 UTC
Oh yeah, finally and perhaps most importantly, its not my code.	[reply]
Re: optimization help by naikonta (Curate) on Oct 16, 2007 at 01:01 UTC
`if (/^sub /) { return eval->( evalTree($intree->{left},$param), evalTree($intree->{right},$param) );` [download] This is tricky, but its trickiness doesn't help in any means. I needed to lookup `perldoc -f eval` again to make myself sure that eval is one of them that takes `$_`. I still wonder, though, experienced monk like samtregar got bitten too (sorry, Sam) :-) I still thank him for giving me a little hint (thank you, Sam). This reminds me of a friend that made use of eval in a "unique" way, the first time I saw that time. `#!/usr/bin/perl # no -w # no strict eval { ... the rest of the program... }; die $@ if $@;` [download] There was nothing wrong with this program, syntatically, and it worked. I coudn't explain then, but I knew this was not right so I rejected his code and told him to code in the usual manner. Now I can say that unnecessary double evaluation is evil. These two examples prove a) TMTOWTDI principle, b) not all W's are equal. Otherwise your tomatoes, rotten cabbage, and half eaten apples are appreciated No, I won't do that, especially when I know that this is not originally your own code. Even if it was, there was still no reason to do so. You asked in the proper manner. ikegami has said what was wrong and the correction as well. Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!	[reply] [d/l] [select]
Re^2: optimization help by chromatic (Archbishop) on Oct 16, 2007 at 05:43 UTC
eval BLOCK and eval STRING are completely different. It's unfortunate that they share the same name. There's (almost) nothing wrong with the code your friend wrote.	[reply]
Re^3: optimization help by naikonta (Curate) on Oct 17, 2007 at 08:09 UTC
Agreed. It was an `eval BLOCK` so it happened at compiled-time. I just didn't (and still don't) see any advantage other than aggregating fatal errors checking, such as `my $dbh = get_db(); # with RaiseError => 1 eval { my $some_sql = get_some_sql_code(); $dbh->do($some_sql); my $another_sql = get_another_sql_code(); $dbh->do($another_sql); ... more query execution }; die $@ if $@;` [download] Open source softwares? Share and enjoy. Make profit from them if you can. Yet, share and enjoy!	[reply] [d/l] [select]