Additionally, your code is unfair. In your benchmark, all solutions are being timed for creating a structure to return an add and then returning that add each time. The trouble is that, in my method (and others), we only have to invest that time once because we do not destroy it but your solution does. You would need to alter the bench to be fair.Well, if so I should then count time to serialize and deserialize that structure, since as far as I can tell the OP was looking for something that would be useful in a cgi script. (otherwise, his original complaints about performance make little sense) But for the sake of argument, let's refactor the benchmark so that each method has a setup step and a run step, and see what we get. I'll call the way I tested things before "cgi style" and the build-structure-once, extract-many method "mod_perl style". (I've eliminated QM's code until he can provide a corrected version, and added ccn's second version. I've also cut the correctness tests from this output for clarity)
ccn_fast is the version ccn posted in response to my benchmark the first time. ccn_faster is a minor tweak I made to ccn_fast to have it not use an additional variable on repeated invocations.==== Speed tests ==== As cgi: (from scratch each time): Rate origCode duff L~R ccn fizbin ccn_fast +ccn_faster origCode 3799/s -- -33% -83% -87% -89% -96% + -97% duff 5671/s 49% -- -75% -80% -84% -94% + -95% L~R 23009/s 506% 306% -- -19% -36% -75% + -79% ccn 28461/s 649% 402% 24% -- -21% -69% + -75% fizbin 36068/s 849% 536% 57% 27% -- -61% + -68% ccn_fast 92559/s 2337% 1532% 302% 225% 157% -- + -17% ccn_faster 111920/s 2846% 1874% 386% 293% 210% 21% + -- As mod_perl: (NOT from scratch each time): Rate fizbin ccn_fast ccn ccn_faster L~R origCo +de duff fizbin 36073/s -- -61% -62% -68% -92% -9 +2% -92% ccn_fast 92123/s 155% -- -2% -17% -80% -8 +0% -80% ccn 94311/s 161% 2% -- -15% -79% -8 +0% -80% ccn_faster 111200/s 208% 21% 18% -- -75% -7 +6% -76% L~R 450572/s 1149% 389% 378% 305% -- - +3% -3% origCode 462599/s 1182% 402% 391% 316% 3% +-- -1% duff 466249/s 1193% 406% 394% 319% 3% +1% --
A few curious things to notice here - first off, using a mod_perl-style test, the original code is almost as fast as possible. (or at least, "almost as fast as what anyone else has put forward") So my first suggestion to the OP would be: if performance is your problem, find a way to switch to mod_perl and do the building of the array once. (but consider the disadvantages others have mentioned of your code only allowing one ad. per percentage value) Secondly, although the code for the methods duff and origCode is identical after the initial building of the data structure, duff was consistently (I ran this test program a few times) about 1% faster. I have no explanation for this, though I'd like one from someone with intimate knowledge of perl internals. (perl version 5.8.2 as shipped with cygwin)
And yes, my code gets spanked in the mod_perl scenario, but that's expected. What I hadn't expected is just how blinding fast ccn_fast wound up being, enough so that it's still almost competitive under a mod_perl scenario.
And now, the improved benchmark code:
#!perl use strict; use Benchmark qw(cmpthese); my $str = "17:43:33:21:23:19:27:6"; my $setup_codehash = { origCode => sub { my @ary; my %hash = split /:/, $str; foreach my $k ( keys %hash ) { push( @ary, map { $hash{$k} } ( 1 .. $k ) ); } \@ary; }, ccn => sub { my %hash = split /:/, $str; \%hash, }, "L~R" => sub { my $lookup; my %hash = reverse split /:/, $str; while ( my ( $key, $val ) = each %hash ) { $lookup .= pack( "C*", ($key) x $val ); } $lookup; }, duff => sub { my @ary; my @a = split /:/, $str; @a % 2 && die; # not an even number of items while ( my ( $p, $ad ) = splice @a, 0, 2 ) { push @ary, ($ad) x $p; } \@ary; }, }; my $codehash = { origCode => sub { my $adId = $_[0]->[ int( rand(100) ) ]; }, ccn => sub { my $adno; my $rand = rand 100; my $sum = 0; for ( keys %{$_[0]} ) { # there is no need of sorted keys $adno = $_[0]->{$_}; last if ( $sum += $_ ) > $rand; } $adno; }, "L~R" => sub { my $addid = unpack( "C", substr( $_[0], rand 100, 1 ) ); }, duff => sub { my $adId = $_[0]->[ int( rand(100) ) ]; }, fizbin => sub { my @a = split /:/, $str; my $r = int( rand(100) ); my ( $adId, $p ); while ( ( $p, $adId ) = splice @a, 0, 2 ) { if ( $r < $p ) { last; } $r -= $p; } $adId; }, ccn_fast => sub { my $rand = rand 100; my $mystr = $str; # this line is needed for repetitive test on +ly 1 while $mystr =~ /([^:]+):([^:]+):?/g and (($rand -= $1) > 0) +; $2; }, ccn_faster => sub { my $rand = rand 100; pos($str)=0; # this line is needed for repetitive test only 1 while $str =~ /([^:]+):([^:]+):?/g and (($rand -= $1) > 0); $2; }, }; sub phash (%) { my %h = @_; return "{" . join( ", ", map { sprintf( '%s => %3.3f', $_, $h{$_} ); } sort { $a <=> $b } keys(%h) ) . "}"; } sub make_cgistyle_routine ($) { my $subname = shift; if ($setup_codehash->{$subname}) { return sub {$codehash->{$subname}->($setup_codehash->{$subname +}->());}; } else { return sub {$codehash->{$subname}->();}; } } sub make_mpstyle_routine ($) { my $subname = shift; if ($setup_codehash->{$subname}) { my @setupvals = $setup_codehash->{$subname}->(); return sub {$codehash->{$subname}->(@setupvals);}; } else { return sub {$codehash->{$subname}->();}; } } print "==== Correctness tests ==== \n\n"; my %strhash = reverse split( /:/, $str ); print "Probabilities are: ", phash(%strhash), "\n"; foreach my $subname ( keys %$codehash ) { my %resultshash = map { $_ => 0 } keys %strhash; my $routine = make_mpstyle_routine($subname); do { $resultshash{ $routine->() } += 0.001; } for ( 1 .. 100000 ); print "$subname yielded ", phash(%resultshash), "\n"; do { print "$subname failed\n" and last if ( abs( $resultshash{$_} - $strhash{$_} ) > 0.3 ); } for keys(%strhash); } print "\n==== Speed tests ====\n"; print "\nAs cgi: (from scratch each time):\n"; cmpthese( -5, {map {$_ => make_cgistyle_routine($_)} keys %$codehash }); print "\nAs mod_perl: (NOT from scratch each time):\n"; cmpthese( -5, {map {$_ => make_mpstyle_routine($_) } keys %$codehash });
-- @/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/; map{y/X_/\n /;print}map{pop@$_}@/for@/
In reply to Re^4: $str to %hash to @ary
by fizbin
in thread $str to %hash to @ary
by abaxaba
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |