PerlMonks  

Re: Re: Re: Re: Re: Efficient run determination.

by BrowserUk (Patriarch)
on Nov 15, 2002 at 11:22 UTC


in reply to Re: Re: Re: Re: Efficient run determination.
in thread Efficient run determination.

I don't agree with your timing method. Including the regex generation in the timing is wrong. In the real application, the regex would be set up once at the beginning of the program, or even pregenerated and read from a file or embedded in the source. It can then be reused over and over.

In the application for which this is destined, it would be used on each of 500 strings of 500 chars, and this process is repeated as fast as the data buffer can be filled, with the aim of getting the whole down to less than 1 second, hence the need for speed.

Would you consider timing the repetition loop without the regex setup, which is basically a compile-time cost, not a runtime one? Without the startup costs, I think you'll see a different picture.
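
To make the distinction concrete, here is a minimal sketch of timing the one-time setup separately from the repeated matching. It is not the 500-group regex under discussion: a simple backreference pattern stands in for it, and the helper `time_it`, the input string and the iteration count are all assumptions made for illustration.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a code ref once and return
# (elapsed seconds, whatever the code ref returned).
sub time_it {
    my ($code) = @_;
    my $t0     = [gettimeofday];
    my $result = $code->();
    return (tv_interval($t0), $result);
}

my $input      = 'aaabccdddd';   # illustrative input, not real data
my $iterations = 1000;           # illustrative repetition count

# One-time setup, timed on its own. The pattern is a stand-in:
# group 2 is one character, group 1 is the whole run of that character.
my ($setup_time, $re) = time_it(sub { qr/((.)\2*)/s });

# Repeated work, timed separately from the setup.
my ($loop_time, $runs) = time_it(sub {
    my @runs;
    for (1 .. $iterations) {
        @runs = ();
        while ($input =~ /$re/g) {
            # [character, start offset, run length]
            push @runs, [$2, $-[1], $+[1] - $-[1]];
        }
    }
    return \@runs;
});

printf "setup: %.6fs\n", $setup_time;
printf "loop:  %.6fs for %d iterations\n", $loop_time, $iterations;
printf "runs found in last pass: %d\n", scalar @$runs;
```

Reporting the two figures side by side shows directly whether the setup cost matters at a given iteration count, rather than leaving it to argument.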

As for the limitation on the number of matches, I used 500 as that is the limit of the string length for my application, but again this would be adjusted as needed at startup.


Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
Pick up your cloud down the end and "Yes" if you get allocated a grey one they are a bit damp under foot, but someone has to get them.
Get used to the wings fast cos its an 8 hour day...unless the Govenor calls for a cyclone or hurricane, in which case 16 hour shifts are mandatory.
Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

Replies are listed 'Best First'.
Re: Re: Re: Re: Re: Re: Efficient run determination.
by PhiRatE (Monk) on Nov 15, 2002 at 11:54 UTC
    The startup cost is only included once, and even without it the timing is the same: the startup cost is minuscule in comparison to the cost of the rest of the process.

    If you don't agree with me, feel free to run your own benchmarks; my program is included at the end of this message. Your algorithm, while interesting, is one of the slowest.

    Iterations: 100
    Length: 1920
    PhiRatE 1:  0.236097s  Perl/C
    PhiRatE 3:  0.234754s  Perl/C
    Dingus 1:   0.541398s  Perl
    PhiRatE 2:  0.543576s  Perl
    Dingus 2:   0.580746s  Perl + RE
    TommyW 1:   0.897865s  Perl + RE
    Enlil 2:    0.964746s  Perl + RE
    Robartes 1: 1.021243s  Perl
    Rasta 1:    2.015298s  Perl + RE
    BrowserUk:  2.764815s  Perl + RE
    

    Code for my benchmarking is here. Feel free to fiddle around to your liking.

    use Data::Dumper;
    use Time::HiRes qw( usleep ualarm gettimeofday tv_interval );
    use re 'eval';

    $stn = "aaaaaaammm38fdkkkkkkkk3,,,,,,,,,,sad909999999994lkllllllllllll"
         . "lz,,,,,,,,,dd888888882jk2kkd8d888d8djkjkjkjkkk3kk4k5kkkk65";
    $iterations = 500;

    for (1..4) { $stn.=$stn; }

    print "Iterations: $iterations\n";
    print "Length: ".length($stn)."\n";

    # Enlil 2
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = enlil_2($stn); }
    print "Enlil 2: ".tv_interval( $t0 )."\n";

    # Dingus 1
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = dingus_1($stn); }
    print "Dingus 1: ".tv_interval( $t0 )."\n";

    # Rasta 1
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = rasta_1($stn); }
    print "Rasta 1: ".tv_interval( $t0 )."\n";

    # TommyW 1
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = tommyw_1($stn); }
    print "TommyW 1: ".tv_interval( $t0 )."\n";

    # Robartes 1
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = robartes_1($stn); }
    print "Robartes 1: ".tv_interval( $t0 )."\n";

    # PhiRatE 1
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = p_process($stn); }
    print "PhiRatE 1: ".tv_interval( $t0 )."\n";

    # BrowserUk
    $t0 = [gettimeofday];
    #! Set up big regex. 1-time hit.
    my $re ='(?:(.)(??{"$+*"}))?' x 500;
    $re = qr/$re/o;
    for (1..$iterations) { @res = browseruk($stn); }
    print "BrowserUk: ".tv_interval( $t0 )."\n";

    # Dingus 2
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = dingus_2($stn); }
    print "Dingus 2: ".tv_interval( $t0 )."\n";

    # PhiRatE 2
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = phirate_2($stn); }
    print "PhiRatE 2: ".tv_interval( $t0 )."\n";

    # PhiRatE 3
    $t0 = [gettimeofday];
    for (1..$iterations) { @res = p_process_2($stn); }
    print "PhiRatE 3: ".tv_interval( $t0 )."\n";

    sub browseruk {
        $_ = shift;
        my @c = m/$re/;    #! THIS LINE DOES ALL THE WORK.
        #! This truncates the list to exclude null matches returned from regex.
        $#c = $#- -1;
        return \@c;
    }

    sub enlil_2 {
        my $string = shift;
        my @bah;
        while ($string =~ /((.)\2*)/g) {
            push (@bah, [$2,$-[1],$+[1] - $-[1]]);
        }
        return \@bah;
    }

    sub dingus_1 {
        my $string = shift;
        my (@res, $c, $p, $i);
        $p = 0;
        $c = substr($string,$p,1);
        for ($i=1; $i<length($string); $i++) {
            next if ($c eq substr($string,$i,1));
            push (@res, [$c,$p,($i-$p)]);
            $c = substr($string,$i,1);
            $p = $i;
        }
        push (@res, [$c,$p,($i-$p)]);
        return \@res;
    }

    sub dingus_2 {
        my $string = shift;
        my (@res, $i);
        $i = 0;
        while ($string =~ /(.)\1*/g) {
            push (@res, [$1, $i, pos($string)-$i]);
            $i = pos($string);
        }
        return \@res;
    }

    sub rasta_1 {
        my $string = shift;
        my ($pp, $l, @res);
        $l = length($string);
        $pp = 0;
        while ($pp < $l) {
            $c = substr $string, $pp, 1;
            if ($string =~ /\G\Q$c\E+/gc) {
                push @res,[$c,$pp,pos($string) - $pp];
                $pp = pos($string);
            }
        }
        return \@res;
    }

    sub tommyw_1 {
        my $string = shift;
        my $pos=0;
        my @triples=();
        my @reps=$string=~/((.)\2*)/g;
        while (@reps) {
            my $hits=shift @reps;
            my $char=shift @reps;
            push @triples, [$char, $pos, length $hits];
            $pos+=length $hits;
        }
        return \@triples;
    }

    sub robartes_1 {
        my $string = shift;
        my @res;
        my @listedstring= split//,$string;
        my $prev=shift @listedstring;
        my $currstart=my $index=0;
        for (@listedstring) {
            if ($_ eq $prev) {
                $index++;
            } else {
                push @res, [$prev, $currstart, $index-$currstart+1];
                $currstart=++$index;
                $prev=$_;
            }
        }
        push @res, [$prev, $currstart, $index-$currstart+1];
        return \@res;
    }

    sub phirate_2 {
        $_ = shift;
        my @res;
        my $count=0;
        my ($prev, $next);
        my $i=0;
        $prev = $next = chop($_);
        while ($next || $prev) {
            if ($prev eq $next) {
                $count++;
            } else {
                push @res,[$prev, $i=$count, $count];
                $prev = $next;
                $count = 1;
            }
            $i++;
            $next = chop;
        }
        return \@res;
    }

    use Inline C => <<'END_OF_C_CODE';

    void p_process(char *s) {
        char prev = 0;
        long count = 0;
        long pos = 0;
        long i=0;
        AV *array;

        Inline_Stack_Vars;
        Inline_Stack_Reset;

        while((*s != 0) || (prev != 0)) {
            if (count==0) {
                pos = i;
                prev = *s;
                count = 1;
            } else if (prev == *s) {
                count++;
            } else {
                array = newAV();
                av_push(array,newSVpvn(&prev,1));
                av_push(array,newSViv(pos));
                av_push(array,newSViv(count));
                Inline_Stack_Push(newRV_inc(array));
                pos=i;
                prev = *s;
                count=1;
            }
            i++;
            s++;
        }
        Inline_Stack_Done;
    }

    void p_process_2(char *s) {
        char prev = 0;
        long count = 0;
        long i=0;
        AV *array;

        Inline_Stack_Vars;
        Inline_Stack_Reset;

        prev = *s;
        while((*s != 0) || (prev != 0)) {
            if (prev == *s) {
                count++;
            } else {
                array = newAV();
                av_push(array,newSVpvn(&prev,1));
                av_push(array,newSViv(i-count));
                av_push(array,newSViv(count));
                Inline_Stack_Push(newRV_inc(array));
                prev = *s;
                count=1;
            }
            i++;
            s++;
        }
        Inline_Stack_Done;
    }

    END_OF_C_CODE
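
For anyone who wants to rerun a comparison like this without hand-rolled timers, the core Benchmark module is one option. The following is only a sketch under stated assumptions: `runs_regex` and `runs_substr` are simplified rewrites in the spirit of enlil_2 and dingus_1 above, not the original subs, `as_text` is a hypothetical helper, and the sample string is made up. It checks that both candidates return the same triples before timing them, since a speed comparison between routines that disagree is meaningless.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Regex-based run finder (same idea as enlil_2 above).
sub runs_regex {
    my ($string) = @_;
    my @runs;
    while ($string =~ /((.)\2*)/sg) {
        push @runs, [$2, $-[1], $+[1] - $-[1]];
    }
    return \@runs;
}

# substr-based run finder (same idea as dingus_1 above).
sub runs_substr {
    my ($string) = @_;
    my @runs;
    return \@runs if $string eq '';
    my $start = 0;
    my $char  = substr($string, 0, 1);
    for my $i (1 .. length($string) - 1) {
        my $next = substr($string, $i, 1);
        next if $next eq $char;
        push @runs, [$char, $start, $i - $start];
        ($char, $start) = ($next, $i);
    }
    push @runs, [$char, $start, length($string) - $start];
    return \@runs;
}

# Hypothetical helper: flatten a result so two results compare with eq.
sub as_text {
    my ($runs) = @_;
    return join ';', map { join ',', @$_ } @$runs;
}

my $sample = 'aaabccdddd' x 100;   # illustrative data, not the string above

# Refuse to time candidates that do not agree on the answer.
die "implementations disagree\n"
    unless as_text(runs_regex($sample)) eq as_text(runs_substr($sample));

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    regex  => sub { runs_regex($sample) },
    substr => sub { runs_substr($sample) },
});
```

Any one-time setup belongs outside the code refs passed to `cmpthese`, so that only the repeated work is measured.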
