in reply to Peeling the Peelings

This one appears faster for the n > -1 case. Compared to get_proparg it is only about 7% worse for the -1 case and is about 80% faster for the 0 case:
sub arg2 {
    my($str,$lvl)= @_;
    my $str1 = $str;
    return $str if !$lvl;
    # skip up to and including nth ( paren, then strip n ) parens from end
    $str =~ /([^(]*\(){$lvl}(.+)(\)){$lvl}/;
    return $2 unless $lvl == -1;
    $str =~ /\(([^\(\)]+)\)+/;
    return $1;
}
The timing results where lvl=2 are:
                Rate get_proparg        arg2
get_proparg 137195/s          --        -31%
arg2        199115/s         45%          --
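
For anyone who wants to reproduce a table like this, here is a minimal sketch using the core Benchmark module's cmpthese. It only times arg2 at two levels, since get_proparg is not shown in this post. The sample string is made up, and the unused $str1 copy is left out.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# arg2 as posted above, minus the unused $str1 copy.
sub arg2 {
    my ($str, $lvl) = @_;
    return $str if !$lvl;
    # skip up to and including nth ( paren, then strip n ) parens from end
    $str =~ /([^(]*\(){$lvl}(.+)(\)){$lvl}/;
    return $2 unless $lvl == -1;
    $str =~ /\(([^\(\)]+)\)+/;
    return $1;
}

# Hypothetical sample input; the thread's real test data is not shown here.
my $sample = 'f(g(h(x)))';

# Run each case for at least 1 CPU second and print a rate comparison table.
cmpthese(-1, {
    'arg2 lvl=0' => sub { arg2($sample, 0) },
    'arg2 lvl=2' => sub { arg2($sample, 2) },
});
```

The rates will differ from machine to machine, so only the relative percentages are worth comparing.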
HTH, --traveler

Re: Re: Peeling the Peelings
by bobn (Chaplain) on Jul 02, 2003 at 05:24 UTC
    But for the original sample of data, it is hideous. Results:
    Benchmark: timing 15000 iterations of mine-elegant, mine-less-elegant, original, yours-arg2...
         mine-elegant:  7 wallclock secs ( 5.19 usr +  0.00 sys =  5.19 CPU) @ 2890.17/s (n=15000)
    mine-less-elegant:  4 wallclock secs ( 3.19 usr +  0.01 sys =  3.20 CPU) @ 4687.50/s (n=15000)
             original:  6 wallclock secs ( 5.06 usr +  0.02 sys =  5.08 CPU) @ 2952.76/s (n=15000)
           yours-arg2: 32 wallclock secs (25.06 usr +  0.08 sys = 25.14 CPU) @ 596.66/s (n=15000)
    The lesson I'm taking away from this: simple regexes can be very fast, but time increases rapidly with regex complexity.

    --Bob Niederman, http://bob-n.com
      They don't need to. You just have to tell the regex engine exactly what you want. The regex he used has no anchors - why? There's also a bunch of useless capturing parens, and in fact one paren pair that's completely superfluous. All of that is not what we wanted. I've got no Perl here, so I'll have to test this later, but I'm pretty positive that the following works as specified, and very certain that it'll perform tons better.
      /\A (?> (?> [^(]* \( ) {$lvl} ) ( .+ ) \) {$lvl} \z/x;
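
A minimal sketch of how the anchored pattern above might be wrapped in a sub so it can be dropped into the same benchmark as arg2. The name arg2_anchored and the sample strings in the comments are assumptions, not from the thread. The quantifiers are written without a space before the brace, which should not change what the pattern matches. This only shows what the pattern returns; whether it is any faster is a question for the benchmark.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the anchored, atomic-group pattern.
sub arg2_anchored {
    my ($str, $lvl) = @_;
    return $str if !$lvl;    # level 0: nothing to peel
    # \A and \z pin the match to the whole string; the atomic groups stop
    # the engine from re-trying the "skip $lvl open parens" prefix.
    return $1
        if $str =~ /\A (?> (?> [^(]* \( ){$lvl} ) ( .+ ) \){$lvl} \z/x;
    return;                  # pattern did not match
}

# With a made-up input such as 'f(g(h(x)))':
#   arg2_anchored('f(g(h(x)))', 1) gives 'g(h(x))'
#   arg2_anchored('f(g(h(x)))', 2) gives 'h(x)'
```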

      Makeshifts last the longest.

        Strangely enough, the benchmarks indicate this is no better. Comparison with minor rewrites; results:
        Benchmark: timing 7000 iterations of OP best: get_proparg_new, my best: getpropstr, orig-arg2, yours...
        OP best: get_proparg_new:  3 wallclock secs ( 2.39 usr +  0.01 sys =  2.40 CPU) @ 2916.67/s (n=7000)
             my best: getpropstr:  4 wallclock secs ( 1.72 usr +  0.01 sys =  1.73 CPU) @ 4046.24/s (n=7000)
                       orig-arg2: 12 wallclock secs ( 9.71 usr +  0.04 sys =  9.75 CPU) @ 717.95/s (n=7000)
                           yours: 12 wallclock secs ( 9.69 usr +  0.03 sys =  9.72 CPU) @ 720.16/s (n=7000)


        --Bob Niederman, http://bob-n.com
      You know, someone ought to write a book about that (time increasing with regex complexity). :-)
      Phooey. The lessons here are: if it looks good and seems to test well, it still might not be good -- test more; somewhat counterintuitively, {} match counts are slower than loops; some optimizations (e.g. those from Aristotle's post) are not as much of an improvement as they seem; and, as tilly and bobn point out, a corollary of the second lesson is that processing a string through multiple simple REs is often better than one complex one.
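
To make the "loops instead of {} counts" point concrete, here is a rough sketch of the loop approach: two trivial anchored substitutions per level instead of one counted pattern. The name peel_loop is made up for illustration and is not one of the subs benchmarked in this thread. It only handles levels of 0 or more, not the -1 (innermost) case.

```perl
use strict;
use warnings;

# Hypothetical loop-based peeler: one simple pass per nesting level.
sub peel_loop {
    my ($str, $lvl) = @_;
    for (1 .. $lvl) {
        # drop everything up to and including the first open paren
        $str =~ s/\A[^(]*\(// or return;
        # drop one close paren from the very end
        $str =~ s/\)\z//      or return;
    }
    return $str;    # unchanged when $lvl is 0
}

# With a made-up input such as 'f(g(h(x)))':
#   peel_loop('f(g(h(x)))', 2) gives 'h(x)'
```

Each substitution is anchored and has no nested quantifier, so the engine has very little to backtrack over. That is only the reasoning, though; per the first lesson, it still needs to go through the benchmark.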

      --traveler