Re: Peeling the Peelings

This one appears faster for the n > -1 case. Compared to get_proparg it is only about 7% worse for the -1 case and is about 80% faster for the 0 case:

sub arg2 {
    my($str,$lvl)= @_;
    my $str1 = $str;
    return $str if !$lvl;
    # skip up to and including nth ( paren, then strip n ) parens from
+ end
    $str =~ /([^(]*\(){$lvl}(.+)(\)){$lvl}/;

    return $2 unless $lvl == -1;
    $str =~ /\(([^\(\)]+)\)+/;
    return $1;

}
[download]

The timing results where lvl=2 are:

                Rate get_proparg        arg2
get_proparg 137195/s          --        -31%
arg2        199115/s         45%          --
[download]

HTH, --traveler

Comment on Re: Peeling the Peelings Select or Download Code

Replies are listed 'Best First'.
Re: Re: Peeling the Peelings by bobn (Chaplain) on Jul 02, 2003 at 05:24 UTC
But for the original sample of data, it is hideous. Read more... (3 kB) Results: `Benchmark: timing 15000 iterations of mine-elegant, mine-less-elegant, + original, yours-arg2... mine-elegant: 7 wallclock secs ( 5.19 usr + 0.00 sys = 5.19 CPU) @ +2890.17/s (n=15000) mine-less-elegant: 4 wallclock secs ( 3.19 usr + 0.01 sys = 3.20 CP +U) @ 4687.50/s (n=15000) original: 6 wallclock secs ( 5.06 usr + 0.02 sys = 5.08 CPU) @ 29 +52.76/s (n=15000) yours-arg2: 32 wallclock secs (25.06 usr + 0.08 sys = 25.14 CPU) @ 59 +6.66/s (n=15000)` [download] The lesson I'm taking away from this: simple regexes can be very fast, but time increasses rapidly with regex complexity. --Bob Niederman, http://bob-n.com	[reply] [d/l] [select]
Re^3: Peeling the Peelings by Aristotle (Chancellor) on Jul 02, 2003 at 08:05 UTC
They don't need to. You just have to tell the regex engine exactly what you want. The regex he used has no anchors - why? There's also a bunch of useless capturing parens, and in fact one paren pair that's completely superfluous. All of that is not what we wanted. I've got no Perl here, so I'll have to test this later, but I'm pretty positive that the following works as specified, and very certain that it'll perform tons better. `/\A (?> (?> [^(]* $ ) {$lvl} ) ( .+ ) $ {$lvl} \z/x;` [download] Makeshifts last the longest.	[reply] [d/l]
Re: Re^3: Peeling the Peelings by bobn (Chaplain) on Jul 02, 2003 at 13:30 UTC
Strangley enough, the benchmarks indictate this is no better. comparison w/ minor rewrites: Read more... (2 kB) Results: `Benchmark: timing 7000 iterations of OP best: get_proparg_new, my best +: getpropstr, orig-arg2, yours... OP best: get_proparg_new: 3 wallclock secs ( 2.39 usr + 0.01 sys = +2.40 CPU) @ 2916.67/s (n=7000) my best: getpropstr: 4 wallclock secs ( 1.72 usr + 0.01 sys = 1.73 +CPU) @ 4046.24/s (n=7000) orig-arg2: 12 wallclock secs ( 9.71 usr + 0.04 sys = 9.75 CPU) @ 71 +7.95/s (n=7000) yours: 12 wallclock secs ( 9.69 usr + 0.03 sys = 9.72 CPU) @ 72 +0.16/s (n=7000)` [download] --Bob Niederman, http://bob-n.com	[reply] [d/l] [select]
Re: Re: Re: Peeling the Peelings by tilly (Archbishop) on Jul 02, 2003 at 08:02 UTC
You know, someone ought to write a book about that (time increasing with regex complexity). :-)	[reply]
Re: Re: Re: Peeling the Peelings by traveler (Parson) on Jul 02, 2003 at 15:56 UTC
Phooey. The lessons here are: if it looks good and seems to test well, it still might not be good -- test more; somewhat counterintuitively, {} match counts are slower than loops; some optimizations (e.g. those from Aristotle's post) are not as much more effecient as they seem; and as tilly and bobn point out a correlary of the second lesson is that processing a string through multiple single REs is often better than one complex one. --traveler	[reply]