Re^2: Get chars between 2 markers using regular expressions

which is more accurate but slower (because it needs to look ahead).

According to this benchmark on my machine

use strict;
use warnings;
use Benchmark qw(:all);

my $string="He0Hello~~He2World~~";

sub invertedCharclass { $string=~m/He\d([^~]+)~~/g }

sub nonGreedy { $string=~m/He\d(.+?)~~/g }

cmpthese (-10,
    {
        '[^~]+' => \&invertedCharclass,
        '.+?' => \&nonGreedy,
    }
);

it performs like this:

          Rate [^~]+   .+?
[^~]+ 236537/s    --  -22%
.+?   303038/s   28%    --

which says that the non-greedy match-all is even faster than the inverted character class.
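One thing worth checking before reading much into these numbers is what a single call to either sub actually does. In scalar context, `m//g` finds at most one match per evaluation and remembers its position in `pos($string)`; a failed attempt resets that position. The following is a minimal sketch of that behaviour on the same string, assuming the subs are invoked in non-list context (I have not checked how Benchmark calls them). The `@seen` array is only there for illustration.

```perl
use strict;
use warnings;

my $string = "He0Hello~~He2World~~";
my @seen;    # hypothetical helper: what each evaluation captured

# Evaluate the same scalar-context m//g three times in a row.
for my $call (1 .. 3) {
    if ($string =~ m/He\d(.+?)~~/g) {
        # One match per evaluation; pos($string) marks where it ended.
        push @seen, $1;
        printf "call %d: captured '%s', pos is now %d\n", $call, $1, pos($string);
    }
    else {
        # A failed m//g resets pos($string), so the next call starts over.
        push @seen, undef;
        printf "call %d: no match, pos was reset\n", $call;
    }
}
```

The first two evaluations capture `Hello` and `World`, and the third fails, so a sub written this way never walks the whole string in one call.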

Replies are listed 'Best First'.

Re^3: Get chars between 2 markers using regular expressions
by tirwhan (Abbot) on Dec 06, 2005 at 15:16 UTC

You're only matching the regex once, not collecting all instances of the match. The difference gets more pronounced the longer the string becomes:

use strict;
use warnings;
use Benchmark qw(:all);

my $string="He0Hello~~He2W~orld~~He0Hello~~He2W~orld~~He0Hello~~He2W~orld~~He0Hello~~He2W~orld~~";
my $f;

sub invertedCharclass { while($string=~m/He\d([^~]+)~~/g){$f=$1} }

sub nonGreedy { while($string=~m/He\d(.+?)~~/g){$f=$1} }

cmpthese (-10,
    {
        '[^~]+' => \&invertedCharclass,
        '.+?' => \&nonGreedy,
    }
);
gives this on my machine:

          Rate   .+? [^~]+
.+?   105088/s    --  -28%
[^~]+ 146676/s   40%    --
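Note that the test string here contains a lone `~` inside `W~orld`, so the two patterns do not capture the same things: `[^~]+` cannot cross a single `~`, while `.+?` can. A rough sketch to make that visible, using list-context `m//g` to collect every capture at once (the `x 4` repetition is just a shorter way of writing the same string):

```perl
use strict;
use warnings;

# Same content as the benchmark string: the two-marker pair repeated four times.
my $string = "He0Hello~~He2W~orld~~" x 4;

# In list context, m//g returns all captures in one go.
my @inverted  = $string =~ m/He\d([^~]+)~~/g;
my @nonGreedy = $string =~ m/He\d(.+?)~~/g;

printf "%-5s found %d: %s\n", '[^~]+', scalar @inverted,  join(', ', @inverted);
printf "%-5s found %d: %s\n", '.+?',   scalar @nonGreedy, join(', ', @nonGreedy);
```

The inverted character class finds only the four `Hello` segments, while the non-greedy version also finds the four `W~orld` segments, so the two subs are not doing identical work. That is worth keeping in mind when comparing the rates.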

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
