in reply to Regular expression to match an A=B type string needs help in storing matched parts of string

Note that your regex will match a bit faster if you change "+" to "+?":

if ( $gpa_ret =~ /[^=]+?[=][']([^']+)[']/ ) {

The speed-up can be significant if you have really long strings that match.

For your second question I'd like to note that I often find myself wanting @1 so that I could get the list of matches that a grouping matched!
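
Until something like @1 exists, one common workaround is a /g match in list context, which returns the captures from every successive match as one flat list. The following is only a minimal sketch: the sample string and the names $line, @parts and %pair are made up for illustration, and the pattern is adapted from (not identical to) the one in the original question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input; the original poster's data is not shown here.
my $line = "name='Tye' site='PerlMonks' lang='Perl'";

# In list context, /g returns the captures from every successive match,
# flattened into one list: key, value, key, value, ...
my @parts = $line =~ /([^=\s]+)='([^']+)'/g;

# An even-length key/value list loads straight into a hash.
my %pair = @parts;

print "$_ => $pair{$_}\n" for sort keys %pair;
```

This collects all the A='B' pairs from one string, which is not quite the same as asking for everything a single repeated group matched, but it covers many of the same cases.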

        - tye (but my friends call me "Tye")

Replies are listed 'Best First'.
RE: (tye)Re: Regular expression to match an A=B type string needs help in storing matched parts of string
by japhy (Canon) on Sep 21, 2000 at 21:56 UTC
    Why is that, tye? There's no reason to make the + non-greedy. There's no way for [^=] to match an =. Here's a benchmark:
    #!/usr/bin/perl
    use Benchmark 'timethese';

    $short = "abcdefg";
    $long = $short x 100;

    timethese(-5, {
      japhyS => q{ "$short=123" =~ /[^=]+=/ },
      tyeS   => q{ "$short=123" =~ /[^=]+?=/ },
      japhyL => q{ "$long=123" =~ /[^=]+=/ },
      tyeL   => q{ "$long=123" =~ /[^=]+?=/ },
    });
    __END__
    Benchmark: running japhyL, japhyS, tyeL, tyeS, each for at least 5 CPU seconds...
    japhyL: 5803.60/s (n=29018)
    tyeL: 1785.83/s (n=8947)
    japhyS: 30179.50/s (n=157537)
    tyeS: 26449.44/s (n=141240)
    It gets worse for longer strings.

    $_="goto+F.print+chop;\n=yhpaj";F1:eval

      Well, I was thinking of the string after the "=" being long, but the greedy version still wins. Perhaps the regex works from the back even when non-greedy?

      My (apparently incorrect) thinking was that /[^=]+=/ would have to check the whole string to make sure there isn't a second "=" later on, while /[^=]+?=/ could just stop at the first "=" (provided the rest of the regex matched). I'd be interested in any insights on this. [ Didn't I make this exact same mental error before... I'll have to go check and then double check the read-only tab on my brain ]

              - tye (but my friends call me "Tye")
        In the regex /[^=]+=/, the [^=] part matches NON-= characters. The = then matches an = sign after the non-= characters. That's basically all I can think to say...
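
A small check may make this concrete. The sketch below uses a made-up string with a second "=" later on (the names $str, $greedy and $lazy are only for illustration) and shows that both forms stop at the first "=", so the greedy version never has to look past it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input with a second "=" later in the string.
my $str = "abc=def=ghi";

# Greedy: [^=]+ takes "abc" in one run and has to stop at the first "=",
# because the character class cannot match "=".
my ($greedy) = $str =~ /([^=]+=)/;

# Non-greedy: [^=]+? grows one character at a time, trying "=" after each.
my ($lazy) = $str =~ /([^=]+?=)/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
print "same match\n" if $greedy eq $lazy;
```

Since the two patterns always match the same text here, the only difference is how the engine gets there. The non-greedy form attempts the "=" after every single character, which is presumably where the extra time in the benchmark goes.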

        $_="goto+F.print+chop;\n=yhpaj";F1:eval