in reply to (tye)Re: Regular expression to match an A=B type string needs help in storing matched parts of string
in thread Regular expression to match an A=B type string needs help in storing matched parts of string

Why is that, tye? There's no reason to make the + non-greedy. There's no way for [^=] to match an =. Here's a benchmark:
#!/usr/bin/perl
use Benchmark 'timethese';

$short = "abcdefg";
$long  = $short x 100;

timethese(-5, {
  japhyS => q{ "$short=123" =~ /[^=]+=/ },
  tyeS   => q{ "$short=123" =~ /[^=]+?=/ },
  japhyL => q{ "$long=123" =~ /[^=]+=/ },
  tyeL   => q{ "$long=123" =~ /[^=]+?=/ },
});

__END__
Benchmark: running japhyL, japhyS, tyeL, tyeS, each for at least 5 CPU seconds...
    japhyL:  5803.60/s (n=29018)
      tyeL:  1785.83/s (n=8947)
    japhyS: 30179.50/s (n=157537)
      tyeS: 26449.44/s (n=141240)
It gets worse for longer strings.

$_="goto+F.print+chop;\n=yhpaj";F1:eval

RE: RE: (tye)Re: Regular expression to match an A=B type string needs help in storing matched parts of string
by tye (Sage) on Sep 21, 2000 at 22:12 UTC

    Well, I was thinking of the string after the "=" being long, but the greedy version still wins. Perhaps the regex works from the back even when non-greedy?

    My (incorrect, apparently) thinking was that /[^=]+=/ will have to check the whole string to make sure there isn't a second "=" later on while /[^=]+?=/ could just stop at the first "=" (provided the rest of the regex matched). I'd be interested in any insights on this. [ Didn't I make this exact same mental error before... I'll have to go check and then double check the read-only tab on my brain ]

            - tye (but my friends call me "Tye")
      In the regex /[^=]+=/, the [^=] part matches NON-= characters. The = then matches an = sign after the non-= characters. That's basically all I can think to say...
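To make that concrete, here is a small sketch (not from the original posts) that runs both patterns over a few strings and compares the text each one matches. The helper name matched_text and the sample strings are made up for illustration. Since [^=] can never consume an =, both quantifiers should stop at the same place, which is why the choice only affects speed, not the result.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the text matched by $re in $str,
# or undef if the pattern does not match at all.
sub matched_text {
    my ($re, $str) = @_;
    return $str =~ /($re)/ ? $1 : undef;
}

my $greedy = qr/[^=]+=/;    # japhy's version
my $lazy   = qr/[^=]+?=/;   # tye's version

# Made-up sample strings, including one with a second "=" later on.
for my $str ('abc=123', 'key=a=b', 'abcdefg' x 100 . '=123') {
    my $g = matched_text($greedy, $str);
    my $l = matched_text($lazy,   $str);
    printf "%-12.12s %s\n", $str,
        ($g eq $l ? 'same match' : 'different match');
}
```

For the 'key=a=b' case, the greedy pattern does not need to look past the first = at all: [^=]+ runs out of non-= characters at "key", and the = that follows completes the match.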

      $_="goto+F.print+chop;\n=yhpaj";F1:eval