in reply to $1 and regex

Consider how

    m!(\d+).+to.+(\d+)!i

matches against

    123to456

Since you're using greedy matching, each + will match as much as it can, and each .+ must also consume at least one character. The regexp matches like this:

    (12)3to45(6)
     $1       $2

That's not what you intend. A first cut at fixing this is to rewrite the regex so that it won't gobble up extra digits on either side of "to":

    m!(\d+)\D+to\D+(\d+)!i;

This will match target strings that have non-digit substrings surrounding the "to", but won't match "123to456", since there are no non-digit characters surrounding "to". If that's a problem, you can take the regex a step further, and write

    m!(\d+)\D*?to\D*?(\d+)!i;

which will accept zero or more non-digit characters on either side of "to".
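
To see the three patterns side by side, here is a minimal sketch. The helper name captures and the second sample string '123 to 456' are assumptions made up for illustration; only '123to456' comes from the discussion above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the two captures as "$1,$2", or "no match".
sub captures {
    my ($re, $str) = @_;
    return $str =~ $re ? "$1,$2" : 'no match';
}

my $original = qr!(\d+).+to.+(\d+)!i;      # greedy .+ on both sides of "to"
my $strict   = qr!(\d+)\D+to\D+(\d+)!i;    # requires non-digits around "to"
my $relaxed  = qr!(\d+)\D*?to\D*?(\d+)!i;  # non-digits around "to" are optional

for my $str ('123to456', '123 to 456') {
    printf "%-12s original=%-9s strict=%-9s relaxed=%s\n",
        $str,
        captures($original, $str),
        captures($strict,   $str),
        captures($relaxed,  $str);
}
# For '123to456':   original gives 12,6   strict gives no match   relaxed gives 123,456
# For '123 to 456': original gives 123,6  strict gives 123,456    relaxed gives 123,456
```

Note that the original pattern loses digits from $2 even when there is whitespace around "to", because the second .+ still grabs as many digits as it can.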

Re: Re: $1 and regex
by sauoq (Abbot) on Aug 27, 2002 at 18:23 UTC
    m!(\d+)\D*?to\D*?(\d+)!i;

    Why not just:

    m!(\d+)\D*to\D*(\d+)!i
    ?
    There's no reason to make the \D*s non-greedy, is there?
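
One way to check the question empirically is to run both variants over a few inputs and compare the captures. This is only a sketch: the helper caps and the sample strings are made up for illustration, and agreeing on a handful of samples is evidence, not proof.

```perl
use strict;
use warnings;

# Hypothetical helper: join the captures with commas; empty string on no match.
sub caps {
    my ($re, $str) = @_;
    return join ',', $str =~ $re;
}

my $lazy   = qr!(\d+)\D*?to\D*?(\d+)!i;  # non-greedy \D*?
my $greedy = qr!(\d+)\D*to\D*(\d+)!i;    # greedy \D*

# Sample inputs chosen for illustration only.
my @samples = ('123to456', '123 to 456', 'from 7 up to 9', '1 to to 2', '12 or 34');

for my $str (@samples) {
    my ($l, $g) = (caps($lazy, $str), caps($greedy, $str));
    printf "%-16s lazy=(%s) greedy=(%s) %s\n",
        $str, $l, $g, $l eq $g ? 'same' : 'DIFFERENT';
}
```

Every sample prints "same" here. The \D* is fenced in by \d+ on one side and the literal "to" on the other, so greediness changes how the engine gets to the match, not which captures come out.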

    -sauoq
    "My two cents aren't worth a dime.";
    
      There's no reason to make the \D*s non-greedy, is there?

      When in doubt, make your regexes non-greedy. You'll stay out of a lot of trouble that way.

        When in doubt, make your regexes non-greedy. You'll stay out of a lot of trouble that way.

        Wow. I found this answer disappointing, dws. There are too many beginners who, once they learn about minimal matching, use it far too often.

        The common example is using a non-greedy quantifier instead of greedily matching a negated character class. For example, /"[^"]*"/ is much better than using /".*?"/ to do the same thing. I should probably re-read MRE (Mastering Regular Expressions), as it has been 3 or 4 years, but I think it illustrates that removing leading and trailing whitespace with s/^\s*(.*?)\s*$/$1/ is several times slower than s/^\s*//; s/\s*$//;.
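
A small sketch of why the negated class is the safer habit, beyond speed. The helper matched, the sample strings, and the trailing "!" added to the patterns are all assumptions for illustration. Nothing here measures speed; it only shows where the two forms agree and where they stop agreeing.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text matched by $re in $str, or undef.
sub matched {
    my ($re, $str) = @_;
    return $str =~ /($re)/ ? $1 : undef;
}

# On a simple string the two forms find the same thing.
print matched(qr/"[^"]*"/, 'say "hi" now'), "\n";    # "hi"
print matched(qr/".*?"/,   'say "hi" now'), "\n";    # "hi"

# Once something must follow the closing quote, the lazy dot is free
# to run across quotes to make the overall match succeed.
my $str = '"a" and "b"!';
print matched(qr/"[^"]*"!/, $str), "\n";             # "b"!
print matched(qr/".*?"!/,   $str), "\n";             # "a" and "b"!

# The two whitespace-trimming approaches produce the same result.
my ($one, $two) = ('  hello world  ') x 2;
$one =~ s/^\s*(.*?)\s*$/$1/;
$two =~ s/^\s*//;
$two =~ s/\s*$//;
print "[$one] [$two]\n";                             # [hello world] [hello world]
```

The core Benchmark module could be used to compare the timings of the two trimming approaches on your own data.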

        Revisiting the original question, I suspect a mixed solution like /(\d+)\D*?to\D*(\d+)/ would actually be better, depending, of course, on whether you define better as shorter, faster, or easier to understand. I'm not sure though. Like I said, it's been too long since I've read MRE. I'm sure someone here could give us a concise analysis.

        I am interested in understanding why you suggest that rule of thumb, but I think I'll still agree with Arien on this point. You'll only stay out of trouble by not being in doubt.

        -sauoq
        "My two cents aren't worth a dime.";
        
        When in doubt, make your regexes non-greedy. You'll stay out of a lot of trouble that way.

        You will steer clear of some traps if you do that, only to fall into others: instead of sometimes matching too much, you will sometimes match too little.

        There is no magic cure for lack of knowledge or understanding. When in doubt, check the documentation and think.
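
As a sketch of "matching too little", here is what happens if the rule of thumb is applied blindly to every quantifier in the pattern from this thread. The helper caps is hypothetical, and the all-lazy pattern is an illustration, not something anyone above proposed.

```perl
use strict;
use warnings;

# Hypothetical helper: join the captures with commas; empty string on no match.
sub caps {
    my ($re, $str) = @_;
    return join ',', $str =~ $re;
}

my $greedy_digits = qr!(\d+)\D*to\D*(\d+)!i;
my $lazy_digits   = qr!(\d+?)\D*?to\D*?(\d+?)!i;  # every quantifier made non-greedy

print caps($greedy_digits, '123to456'), "\n";  # 123,456
print caps($lazy_digits,   '123to456'), "\n";  # 123,4
```

The first capture survives because "to" forces \d+? to keep expanding, but nothing follows the second \d+?, so it stops after a single digit.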

        — Arien