in reply to Re: Re: Capturing everything after an optional character in a regex?
in thread Capturing everything after an optional character in a regex?

Your original regex was m/X?(\S+)/

The problem is that the + quantifier is greedier than ?, and will thus try to match as many characters as possible. Since the ? quantifier makes the X optional, X? yields to the \S+ portion of your pattern, so \S+ matches everything even when there is an X that X? could have matched.

You may be able to get around that problem simply by specifying non-greedy matching for the \S+ portion of the regex. In fact, that might be a better solution than the others I've suggested later in this thread. However, I tend to prefer spelling things out clearly over simply making something non-greedy and hoping for the best. My later suggestions force \S+ to give up something, whereas specifying non-greediness just weights the tug-of-war.

Nevertheless, specifying non-greed might just be the simplest approach to your problem, so here it is (untested):

m/X?(\S+?)$/

Updated: As another Anonymous Monk pointed out, forcing non-greed in the \S+ portion of the regex doesn't help, so the answers I've posted lower in this thread are preferable to the one I've struck out in this node. So is Roger's answer, which allows either case to be captured by the same set of parens, removing the need to count capturing parens. Anon is right, though: because X? is optional, \S+ (and \S+?) robs the X from X?.


Dave

Re: Re: Re: Re: Capturing everything after an optional character in a regex?
by Anonymous Monk on Dec 04, 2003 at 07:32 UTC

    The greediness of \S+ has nothing to do with the observed behavior, and making it non-greedy doesn't help the situation. The "problem" is strictly the optional nature of the X?.
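
A quick way to check this claim is to run both forms of the pattern side by side. The sketch below is only an illustration: the sample strings are hypothetical (the original question's input is not shown here), and it assumes the X sits somewhere inside a run of non-space characters rather than at its start.

```perl
use strict;
use warnings;

# Hypothetical sample inputs (not taken from the original question):
# one with an X buried inside the non-space run, one with no X at all.
my @inputs = ('abcXdef', 'abcdef');

my %captured;
for my $str (@inputs) {
    # Greedy form from the original question.
    my ($greedy) = $str =~ m/X?(\S+)/;

    # Non-greedy form, anchored with $ as in the suggestion above.
    my ($lazy) = $str =~ m/X?(\S+?)$/;

    $captured{$str} = [ $greedy, $lazy ];
    printf "%s: greedy='%s' lazy='%s'\n", $str, $greedy, $lazy;
}
```

On these inputs both forms capture the whole string, X included, which fits the point that the quantifier's greediness is not what decides the outcome.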

blame it on greedy, ignore remaining Dwarves
by Anonymous Monk on Dec 05, 2003 at 21:16 UTC
    This is slightly OT, but I have to ask: why does greediness get the blame for so much? I am not an expert in RE engines, but I am pretty sure that "leftmostness" trumps greediness nearly every time. Correct? That is, the "leftmost" match always succeeds before the "best" or "biggest" match.

    \S+'s greediness doesn't really figure into this problem in the very least, as far as I can tell. Greediness is right-acting, not omni-directional.
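
The "leftmost wins" behavior can be seen by looking at where the match starts. This is a minimal sketch using Perl's @- array of match offsets; the input strings are hypothetical and chosen only to put the X in two different positions.

```perl
use strict;
use warnings;

# Hypothetical input: the X sits at offset 3, not at the start.
my $str = 'abcXdef';

my ($start, $cap_start, $capture);
if ($str =~ m/X?(\S+)/) {
    $start     = $-[0];   # offset where the whole match begins
    $cap_start = $-[1];   # offset where the (\S+) capture begins
    $capture   = $1;
    printf "match starts at %d, capture starts at %d: '%s'\n",
        $start, $cap_start, $capture;
}

# For contrast: when the X is the first character the engine tries,
# X? does consume it and the capture begins after it.
my ($lead) = 'Xdef' =~ m/X?(\S+)/;
printf "leading X: '%s'\n", $lead;
```

In the first case the match succeeds at offset 0 with X? matching nothing, so the engine never gets as far as trying a start position at the X. In the second case the X is at the leftmost position and X? takes it.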