in reply to Puzzled by regex

When I saw the regex \S+?, my first thought was that it is equivalent to \S*. But it isn’t, as a little experimentation shows.

Consulting the Camel Book (4th Edition, page 214), I found that + means “1 or more times maximally” and +? means “1 or more times minimally.”

So, the difference between the two forms is not whether they match: if one matches, both must match. The difference lies only in what is matched, and this is relevant only if this part is captured (or, just possibly, if efficiency is an issue).
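
To make the "what is matched" point concrete, here is a minimal sketch. The sample string and variable names are made up for illustration and are not taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen so the two quantifiers have room to differ.
my $text = "__FOO__BAR__";

# Greedy: \S+ takes as many non-whitespace characters as it can.
my ($greedy) = $text =~ /(\S+)/;

# Minimal: \S+? takes as few as it can, which is a single character here.
my ($minimal) = $text =~ /(\S+?)/;

# With trailing context, the minimal form grows only until the rest matches,
# while the greedy form grabs everything and backtracks as little as possible.
my ($greedy_ctx)  = $text =~ /__(\S+)__/;
my ($minimal_ctx) = $text =~ /__(\S+?)__/;

# \S* is a different thing again: unlike \S+?, it may match zero characters.
my $star_matches_empty = ( "" =~ /^\S*$/ )  ? 1 : 0;
my $plus_matches_empty = ( "" =~ /^\S+?$/ ) ? 1 : 0;

print "greedy:       $greedy\n";         # __FOO__BAR__
print "minimal:      $minimal\n";        # _
print "greedy ctx:   $greedy_ctx\n";     # FOO__BAR
print "minimal ctx:  $minimal_ctx\n";    # FOO
print "\\S*  on '':   $star_matches_empty\n";    # 1
print "\\S+? on '':   $plus_matches_empty\n";    # 0
```

All four patterns succeed on the sample text; only the captured text differs.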

Hope that helps,

Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,

Re^2: Puzzled by regex
by syphilis (Archbishop) on Apr 10, 2013 at 06:53 UTC
    The difference lies only in what is matched, and this is relevant only if this part is captured

    Well ... the regex does capture that part but afaics, when both regexes match they both match the same thing.
    Do you have an example that demonstrates this difference?

    Just to be clear - I can see that /\S+?/ and /\S+/ could conceivably match differently, but I don't see how /__\S+?__\n/ and /__\S+__\n/ can match differently.
    (And it's important to me that I do understand how they match differently if, indeed, they can.)

    In case I'm guilty of not presenting the full picture, the regex (it's a split) as it appears in Inline.pm is actually:
    @{$DATA{$pkg}} = split /(?m)(__\S+?__\n)/, $data;
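
One way to check this on a particular input is to run the split with both forms and compare the resulting lists. A minimal sketch follows. The contents of $data are made-up sample text (the real data that Inline.pm sees is not shown here), and @minimal and @greedy are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical sample data with two marker lines of the form __NAME__.
my $data = "preamble\n__C__\nint f();\n__Python__\ndef g(): pass\n";

# The capturing parentheses make split return the markers as well.
my @minimal = split /(?m)(__\S+?__\n)/, $data;
my @greedy  = split /(?m)(__\S+__\n)/,  $data;

# Count the positions at which the two lists disagree.
my $differences = grep { $minimal[$_] ne $greedy[$_] } 0 .. $#minimal;
my $same = ( @minimal == @greedy && $differences == 0 ) ? 1 : 0;

print "pieces: ", scalar(@minimal), "\n";         # pieces: 5
print $same ? "same\n" : "different\n";           # same
```

On this input both forms return the same five pieces. That fits the observation that \S cannot step over the \n that ends the marker, so once the starting "__" is fixed there is only one place the match can end.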
    Cheers,
    Rob