in reply to Re: Parse a large string
in thread Parse a large string

Consider what happens with the following string (disregarding any speeling misadventures errors and grammatical):

Nullamie a orci. Nullam quis augue. Aliquam lacinia tempus Praugue.

It is considered good practice to avoid using .* and .+ because they tend to be greedier than you intend. Very often you are better off using a negated character class: [^.]+ would help a lot in this case. Also, the word-break anchor \b will help get the intended behavior.
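
Here is a rough sketch of the difference on the sample string above. The greedy pattern is only my guess at the shape of the regex under discussion (it is not quoted in this node), and the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $data = 'Nullamie a orci. Nullam quis augue. Aliquam lacinia tempus Praugue.';

# Assumed greedy pattern: .+ runs to the end of the string, then backtracks
# only as far as the LAST "augue." (the one inside "Praugue.").
my ($greedy) = $data =~ /(Nullam.+(?:augue|libero)\.)/;

# Negated class plus \b: [^.]+ cannot cross a sentence-ending period,
# and \b stops "Nullamie" and "Praugue" from counting as hits.
my @bounded = $data =~ /(\bNullam\b[^.]+\b(?:augue|libero)\.)/g;

print "greedy:  $greedy\n";
print "bounded: $_\n" for @bounded;
```

On this input the greedy version should swallow the whole string starting at Nullamie, while the bounded version picks out only "Nullam quis augue."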


True laziness is hard work

Re^3: Parse a large string
by Marshall (Canon) on Mar 11, 2009 at 18:57 UTC
    Quite correct! As written the regex would match Nullamie as well as Nullam and the greediness would eat the first augue!

    Another way to calm greediness is the ? modifier: .+ is a maximal match, .+? is a minimal match, like: $data =~ /(Nullam\b.+?(?:augue|libero)\.)/g; That's sometimes a good way to go, and it would work even if we didn't have the "." to help us out here. That said, I like your [^.]+ idea; it looks great to me! There is more than one way to skin these regex cats!
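
    A small sketch of the minimal-match approach, for comparison. The second test string (with a period inside the sentence) is invented purely for illustration, as are the variable names:

    ```perl
    use strict;
    use warnings;

    my $data = 'Nullamie a orci. Nullam quis augue. Aliquam lacinia tempus Praugue.';

    # Maximal match: even with \b, .+ runs on to the final "augue." in "Praugue."
    my ($maximal) = $data =~ /(Nullam\b.+(?:augue|libero)\.)/;

    # Minimal match: .+? stops at the first "augue." or "libero." it can reach.
    my @minimal = $data =~ /(Nullam\b.+?(?:augue|libero)\.)/g;

    # Hypothetical input with a period in mid-sentence: .+? still copes,
    # whereas [^.]+ cannot get past the "1.5".
    my $dotted   = 'Nullam quis 1.5 augue. Nullam sed libero.';
    my @lazy_hit = $dotted =~ /(Nullam\b.+?(?:augue|libero)\.)/g;
    my @neg_hit  = $dotted =~ /(\bNullam\b[^.]+\b(?:augue|libero)\.)/g;

    print "maximal: $maximal\n";
    print "minimal: $_\n" for @minimal;
    print "lazy on dotted:    $_\n" for @lazy_hit;
    print "negated on dotted: $_\n" for @neg_hit;
    ```

    So neither approach wins everywhere: the negated class is the tighter fit when a period reliably ends each sentence, and .+? is the fallback when it does not.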