Re^2: Parse a large string

Consider what happens with the following string (disregarding any speeling misadventures errors and grammatical):

Nullamie a orci. Nullam quis augue. Aliquam lacinia tempus Praugue.

It is considered good practice to avoid .* and .+ because they tend to be greedier than you intend. Very often you are better off using a negated character class: [^.]+ would help a lot in this case. The word break anchor \b will also help get the intended behavior.
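To make the difference concrete, here is a minimal sketch run against the sample string above. The greedy pattern is an assumption (the regex under discussion is not quoted in this post), and the "tight" pattern is just one way to combine [^.]+ with \b, not necessarily the best one for the real data.

```perl
use strict;
use warnings;

my $data = 'Nullamie a orci. Nullam quis augue. Aliquam lacinia tempus Praugue.';

# Assumed greedy pattern: .+ runs to the end of the string and backtracks
# only as far as the last "augue.", so it swallows all three sentences.
my @greedy = $data =~ /(Nullam.+(?:augue|libero)\.)/g;

# Negated character class plus word-break anchors: [^.]+ cannot cross a
# period, and \b rejects "Nullamie" and "Praugue".
my @tight = $data =~ /(\bNullam\b[^.]+\b(?:augue|libero)\.)/g;

print "greedy: $_\n" for @greedy;   # the whole sample string
print "tight:  $_\n" for @tight;    # Nullam quis augue.
```

The greedy version returns the entire string as a single match, while the tight version returns only the middle sentence.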

True laziness is hard work

Replies are listed 'Best First'.
Re^3: Parse a large string by Marshall (Canon) on Mar 11, 2009 at 18:57 UTC
Quite correct! As written, the regex would match Nullamie as well as Nullam, and the greediness would eat the first augue! Another way to calm greediness is the ? modifier: .+ is a maximal match and .+? is a minimal match, like: $data =~ /(Nullam\b.+?(?:augue|libero)\.)/g; That's sometimes a good way to go, and it would work even if we didn't have the "." to help us out here. Still, I like your [^.]+ idea; it looks great to me! There is more than one way to skin these regex cats!
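A short sketch of the minimal-match version, using the sample string from the parent post. The second pattern, without \b, is added here only for contrast and is not from the thread; it suggests that the lazy quantifier alone does not cure the Nullamie problem.

```perl
use strict;
use warnings;

my $data = 'Nullamie a orci. Nullam quis augue. Aliquam lacinia tempus Praugue.';

# Minimal match with \b: "Nullamie" is rejected, and .+? stops at the
# first "augue." it can reach.
my @lazy = $data =~ /(Nullam\b.+?(?:augue|libero)\.)/g;

# Same pattern without \b (for contrast only): the match starts inside
# "Nullamie" and .+? has to cross the first period to find "augue.".
my @lazy_no_b = $data =~ /(Nullam.+?(?:augue|libero)\.)/g;

print "with \\b:    $_\n" for @lazy;       # Nullam quis augue.
print "without \\b: $_\n" for @lazy_no_b;  # Nullamie a orci. Nullam quis augue.
```

Note that .+? is free to cross a period when it has to, which is where [^.]+ is stricter.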
