Re: Regular Expressions

The problem with what you have is that regex quantifiers are greedy by default, so .* will consume every character up to the last > on the line. To prevent this, put a ? after the * quantifier to make it non-greedy:

s/(<body\s.*?>)/$1$newContent/is

Another problem with your original regex is that it would miss a tag that spanned more than one line, because . does not match a newline by default. The /s modifier above fixes this by letting . match newlines as well.
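
To see both effects side by side, here is a small sketch. The sample string in $html and the value of $newContent are made up for illustration; only the substitution itself comes from the line above.

```perl
use strict;
use warnings;

# Hypothetical sample input: a body tag that spans two lines.
my $html       = qq{<html><body class="main"\n  id="top"><p>Hello</p></body></html>};
my $newContent = '<!-- inserted -->';

# Greedy with /s: .* runs on to the LAST > in the whole string,
# so the new content ends up after </html>.
(my $greedy = $html) =~ s/(<body\s.*>)/$1$newContent/is;

# Non-greedy with /s: .*? stops at the FIRST > after <body,
# so the new content lands right after the opening tag.
(my $lazy = $html) =~ s/(<body\s.*?>)/$1$newContent/is;

# Non-greedy without /s: . cannot cross the newline inside the tag,
# so the pattern fails and the string is left unchanged.
(my $no_s = $html) =~ s/(<body\s.*?>)/$1$newContent/i;

print "greedy: $greedy\n\n";
print "lazy:   $lazy\n\n";
print "no /s:  $no_s\n";
```

Note that \s requires whitespace after "body", so a bare <body> with no attributes would not match; something like <body\b[^>]*> may be a safer pattern if you stay with a regex.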

But parsing HTML with regexes is unwise (a > inside a quoted attribute value is enough to break the pattern above). Try something like HTML::Parser.
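
A rough sketch of the same insertion done with HTML::Parser (version 3 API) might look like the following. It assumes HTML::Parser is installed from CPAN; the sample $html and $newContent are again hypothetical. The idea is to copy every piece of the document through unchanged and append the new content right after the body start tag.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample input and content to insert.
my $html       = qq{<html><body class="main"\n  id="top"><p>Hello</p></body></html>};
my $newContent = '<!-- inserted -->';

my $out = '';

my $parser = HTML::Parser->new(
    api_version => 3,
    # Start tags: copy the original tag text, then add the new
    # content if the tag is <body ...>. Tag names arrive lowercased.
    start_h => [
        sub {
            my ($tagname, $text) = @_;
            $out .= $text;
            $out .= $newContent if $tagname eq 'body';
        },
        'tagname, text',
    ],
    # Everything else (text, end tags, comments, ...) is copied as-is.
    default_h => [ sub { $out .= shift }, 'text' ],
);

$parser->parse($html);
$parser->eof;    # flush anything still buffered

print "$out\n";
```

This is more code than the one-line substitution, but the parser takes care of where a tag actually ends, whether or not it spans lines.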

the lowliest monk

Re^2: Regular Expressions by ww (Archbishop) on Jun 20, 2005 at 01:28 UTC
Just for the record, tlm added two 's' items... the first, which was lacking in OP's code, means "substitution." It's not optional.
Re^2: Regular Expressions by Anonymous Monk on Jun 19, 2005 at 22:49 UTC
Thank you!