Good suggestion.

Unfortunately, I believe it will also match well-formed markup whose opening tag carries attributes, such as:

<a href="page.html">Link Text</a>

Before asking for help here, I tried tightening it with something like:

m[ < ( [^>\s]+ ) > .*? </ (?! \1> ) ]x

I hoped the no-whitespace condition on the opening tag would solve things. Unfortunately, emails from eBay and Amazon were still caught.
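
For anyone following along, here is a rough sketch of how the two variants behave. The $loose pattern is only my guess at the form without the whitespace exclusion (it is not quoted in this post), the flagged() helper is just for illustration, and the sample strings are made up rather than taken from the actual eBay or Amazon mail.

```perl
use strict;
use warnings;

# The tightened pattern from this post: the opening tag may not contain
# whitespace, so tags that carry attributes are never captured.
my $no_space = qr{ < ( [^>\s]+ ) > .*? </ (?! \1> ) }x;

# Assumed form of the looser pattern (an assumption, not quoted here):
# the same expression without the \s exclusion.
my $loose = qr{ < ( [^>]+ ) > .*? </ (?! \1> ) }x;

# Made-up sample strings, not taken from any real email.
my @samples = (
    '<a href="page.html">Link Text</a>',   # attribute in the opening tag
    '<b>bold</b>',                         # simple, correctly closed
    '<b>bold <i>both</i></b>',             # well-formed but nested
    '<b>unclosed</i>',                     # genuinely mismatched
);

# Hypothetical helper: returns 1 if the pattern matches, 0 otherwise.
sub flagged {
    my ($re, $html) = @_;
    return $html =~ $re ? 1 : 0;
}

for my $html (@samples) {
    printf "%-36s loose=%d no_space=%d\n",
        $html, flagged($loose, $html), flagged($no_space, $html);
}
```

In this sketch the no-space version lets the attribute link through, but it still flags the nested sample, because the first closing tag after <b> is </i>. Well-formed nested markup like that may be one reason legitimate mail gets caught, though I have not confirmed that against the real messages.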

Still, all in all, I think this expression combined with a whitelist may be the direction I take, for the sake of speed.

Good suggestion.