Ok, here's my answer, spending about as much time understanding the code as the OP (you?) spent breaking down the issue:
Don't use regexes to lift content from HTML. Use an HTML parsing module. HTML::TokeParser::Simple is highly recommended, but other equally worthwhile ones exist.
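To give an idea of what that looks like, here is a minimal sketch that pulls link targets and link text out of a page with HTML::TokeParser::Simple. The sample markup and the helper name extract_links are made up for illustration, since I don't have your actual HTML in front of me; adjust the tag and attribute names to whatever your news page uses.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical sample markup, standing in for the real page.
my $html = <<'END_HTML';
<ul>
  <li><a href="/news/1">First headline</a></li>
  <li><a class="x" href="/news/2">Second
      headline</a></li>
</ul>
END_HTML

# extract_links is an illustrative helper, not part of the module.
# It returns a list of [href, text] pairs, one per <a> tag that has an href.
sub extract_links {
    my ($markup) = @_;

    # Passing a scalar reference makes the parser read from the string.
    my $parser = HTML::TokeParser::Simple->new(\$markup);

    my @links;
    while ( my $token = $parser->get_token ) {
        next unless $token->is_start_tag('a');

        my $href = $token->get_attr('href');
        next unless defined $href;

        # Inherited from HTML::TokeParser: collects the text up to the
        # closing </a> and collapses runs of whitespace.
        my $text = $parser->get_trimmed_text('/a');

        push @links, [ $href, $text ];
    }
    return @links;
}

printf "%s => %s\n", @$_ for extract_links($html);
```

The parser doesn't care about attribute order, line breaks inside the tag, or quoting style, which is exactly where a hand-rolled regex tends to fall over.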
Makeshifts last the longest.
In reply to Re^3: Broken News- Reg. Exp.
by Aristotle
in thread Broken News- Reg. Exp.
by Anonymous Monk