Attempting to parse or manipulate HTML (or XML or similar) with a regular expression is nearly always a bad idea. I'm actually working at the moment and don't have time to pull out references; however, I'm sure others will do so.
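To make the point concrete, here is a minimal sketch in Python (not from the original post; the sample markup, the `TextExtractor` class, and the `extract_text` helper are all made up for illustration). It compares a naive tag-stripping regex with the standard library's `html.parser` on input containing a `>` inside a quoted attribute and a tag inside a comment, two of the many cases that tend to trip up regex-based approaches.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample: a ">" inside a quoted attribute and a tag inside a comment.
HTML = '<p title="a > b">Hello <!-- <b>not bold</b> --> <b>world</b></p>'

# Naive approach: delete anything that looks like <...>.
# The pattern knows nothing about quoting or comments.
naive = re.sub(r'<[^>]*>', '', HTML)


class TextExtractor(HTMLParser):
    """Collects only the text nodes the parser reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags; comment bodies are not passed here.
        self.chunks.append(data)


def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return ''.join(parser.chunks)


parsed = extract_text(HTML)

print(repr(naive))   # leaks attribute and comment fragments
print(repr(parsed))  # only the visible text
```

The regex stops at the first `>` it sees, so part of the attribute value and the commented-out text end up in its output, while the parser tracks quoting and comment state and reports only the real text. The same reasoning applies in Perl: reach for a real HTML or XML parsing module rather than a substitution.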
Surprised I don't have a list of references on this topic. Here's a start:
References Added Later
See Also
In reply to Re^2: Substitution remove all before (Parse HTML/XML with Regex References) by eyepopslikeamosquito, in thread Substitution remove all before by Anonymous Monk