Regular expressions have their uses. SGML parsing is not one of them. You've already found one of the situations where they simply don't work (also, try embedded tables). It's even worse when you have to deal with badly formatted HTML, and there's a whole lot of it out there, thanks to incorrectly written WYSIWYG editors and 'webmasters' who have no idea what HTML is.
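To make the embedded-table point concrete, here is a minimal sketch. It is in Python rather than Perl, using only the standard library's `html.parser`, and the sample markup and the `TableDepth` class are made up for illustration. It shows a non-greedy regex pairing the outer `<table>` with the inner `</table>`, while even a simple event-based parser keeps track of which table it is in.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample input: one table nested inside another.
html = ("<table><tr><td>outer"
        "<table><tr><td>inner</td></tr></table>"
        "tail</td></tr></table>")

# Naive approach: the non-greedy match stops at the FIRST </table>,
# which closes the inner table, not the outer one.
naive = re.search(r"<table>(.*?)</table>", html, re.S).group(1)

class TableDepth(HTMLParser):
    """Records text by table nesting depth, so each </table> is
    paired with its own <table>."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.text_by_depth = {}

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "table":
            self.depth -= 1

    def handle_data(self, data):
        self.text_by_depth.setdefault(self.depth, []).append(data)

parser = TableDepth()
parser.feed(html)
parser.close()

print(naive)                 # truncated: contains a second <table> and no "tail"
print(parser.text_by_depth)  # text grouped by how deeply it is nested
```

A greedy `.*` fails the other way round as soon as there are two sibling tables, so switching quantifiers doesn't rescue the regex.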
Would you care to explain your reasons for not wanting to use existing parsers? There may be other ways to solve your problem.
(Personally, I'd try to build a tree if I knew I was always going to be working with well-formed SGML, but you haven't even mentioned why you're trying to do this.)
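For what "build a tree" might look like, here is a rough sketch, again in Python with the standard `html.parser` module. The `Node`, `TreeBuilder`, `build_tree` and `outline` names are hypothetical, and the code assumes well-formed input where every opened element is explicitly closed (so a bare `<br>` would trip it up). It is an illustration of the idea, not a replacement for an existing parser.

```python
from html.parser import HTMLParser

class Node:
    """One element in the tree; children are Nodes or text strings."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []

class TreeBuilder(HTMLParser):
    """Builds a Node tree, assuming every start tag has a matching end tag."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Well-formedness check: the closing tag must match the open element.
        if self.current.tag != tag:
            raise ValueError(f"unexpected </{tag}> inside <{self.current.tag}>")
        self.current = self.current.parent

    def handle_data(self, data):
        if data.strip():
            self.current.children.append(data)

def build_tree(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root

def outline(node, depth=0):
    """Returns the element names, indented by nesting level."""
    lines = []
    for child in node.children:
        if isinstance(child, Node):
            lines.append("  " * depth + child.tag)
            lines.extend(outline(child, depth + 1))
    return lines

root = build_tree("<table><tr><td>a<table><tr><td>b</td></tr></table></td></tr></table>")
print("\n".join(outline(root)))
```

Once the nesting is explicit like this, "the contents of the outer table" is just a subtree, which is exactly what the regex approach can't give you. Mismatched tags raise an error instead of being silently mis-paired, which is why this only makes sense for input you know is well formed.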
In reply to Re: regexp text parsing issue.
by jhourcle
in thread regexp text parsing issue.
by Anonymous Monk