2) It is possible you are hitting a bug if you have an old Perl. At any rate, to stay sane, why not make up your own tag names like "\nSTART\t" and do a global replace on the data first? Then you can read the data yourself and have less trouble debugging.
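Here is a rough sketch of that marker idea. The input string, the choice of td tags, and the "\tEND\n" closing marker are all made up for illustration, since the original data isn't shown in this thread.

```perl
use strict;
use warnings;

# Hypothetical input; the original poster's data is not shown in the thread.
my $html = '<td class="x">one</td><td>two</td>';

# Swap the tags of interest for easy-to-read markers before matching.
# "\nSTART\t" is the marker suggested above; "\tEND\n" is my own assumption.
(my $marked = $html) =~ s/<td\b[^>]*>/\nSTART\t/gi;
$marked =~ s{</td\s*>}{\tEND\n}gi;

# The cells are now delimited by plain-text markers, one per line,
# so the later patterns never have to deal with angle brackets.
my @cells = $marked =~ /\nSTART\t([^\t]*)\tEND\n/g;
print join('|', @cells), "\n";    # prints "one|two"
```

Printing $marked at this point also gives you something you can eyeball, which is most of the benefit.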
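3) Also, you just don't want to use dot-star. Really. Even the non-greedy ".*?" is dangerous, especially for finding things with quotes embedded in them, because the dot will happily match across quotes and tag boundaries to make the rest of the pattern succeed, as that link (Ovid's) will show.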
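To make the danger concrete, here is a small made-up case (the HTML string is hypothetical, not the original poster's data). The intent is to grab the href of the link that has class="ext", but the non-greedy match starts at the first link and runs across it.

```perl
use strict;
use warnings;

# Hypothetical input: two links, only the second has class="ext".
my $html = '<a href="a.html">A</a> <a href="b.html" class="ext">B</a>';

# Intent: capture the href of the link that has class="ext".
# ".*?" is lazy, but the match still begins at the FIRST '<a href="',
# and the dot is allowed to cross quotes and tags until the rest fits.
my ($lazy) = $html =~ /<a href="(.*?)" class="ext">/;

print "$lazy\n";    # prints: a.html">A</a> <a href="b.html
```

Non-greedy only means "shortest match from this starting point", not "shortest match overall".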
Ovid suggests a negated character class. You could also use an available HTML parser, or in the beginning just strip out all the bad stuff first (you need to know you are not stripping good data by accident). You could also inch through the data using pos to parse a bit at a time.
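A minimal sketch of two of those options, using a made-up input string: first a negated character class for an attribute value, then inching through the data with pos and the \G anchor. The token labels are my own invention, and "[^>]*" assumes no ">" ever appears inside a quoted attribute value, which real-world HTML does not guarantee (one more reason to consider a real parser).

```perl
use strict;
use warnings;

# Hypothetical input: two links, only the second has class="ext".
my $html = '<a href="a.html">A</a> <a href="b.html" class="ext">B</a>';

# Negated character class: [^"]* cannot run past the closing quote,
# so the capture stays inside a single attribute value.
my ($href) = $html =~ /<a href="([^"]*)" class="ext">/;
print "href: $href\n";    # prints "href: b.html"

# Inching through the data: \G anchors each match at the current pos(),
# and /c keeps pos() where it was when an alternative fails.
my @tokens;
pos($html) = 0;
while (pos($html) < length $html) {
    if ($html =~ /\G(<[^>]*>)/gc) {       # a tag: "<", non-">" chars, ">"
        push @tokens, "TAG $1";
    }
    elsif ($html =~ /\G([^<]+)/gc) {      # plain text up to the next "<"
        push @tokens, "TEXT $1";
    }
    else {
        die "Cannot parse at offset " . pos($html) . "\n";
    }
}
print "$_\n" for @tokens;
```

Because every step consumes a known piece and dies on anything unexpected, you find out exactly where the data stops looking like what you assumed.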
In reply to "Re: html tag matching confusion" by mattr, in thread "html tag matching confusion" by moonlord