Does your data always look like pseudo-HTML, i.e. is it always tagged? If so, you may get better results with HTML::TokeParser.
Here's some untested code:
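    use HTML::TokeParser;

    # new() takes a filename, a filehandle, or a reference to a string;
    # pass \$html instead if $html holds the document text itself.
    my $p = HTML::TokeParser->new($html) || die "Can't tokenize: $!";

    # get each <tag1> alone.
    while (my $token = $p->get_tag('tag1')) {
        # store the original text of the <tag1> start tag
        my $origtext = $token->[3];

        # get data between <MYTAG></MYTAG>
        # (the parser reports tag names in lowercase, so ask for 'mytag')
        my $myTag = $p->get_tag('mytag');
        my $text  = $p->get_text('/mytag');

        if ($text ne '') {
            # tag is not empty.. so $origtext retains
            # the data we want..
            # ... do whatever with $origtext and move on
        }
    }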
In reply to Re: Re: mutiple-line regexes?
by Chady
in thread mutiple-line regexes?
by jens