in reply to Trouble capturing multiple groupings in regex

Another approach, with some attempt to make parsing more tolerant of variations in format:

c:\@Work\Perl>perl -wMstrict -le
"use Data::Dump qw(dd);
 ;;
 my $start = qr{ > \s* }xms;
 my $more  = qr{ \G (?<! \A) }xms;
 my $post  = qr{ \s* <? }xms;
 ;;
 for my $s (
   '<span c=\"#12\" foo=\"%DoNotMatch%\" bozz=\"false\">%PN1%</span>',
   '<span c=\"#98\" bar=\"false\" zot=\"%NoNoNo%\"> %DIMMM% %DIMINCH% </span>',
   ) {
   print qq{'$s'};
   my @matches = $s =~ m{ (?: $more | $start) % ([^%]+) % $post }xmsg;
   dd \@matches;
   }
"
'<span c="#12" foo="%DoNotMatch%" bozz="false">%PN1%</span>'
["PN1"]
'<span c="#98" bar="false" zot="%NoNoNo%"> %DIMMM% %DIMINCH% </span>'
["DIMMM", "DIMINCH"]

Please see perlre, perlretut, and perlrequick. Caveat: any "pure regex" approach to parsing XML is fragile, probably very fragile.
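
For anyone who would rather run this from a file than wrestle with cmd.exe quoting, here is a rough sketch of the same regex as a script, with each piece commented. The sub name extract_placeholders is just something I made up for illustration, and I've swapped Data::Dump for plain print so only core Perl is needed. The same fragility caveat applies.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch); the three regex
# pieces are the ones from the one-liner above.
sub extract_placeholders {
    my ($s) = @_;

    # The '>' that closes an opening tag, plus optional whitespace.
    my $start = qr{ > \s* }xms;
    # Resume exactly where the previous /g match ended (\G), but never
    # at the very start of the string, where \G would also match.
    my $more  = qr{ \G (?<! \A) }xms;
    # Optional trailing whitespace and, optionally, the '<' of the next tag.
    my $post  = qr{ \s* <? }xms;

    # In list context, m//g returns every capture from every match.
    return $s =~ m{ (?: $more | $start ) % ([^%]+) % $post }xmsg;
}

for my $s (
    '<span c="#12" foo="%DoNotMatch%" bozz="false">%PN1%</span>',
    '<span c="#98" bar="false" zot="%NoNoNo%"> %DIMMM% %DIMINCH% </span>',
) {
    my @matches = extract_placeholders($s);
    print "'", $s, "'\n";
    print "  => @matches\n";
}
```

The point of the $more alternative is that a second or third %...% can only match if it begins right where the previous match stopped, so the chain has to start at a '>' via $start. That is what keeps the %...% values inside attributes from matching: they follow a double quote, not a '>'.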


Give a man a fish:  <%-{-{-{-<