in reply to Re: Trouble capturing multiple groupings in regex
in thread Trouble capturing multiple groupings in regex

I've updated my post. I should have included examples where the %variable% tag appears in places where it must be skipped, which is why the bounding > and < in my original matching pattern matter. I have a main bounding area delimited by the > and < chars, and within it a set of fields bounded by %'s.

Re^3: Trouble capturing multiple groupings in regex
by Corion (Patriarch) on Dec 09, 2015 at 15:14 UTC

    After thinking a bit more about this, the following approach using look-around assertions works:

    use warnings;
    use strict;

    while (<DATA>) {
        my @matches;
        @matches = (/(?<=[%>])%([^%]+)%(?=[%<])/g);
        print join ' ', @matches;
        print "\n";
    }
    __DATA__
    <span color="#231f20" someattr="%do_not_match%" textOverprint="false">%PN1%</span>
    <span color="#231f20" someattr="%do_not_match%" textOverprint="false">%DIMMM%%DIMINCH%</span>
    __END__
    PN1
    DIMMM DIMINCH
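For anyone wanting to see what each piece of that pattern does, here is a rough annotated sketch of the same regex written with the /x modifier. The variable names ($field, $line) and the shortened sample line are made up for illustration. The key point is that look-behind and look-ahead only check the neighbouring character without consuming it, which is why two adjacent fields such as %DIMMM%%DIMINCH% can both match.

```perl
use strict;
use warnings;

# Same pattern as in the post above, spread out with /x so each part
# can carry a comment. $field and $line are illustrative names only.
my $field = qr{
    (?<=[%>])   # look-behind: previous character must be a percent sign or >
    %           # opening delimiter of the field
    ([^%]+)     # capture the field name: one or more non-percent characters
    %           # closing delimiter of the field
    (?=[%<])    # look-ahead: next character must be a percent sign or <
}x;

# Shortened, hypothetical sample line with one attribute that must be skipped
my $line = '<span someattr="%do_not_match%">%DIMMM%%DIMINCH%</span>';

# In list context, /g returns every captured field name
my @matches = $line =~ /$field/g;
print "@matches\n";   # prints: DIMMM DIMINCH
```

The attribute value %do_not_match% is rejected because the character before its opening % is a double quote, not % or >.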

      That's it! Fantastic! Thanks so much. I'm going to finish up this report, and then figure out what exactly you did so I can understand this, but that worked absolutely perfectly. Here's hoping I don't come up with another strange scenario when I run this on 250 templates instead of 5. :)

        Here's hoping I don't come up with another strange scenario ...

        As has been noted many times in these precincts, attempting to parse XML with regexes is likely to lead you to the Hell of Exceptions. Maybe better to parse out the XML text bodies with an honest XML parser, then operate with regexes on the %whatever% thingies therein?
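A minimal sketch of that two-step idea, assuming the XML::LibXML module from CPAN is installed (it is not part of core Perl) and that the templates are well-formed XML. The <root> wrapper element and the sample data are made up for illustration; real templates will look different.

```perl
use strict;
use warnings;
use XML::LibXML;   # CPAN module, assumed to be installed

# Hypothetical template fragment; <root> is an invented wrapper element
my $xml = <<'XML';
<root>
  <span color="#231f20" someattr="%do_not_match%" textOverprint="false">%PN1%</span>
  <span color="#231f20" someattr="%do_not_match%" textOverprint="false">%DIMMM%%DIMINCH%</span>
</root>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

my @fields;
# The XPath //text() selects text nodes only, so attribute values such as
# someattr="%do_not_match%" are never examined.
for my $text ($doc->findnodes('//text()')) {
    # With the markup gone, a plain %name% pattern is enough
    push @fields, $text->textContent =~ /%([^%]+)%/g;
}

print "@fields\n";   # prints: PN1 DIMMM DIMINCH
```

Because the parser separates element text from attributes, the regex no longer needs the > and < look-arounds to decide where a field is allowed to appear.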


        Give a man a fish:  <%-{-{-{-<