in reply to Re^4: Negating a regexp
in thread Negating a regexp

That looks like XML. I'd seriously consider using XML::Twig!

If you need some help, show us some representative data and what you actually want to extract.


DWIM is Perl's answer to Gödel

Re^6: Negating a regexp
by jxh (Acolyte) on Feb 01, 2006 at 10:42 UTC
    It is, after a fashion: impossibly malformed XML. The data now given above (and the specified output) is now fully representative. This tool is effectively a parser that takes some poor output and turns it into poor, but well-formed, XML. Although I'm keen to continue with this approach, does the solution lie in a different approach?

      Actually that is a lot simpler to solve assuming one "element" per line:

      use warnings;
      use strict;

      while (my $line = <DATA>) {
          # Assume one "element" per line
          if ($line =~ /<(\w+)>(.*?)<\/\1>/) {
              # Copy the element body, then turn stray angle brackets into square ones
              (my $midstr = $2) =~ tr/<>/[]/;
              $line = "<$1>$midstr</$1>\n";
          }
          print $line;
      }

      __DATA__
      <sometag> my data here </sometag>
      <anothertag> further text </anothertag>
      <furthertag> not good as contains <this> in angle brackets </furthertag>
      <byetag> another possibility is <more> <than> one <angle pairs> in here </byetag>

      Prints:

      <sometag> my data here </sometag>
      <anothertag> further text </anothertag>
      <furthertag> not good as contains [this] in angle brackets </furthertag>
      <byetag> another possibility is [more] [than] one [angle pairs] in here </byetag>

        Typically it can cover more than one line; however, I can easily modify your example myself to accommodate this. As for achieving this with a single substitution command, I think this isn't the approach to take; a little code as per your example is the way to handle it. Many thanks for the options explored throughout.
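
For anyone landing here later, one possible way to handle elements that span several lines is sketched below. It is not the code from the reply above. It assumes the whole input fits in memory as one string, and it swaps the line-by-line loop for a substitution whose replacement is a small piece of code (the /e flag). The helper name fix_brackets and the sample data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): rewrite stray angle brackets
# inside each <tag>...</tag> pair, where a pair may span several lines.
sub fix_brackets {
    my ($text) = @_;

    # /s lets . match newlines, so one element can cover multiple lines;
    # /g repeats for every element; /e runs the replacement as Perl code.
    $text =~ s{<(\w+)>(.*?)</\1>}{
        my ($tag, $body) = ($1, $2);
        $body =~ tr/<>/[]/;          # stray < and > become [ and ]
        "<$tag>$body</$tag>";
    }gse;

    return $text;
}

# Illustrative input, adapted from the sample data in the thread
my $input = <<'END';
<sometag> my data here </sometag>
<furthertag> not good as contains <this>
in angle brackets </furthertag>
<byetag> another <more> <than> one
<angle pairs> in here </byetag>
END

print fix_brackets($input);
```

Like the original example, this relies on the closing tag matching the opening tag's name, so it will misbehave if a stray bracketed word inside an element repeats the element's own name. Text outside any matched element passes through unchanged.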