Re^4: Negating a regexp

Thanks to everyone for the working solutions provided. To further refine the OP, which I had simplified for the sake of clarity, consider:


#!/usr/bin/perl -w

use strict;

while ( <DATA> ) {


        s/ angle brackets between element pairs / instead make square bracketed /


        print $_ . "\n";


}

__DATA__

<sometag> my data here </sometag>
<anothertag> further text </anothertag>
<furthertag> not good as contains <this> in angle brackets </furthertag>
<byetag> another possibility is <more> <than> one <angle pairs> in here </byetag>

As I can no longer specify (more|less) as fixed patterns (as per ikegami's hunch), I'm again struggling to apply what was demonstrated to my actual data. Required output:

<sometag> my data here </sometag>
<anothertag> further text </anothertag>
<furthertag> not good as contains [this] in angle brackets </furthertag>
<byetag> another possibility is [more] [than] one [angle pairs] in here </byetag>

Re^5: Negating a regexp by GrandFather (Saint) on Feb 01, 2006 at 10:34 UTC
That looks like XML. I'd seriously consider using XML::Twig! If you need some help, show us some representative data and what you actually want to extract.

DWIM is Perl's answer to Gödel
Re^6: Negating a regexp by jxh (Acolyte) on Feb 01, 2006 at 10:42 UTC
It is, after a fashion: impossibly malformed XML. The data given above (and the specified output) is now fully representative. This tool is effectively a parser that takes some poor output and turns it into poor, but well-formed, XML. Although I'm keen to continue with this approach, does the solution lie in a different approach?
Re^7: Negating a regexp by GrandFather (Saint) on Feb 01, 2006 at 10:57 UTC
Actually that is a lot simpler to solve assuming one "element" per line:

use warnings;
use strict;

while (my $line = <DATA>) {
    # Assume one "element" per line
    if ($line =~ /<(\w+)>(.*?)<\/\1>/) {
        (my $midstr = $2) =~ tr/<>/[]/;
        $line = "<$1>$midstr</$1>\n";
    }
    print $line;
}

__DATA__
<sometag> my data here </sometag>
<anothertag> further text </anothertag>
<furthertag> not good as contains <this> in angle brackets </furthertag>
<byetag> another possibility is <more> <than> one <angle pairs> in here </byetag>

Prints:

<sometag> my data here </sometag>
<anothertag> further text </anothertag>
<furthertag> not good as contains [this] in angle brackets </furthertag>
<byetag> another possibility is [more] [than] one [angle pairs] in here </byetag>

DWIM is Perl's answer to Gödel
Re^8: Negating a regexp by jxh (Acolyte) on Feb 01, 2006 at 11:04 UTC
Re^5: Negating a regexp by BrowserUk (Patriarch) on Feb 01, 2006 at 11:16 UTC
This violates your "no code" premise, but if your tags are all on one line, this seems reasonably robust and should be fairly efficient.

#! perl -sw
use strict;

while( <DATA> ) {
    s[^(<([^>]+?)>)(.*)(</\2>)]{
        (my $x = $3) =~ tr[<>][[]];
        "$1$x$4";
    }e;
    print;
}

__DATA__
<sometag> my data here </sometag>
<anothertag> further text </anothertag>
<furthertag> not good as contains <this> in angle brackets </furthertag>
<byetag> another possibility is <more> <than> one <angle pairs> in here </byetag>
<a tag with spaces> and content containing a false </a tag with spaces> and a real </a tag with spaces>

Produces:

C:\Perl\test>junk2
<sometag> my data here </sometag>
<anothertag> further text </anothertag>
<furthertag> not good as contains [this] in angle brackets </furthertag>
<byetag> another possibility is [more] [than] one [angle pairs] in here </byetag>
<a tag with spaces> and content containing a false [/a tag with spaces] and a real </a tag with spaces>

Of course, it doesn't attempt to deal with attributes, or multi-line elements or nested tags or any of that good stuff.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.
Re^6: Negating a regexp by jxh (Acolyte) on Feb 02, 2006 at 11:09 UTC
Thanks. Considering all the responses, I think attempting to do this +without+ a small amount of code, i.e. as a single 'replace' expression, is not the best approach. Thanks for this solution.
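
For anyone landing here later, the "small amount of code" the replies converge on can be wrapped in a sub so it is easy to try against sample lines. This is only a sketch under the same assumptions as the replies above (one element per line, no attributes, no nesting). The name bracket_inner is hypothetical and not from the thread. It combines the s///e form with tr on the captured content:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn angle brackets that sit
# between a matching <tag> ... </tag> pair on a single line into square
# brackets, leaving the outer tags alone.
sub bracket_inner {
    my ($line) = @_;
    # $1 = opening tag, $2 = tag name, $3 = content, $4 = closing tag.
    # The greedy (.*) means the last matching close tag on the line wins.
    $line =~ s{^(<([^>]+)>)(.*)(</\2>)}{
        my ($open, $body, $close) = ($1, $3, $4);
        $body =~ tr/<>/[]/;
        $open . $body . $close;
    }e;
    return $line;
}

print bracket_inner($_), "\n" for (
    '<furthertag> not good as contains <this> in angle brackets </furthertag>',
    '<byetag> another possibility is <more> <than> one <angle pairs> in here </byetag>',
);
```

A line with no matching close tag is returned unchanged, which seems the safer default for malformed input like this.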