PerlMonks  

Re: problem with removing something in XML file

by Sandy (Curate)
on Sep 18, 2009 at 16:47 UTC


in reply to problem with removing something in XML file

Normally, one should try the advice already given before asking for more answers, but nonetheless...

I don't know why your regular expression is so complicated.

Assuming that every <REF ...> tag is always complete on a single line (the one-liner works through the file line by line)...

XML File Before

</S></TEXT><TEXT><S Entail="142" s_id="0"> Annan urges return to democracy in <REF C-ENTID="Nepal" EXT="Nepal" ID="104" S&#1058;YPE="PROPNAME">Nepal</REF></S> <S Entail="138-139-142" s_id="1"> UN Secretary General Kofi Annan on Tuesday expressed deep concern over events in <REF A-CLASS="No-Reference" A-REFTYPE="Entity" C-ENTID="Nepal" EXT="Nepal" ID="105" S&#1058;YPE="PROPNAME">Nepal</REF> and urged a return to democracy, after <REF C-ENTID="King Gyanendra Bir Bikram" COMMENT="Coref direction is forward" EXT="King Gyanendra Bir Bikram" ID="100" S&#1058;YPE="APNAME"> King Gyanendra Bir Bikram</REF> dismissed <REF A-CLASS="Entity-Entity" A-DIR="Backward" A-RELTYPE="Identity" A-RESTYPE="Intra" A-TYPE="Referential" ANT-ID="105" ID="101"> the country</REF> 's coalition government and imposed an indefinite state of emergency. </S><S Entail="138-139-143" s_id="2">
perl one-liner (on DOS)
perl -pibak -e "s/<\/?REF.*?>//ig" junk.txt
Result:
</S></TEXT><TEXT><S Entail="142" s_id="0"> Annan urges return to democracy in Nepal</S> <S Entail="138-139-142" s_id="1"> UN Secretary General Kofi Annan on Tuesday expressed deep concern over events in Nepal and urged a return to democracy, after King Gyanendra Bir Bikram dismissed the country 's coalition government and imposed an indefinite state of emergency. </S><S Entail="138-139-143" s_id="2">
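
For anyone who wants to try the substitution without touching a file, here is a minimal sketch of the same regular expression in a small script. The sub name strip_ref_tags and the shortened sample line are made up for illustration; only the s/// itself comes from the one-liner.

```perl
use strict;
use warnings;

# Same substitution as the one-liner: remove opening and closing REF tags
# and keep the text between them.
sub strip_ref_tags {
    my ($line) = @_;
    $line =~ s/<\/?REF.*?>//ig;
    return $line;
}

# Hypothetical sample, shortened from the data in the post.
my $sample = 'democracy in <REF C-ENTID="Nepal" EXT="Nepal" ID="104">Nepal</REF></S>';

print strip_ref_tags($sample), "\n";
```

A side note on the switches: with -pibak the backup extension is "bak" with no dot, so the untouched copy of junk.txt is saved as junk.txtbak. Writing -pi.bak instead would give junk.txt.bak.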
Sandy

UPDATE: This also assumes that there is no embedded ">" inside a REF tag.
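
To make that caveat concrete, here is a rough sketch with a made-up attribute value that contains a ">". The second pattern is just one possible variation that skips over double-quoted attribute values. It is an assumption for illustration, not something tested against the real data, and a proper XML parser is still the safer route for anything less regular.

```perl
use strict;
use warnings;

# Hypothetical input: the COMMENT attribute value contains a ">".
my $tricky = 'after <REF COMMENT="a > b" ID="100">King</REF> dismissed';

# Pattern from the one-liner: .*? stops at the first ">", which here sits
# inside the attribute value, so part of the tag is left behind.
(my $naive = $tricky) =~ s/<\/?REF.*?>//ig;

# Variation (an assumption, not from the post): treat "..." as a unit so a
# ">" inside double quotes does not end the match.
(my $quoted = $tricky) =~ s/<\/?REF\b(?:[^>"]|"[^"]*")*>//ig;

print "naive:  $naive\n";
print "quoted: $quoted\n";   # quoted: after King dismissed
```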
