I tried using a regular expression with sed to get rid of all <REF .......> elements in input like this:

</S></TEXT><TEXT><S Entail="142" s_id="0"> Annan urges return to democracy in <REF C-ENTID="Nepal" EXT="Nepal" ID="104" SТYPE="PROPNAME">Nepal</REF></S><S Entail="138-139-142" s_id="1"> UN Secretary General Kofi Annan on Tuesday expressed deep concern over events in <REF A-CLASS="No-Reference" A-REFTYPE="Entity" C-ENTID="Nepal" EXT="Nepal" ID="105" SТYPE="PROPNAME">Nepal</REF> and urged a return to democracy, after <REF C-ENTID="King Gyanendra Bir Bikram" COMMENT="Coref direction is forward" EXT="King Gyanendra Bir Bikram" ID="100" SТYPE="APNAME">King Gyanendra Bir Bikram</REF> dismissed <REF A-CLASS="Entity-Entity" A-DIR="Backward" A-RELTYPE="Identity" A-RESTYPE="Intra" A-TYPE="Referential" ANT-ID="105" ID="101">the country</REF> 's coalition government and imposed an indefinite state of emergency. </S><S Entail="138-139-143" s_id="2">

so my desired output would look like this:

</S></TEXT><TEXT><S Entail="142" s_id="0"> Annan urges return to democracy in Nepal</REF></S><S Entail="138-139-142" s_id="1"> UN Secretary General Kofi Annan on Tuesday expressed deep concern over events in Nepal</REF> and urged a return to democracy, after King Gyanendra Bir Bikram</REF> dismissed the country</REF> 's coalition government and imposed an indefinite state of emergency. </S><S Entail="138-139-143" s_id="2">

I had a sed line like this, which does not work well :(

sed -r 's/<REF ([A-Za-z]*[-]{0,1}[A-Za-z]*=["].[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*)*>//g' input.xml

Any idea how I can do it with Perl? I would really appreciate it :) Thanks
In reply to problem with removing something in XML file by Anonymous Monk