Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
I have an XML file that looks like this:

    </S></TEXT><TEXT><S Entail="142" s_id="0"> Annan urges return to democracy in <REF C-ENTID="Nepal" EXT="Nepal" ID="104" SТYPE="PROPNAME">Nepal</REF></S><S Entail="138-139-142" s_id="1"> UN Secretary General Kofi Annan on Tuesday expressed deep concern over events in <REF A-CLASS="No-Reference" A-REFTYPE="Entity" C-ENTID="Nepal" EXT="Nepal" ID="105" SТYPE="PROPNAME">Nepal</REF> and urged a return to democracy, after <REF C-ENTID="King Gyanendra Bir Bikram" COMMENT="Coref direction is forward" EXT="King Gyanendra Bir Bikram" ID="100" SТYPE="APNAME">King Gyanendra Bir Bikram</REF> dismissed <REF A-CLASS="Entity-Entity" A-DIR="Backward" A-RELTYPE="Identity" A-RESTYPE="Intra" A-TYPE="Referential" ANT-ID="105" ID="101">the country</REF> 's coalition government and imposed an indefinite state of emergency. </S><S Entail="138-139-143" s_id="2">

I tried using a regular expression with sed to get rid of all the <REF ...> tags, so that my desired output would look like this:

    </S></TEXT><TEXT><S Entail="142" s_id="0"> Annan urges return to democracy in Nepal</REF></S><S Entail="138-139-142" s_id="1"> UN Secretary General Kofi Annan on Tuesday expressed deep concern over events in Nepal</REF> and urged a return to democracy, after King Gyanendra Bir Bikram</REF> dismissed the country</REF> 's coalition government and imposed an indefinite state of emergency. </S><S Entail="138-139-143" s_id="2">

I had a sed line like this, which does not work well :(

    sed -r 's/<REF ([A-Za-z]*[-]{0,1}[A-Za-z]*=["].[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*.{0,1}[A-Za-z0-9-]*)*>//g' input.xml

Any idea how I can do it with Perl? I would really appreciate it :) Thanks
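A minimal sketch of the substitution described in the question, not a definitive fix. It assumes that every opening <REF ...> tag sits on a single line and that no attribute value contains a ">" character. Under those assumptions there is no need to spell out the attribute syntax: matching "<REF", one whitespace character, and then everything up to the next ">" is enough. The function name strip_ref_open_tags is hypothetical, and the sample string is one fragment of the input shown in the question.

```shell
# Hypothetical helper: delete every opening <REF ...> tag from stdin.
# The text inside the element and the closing </REF> are left untouched,
# which matches the desired output shown in the question.
strip_ref_open_tags() {
    # [^>]* consumes all attributes up to the first '>' that closes the tag.
    sed -E 's/<REF[[:space:]][^>]*>//g'
}

# One fragment of the sample input from the question.
sample='dismissed <REF A-CLASS="Entity-Entity" A-DIR="Backward" A-RELTYPE="Identity" A-RESTYPE="Intra" A-TYPE="Referential" ANT-ID="105" ID="101">the country</REF>'

result=$(printf '%s\n' "$sample" | strip_ref_open_tags)
printf '%s\n' "$result"
```

The same pattern should carry over to a Perl one-liner such as perl -pe 's/<REF\s[^>]*>//g' input.xml. If tags can span several lines, or if the closing </REF> should go as well, an XML parser is likely a safer choice than a regular expression.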
Re: problem with removing something in XML file
by marto (Cardinal) on Sep 18, 2009 at 14:28 UTC
by Anonymous Monk on Sep 18, 2009 at 14:32 UTC
by marto (Cardinal) on Sep 18, 2009 at 14:42 UTC

Re: problem with removing something in XML file
by Sandy (Curate) on Sep 18, 2009 at 16:47 UTC

Re: problem with removing something in XML file
by graff (Chancellor) on Sep 19, 2009 at 02:01 UTC

Re: problem with removing something in XML file
by mirod (Canon) on Sep 19, 2009 at 11:14 UTC