in reply to Search and Replace in XML

XML::Twig should be suitable for your problem. It will be well worth your time to work through the examples in the tutorial.

In the future, your question would be easier to understand if you posted your XML inside 'code' tags. See Writeup Formatting Tips.

Re^2: Search and Replace in XML
by Anonymous Monk on Sep 14, 2009 at 16:36 UTC
    But given something like this:
    <DOC> <S Sid='1'> <REF ID='1' TYPE='ANAPHOR' EXT='NAME'>he</REF>is hardworking guy. </S> <S Sid='2'> <REF ID='2' TYPE='ANAPHOR' EXT='FAMILY'>he</REF> </S> </DOC>
    how can I replace the text of each REF element with the value of its EXT attribute using XML::Twig, and write out the same XML with only those replacements, for all documents? Thanks again for your help.
      Set up a Twig handler to find all the REF elements. Parse the XML file. In the handler routine, get the text of the EXT attribute. Set the text of the REF element to be the text of the EXT attribute. Print out the modified XML.
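The steps above can be sketched roughly as follows. This is a minimal sketch, not a tested solution for your files: the sub name `replace_ref_text` and the inline sample document are made up for illustration, and it parses a string rather than a file. REF elements without an EXT attribute are left untouched.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: returns the XML with the text of each REF
# replaced by the value of its EXT attribute, when it has one.
sub replace_ref_text {
    my ($xml) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            REF => sub {
                my ($t, $ref) = @_;
                my $ext = $ref->att('EXT');    # undef when there is no EXT
                $ref->set_text($ext) if defined $ext;
            },
        },
    );
    $twig->parse($xml);     # use parsefile($path) for a file on disk
    return $twig->sprint;   # the modified document as a string
}

# Made-up sample, modelled on the snippet in the question.
my $sample = q{<DOC><S Sid="1"><REF ID="1" EXT="NAME">he</REF> is a hardworking guy.</S>}
           . q{<S Sid="2"><REF ID="2">he</REF></S></DOC>};
print replace_ref_text($sample), "\n";
```

For a real file you would call `parsefile` instead of `parse` and either print the result or write it to a new file.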

      After you have written some code, if you are still having trouble, post your code along with the expected output.

        Thanks for your reply. I have written some code, but I am having trouble replacing the text of the REF element with the value of its EXT attribute. This is the original file:

        <DOC id="AFP_ENG_20050316.0102" type="story">
        <HEADLINE>
        Bobby Fischer can escape US if Iceland makes him citizen: Japanese lawmaker by Hiroshi Hiyama ATTENTION - ADDS quotes from immigration official, details ///
        </HEADLINE>
        <DATELINE>
        TOKYO, March 16
        </DATELINE>
        <TEXT>
        <S Entail="28" s_id="0">
        <REF YPE="PROPNAME">Bobby Fischer</REF>can escape US if Iceland makes <REF ANT-ID="100" EXT="Bobby Fischer" ID="101">him</REF> citizen: Japanese lawmaker by Hiroshi Hiyama ATTENTION - ADDS quotes from immigration official, details ///
        </S><S Entail="28-31" s_id="1">
        Chess legend <REF ID="102" YPE="PROPNAME">Bobby Fischer</REF>, who faces prison if <REF ANT-ID="102" EXT="Bobby Fischer" ID="103" YPE="PRON">he</REF> returns to the United States, can only avoid deportation from Japan if <REF ID="104" YPE="PROPNAME">Iceland</REF> upgrades <REF ANT-ID="104" EXT="Iceland's" ID="105">its</REF> granting of residency to full citizenship, a Japanese lawmaker said Wednesday.
        </S>
        </TEXT>
        </DOC>

        My expected output is shown below. Some of the REF tags do not have an EXT attribute; in those cases there is no replacement.

        <DOC id="AFP_ENG_20050316.0102" type="story">
        <HEADLINE>
        Bobby Fischer can escape US if Iceland makes him citizen: Japanese lawmaker by Hiroshi Hiyama ATTENTION - ADDS quotes from immigration official, details ///
        </HEADLINE>
        <DATELINE>
        TOKYO, March 16
        </DATELINE>
        <TEXT>
        <S Entail="28" s_id="0">
        <REF ID="100" YPE="PROPNAME">Bobby Fischer</REF>can escape US if Iceland makes <REF ANT-ID="100" EXT="Bobby Fischer" ID="101">Bobby Fischer</REF> citizen: Japanese lawmaker by Hiroshi Hiyama ATTENTION - ADDS quotes from immigration official, details ///
        </S><S Entail="28-31" s_id="1">
        Chess legend <REF ID="102" YPE="PROPNAME">Bobby Fischer</REF>, who faces prison if <REF ANT-ID="102" EXT="Bobby Fischer" ID="103" YPE="PRON">Bobby Fischer</REF> returns to the United States, can only avoid deportation from Japan if <REF ID="104" >Iceland</REF> upgrades <REF ANT-ID="104" EXT="Iceland's" ID="105">Iceland's</REF> granting of residency to full citizenship, a Japanese lawmaker said Wednesday.
        </S>
        </TEXT>
        </DOC>

        My code is:

        #!/bin/perl -w
        use strict;
        use XML::Twig;

        my $twig=new XML::Twig(twig_roots => {'TEXT' =>1} ,twig_handlers=>{'REF' => \&REF});
        # change the address of root to TEXT element because REF is the children of TEXT
        my $field='REF';
        $twig->parsefile("AFP_ENG_20050316.0102.xml"); #Parse the file
        my $root=$twig->root;
        my $rootchild=$root->children;

        sub REF {
            my ($twig ,$field)=@_;
            my $extension = $field->text;#keep the value of REF element
            my $att=$field->att('EXT'); #keep the value of Ext ( some REF elements do not have EXT attribute)
            print "$extension\n";
            $att->set_text($field);
            $extension=$att;
        }
        $twig->print;

        I would also like to know: is there a function in XML::Twig to remove the REF elements after this replacement while keeping their text in the sentence? Thanks a lot.
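One possible approach, offered as a sketch rather than a tested fix for your files. In the posted handler, `$att` holds the attribute's string value, so `$att->set_text($field)` calls a method on a plain string; `set_text` belongs on the element, with the attribute value as its argument. For dropping the tags afterwards, XML::Twig elements have an `erase` method, which removes the element and leaves its children (here, the text) in its place. The sub name `inline_refs` and the sample string below are made up for illustration. The sketch also leaves out `twig_roots`, on the assumption that you want HEADLINE and DATELINE to survive in the output.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: substitute the EXT value where present,
# then remove the REF tags while keeping their text in the sentence.
sub inline_refs {
    my ($xml) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => {
            REF => sub {
                my ($t, $ref) = @_;
                my $ext = $ref->att('EXT');
                # set_text is called on the element, not on the attribute value
                $ref->set_text($ext) if defined $ext;
                $ref->erase;    # drop the tags, keep the text
            },
        },
    );
    $twig->parse($xml);
    return $twig->sprint;
}

# Made-up sample, shortened from the sentence in the question.
my $sample = q{<TEXT><S s_id="1">Chess legend <REF ID="102">Bobby Fischer</REF>, }
           . q{who faces prison if <REF ANT-ID="102" EXT="Bobby Fischer" ID="103">he</REF> }
           . q{returns to the United States.</S></TEXT>};
print inline_refs($sample), "\n";
```

If you only want the replacement and wish to keep the REF tags, as in your expected file, leave out the `erase` call.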