Selvakumar has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,
I need to convert one XML file into another XML format. This involves much more than renaming tags: I need to read attributes and run calculations on them, pull details out of comments and insert them into attributes, convert CMYK values to RGB values for an attribute, and do many similar calculations.
I would appreciate your suggestions on how to approach this conversion. I first thought of using XSLT, but because so much calculation and scripted processing is required, I am looking for other options.
The file to convert is very large, so I am unable to upload it here.
For example:

<div class="P_FM_EPG_INTRO_FIRST" pstyID="u15d" cstyID="u62">

should be converted to
<paragraph class="PFMEPGINTROFIRST" style="FM_EPG_INTRO_FIRST">
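
For the CMYK-to-RGB step mentioned in the question, a minimal sketch is shown here. It assumes the CMYK components are percentages from 0 to 100 and uses the naive, profile-free formula; the thread does not say how the colour values are stored, so the sub name `cmyk_to_rgb` and the input format are assumptions made for illustration only.

```perl
use strict;
use warnings;

# Naive CMYK -> RGB conversion without colour profiles.
# Assumption: each CMYK component is a percentage in the range 0..100.
sub cmyk_to_rgb {
    my ($c, $m, $y, $k) = map { $_ / 100 } @_;
    # Scale each channel to 0..255 and round to the nearest integer.
    return map { int(255 * (1 - $_) * (1 - $k) + 0.5) } ($c, $m, $y);
}

# Full magenta plus full yellow, no cyan or black.
my @rgb = cmyk_to_rgb(0, 100, 100, 0);
printf "#%02X%02X%02X\n", @rgb;    # prints "#FF0000"
```

If the source document stores colours differently (for example as fractions between 0 and 1), only the scaling in the first line of the sub would need to change. Print-accurate results would need ICC colour profiles, which this sketch deliberately ignores.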

Replies are listed 'Best First'.
Re: xml to xml with script process
by Sewi (Friar) on Oct 05, 2009 at 07:01 UTC
    There are many XML modules on CPAN.

    It might be even easier if you separate this into three steps:

  • First read the XML into a hash (for example via http://search.cpan.org/~msergeant/XML-Parser-2.36/Parser.pm)
  • Walk through the hash; it is a native Perl data structure, so it should be easy to convert or to build a new hash from.
  • Now write your new hash to disk. I have never done this using a module, but there should be some on CPAN.
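
The three steps above could be sketched roughly as follows. This uses XML::Parser's 'Tree' style, which returns nested array references (with a hash only for the attributes) rather than a plain hash, so it is a variation on the suggestion, not exactly what was described. The subs `esc`, `convert` and `to_xml` are hypothetical helpers written for this example, and the conversion rule is taken from the single `div` example in the question.

```perl
use strict;
use warnings;
use XML::Parser;

# Escape the characters that are special in XML text and attribute values.
sub esc {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Step 2: the conversion rule for one element (hypothetical, based on the
# example in the question). Other elements pass through unchanged.
sub convert {
    my ($tag, $atts) = @_;
    return ($tag, $atts) unless $tag eq 'div' && defined $atts->{class};
    (my $style = $atts->{class}) =~ s/^P_//;    # P_FM_EPG -> FM_EPG
    (my $class = "P$style") =~ s/_//g;          # PFM_EPG  -> PFMEPG
    return ('paragraph', { class => $class, style => $style });
}

# Step 3: serialize one node of the tree.
# An element is ($tag, [ \%attributes, child_tag, child_content, ... ]);
# a text node is (0, $string).
sub to_xml {
    my ($tag, $content) = @_;
    return esc($content) if $tag eq '0';

    my ($atts, @kids) = @$content;
    ($tag, $atts) = convert($tag, $atts);
    my $out = "<$tag"
        . join('', map { qq{ $_="} . esc($atts->{$_}) . '"' } sort keys %$atts)
        . '>';
    while (@kids) {
        my ($child_tag, $child_content) = splice @kids, 0, 2;
        $out .= to_xml($child_tag, $child_content);
    }
    return "$out</$tag>";
}

# Step 1: read the XML into a Perl data structure.
my $tree = XML::Parser->new(Style => 'Tree')->parse(
    '<doc><div class="P_FM_EPG_INTRO_FIRST" pstyID="u15d" cstyID="u62">Hello &amp; welcome</div></doc>'
);
print to_xml(@$tree), "\n";
```

Two caveats: the whole document is held in memory, which may matter for a very large file, and the 'Tree' style does not keep comments, so the comment-derived attributes from the question would need a different parser setup.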
Re: xml to xml with script process
by mickep76 (Beadle) on Oct 05, 2009 at 07:28 UTC

    Hi

    I would recommend XML::Twig; it is quite easy to use and very flexible. I did some similar work and looked at several CPAN modules, and in the end XML::Twig seemed to be one of the easiest and most flexible solutions out there.

      Using XML::Twig you could do something like this:

      use strict;
      use warnings;
      use XML::Twig;

      my $file = 'test.xml';

      my $twig = XML::Twig->new(
          twig_handlers => { div => \&div_handler },
          pretty_print  => 'indented',
      );
      $twig->parsefile($file);
      $twig->flush();    # print the converted document to STDOUT

      # Rename matching <div> elements and set their new attributes
      sub div_handler {
          my ($twig, $div) = @_;
          my $class = $div->att('class');
          if (defined $class && $class eq 'P_FM_EPG_INTRO_FIRST') {
              $div->set_tag('paragraph');
              $div->set_att(class => 'PFMEPGINTROFIRST');
              $div->set_att(style => 'FM_EPG_INTRO_FIRST');
          }
      }

        Hi mickep76,
        Thanks for your reply. I sometimes lose data when running my script. Is there any way to find out where the data is being lost?

        I am using the code below.
        use strict;
        use warnings;
        use XML::Parser;
        use XML::Twig;

        my $xmlfile = 'input1.xml';    # the file to parse

        # initialize parser object and parse the string
        my $parser = XML::Parser->new( ErrorContext => 2 );
        eval { $parser->parsefile( $xmlfile ); };

        # report any error that stopped parsing, or announce success
        if( $@ ) {
            $@ =~ s/at \/.*?$//s;    # remove module line number
            print STDERR "\nERROR in '$xmlfile':\n$@\n";
            exit;
        }

        my $twig=XML::Twig->new(twig_handlers=>{
            # indd_document=>\&root_process,
            # div => \&div_handler,
            # content => \&content_handler,
        });
        $twig->parsefile( $xmlfile);

        open(FH,">output.xml") or die "cannot open output.xml: $!";
        my $temp = $twig->toString;
        print FH $temp;
        close (FH);
        $twig->purge;

        #root tag "indd_document" process here
        sub root_process {
            my( $twig, $root)= @_;
            $root->set_tag( 'document');
            $root->set_atts({OutputCreationDate=>'20090510',
                             OutputCreationTime=>'00:00:00',
                             'xml:space'=>"preserve"});
            $root->purge;
        }

        #paragraph element process goes here
        sub div_handler {
            my ($twig, $div)= @_;
            $div->set_tag('paragraph');
            $div->del_att('cstyID','pstyID');
            my $attval=$div->att('class');
            $attval=~s/P_//;
            $div->set_att(style => $attval);
            $attval=~s/_//g;
            $attval="P".$attval;
            $div->set_att(class =>$attval);
            $div->purge;
        }

        #content tag handler
        sub content_handler {
            my ($twig, $content)= @_;
            #For textbox element
            if($content->{'att'}->{'type'} eq "text") {
                $content->set_tag('textbox');
                $content->del_atts;
                $content->set_atts({bid=>'', aid=>'', att=>'', pgnbr=>'', pgsect=>'',
                    spp=>'', spf=>'', top=>'', left=>'', height=>'', bh=>'', width=>'',
                    bw=>'', colums=>'', gutter=>'', ts=>'', ls=>'', bs=>'', rs=>'',
                    t=>'', l=>'', b=>'', r=>'', tit=>'', til=>'', tib=>'', tir=>'',
                    color=>'', cn=>'', framewidth=>'', framecolor=>'', fcn=>'', grp=>''});
            }
            $content->purge;
        }
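
One possible cause of the data loss, offered as a guess since the thread does not confirm it: the script serializes the whole twig once after parsing, but the handlers also call purge, which discards everything that has been completely parsed so far. Once the handlers are enabled, content purged during the parse can no longer appear in the output written at the end. A minimal sketch of the same `div` conversion without any purge calls follows; `convert_divs` is a hypothetical wrapper name, and the input string is a made-up sample.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical wrapper: parse a string, convert every <div>, return the result.
sub convert_divs {
    my ($xml) = @_;
    my $twig = XML::Twig->new(
        twig_handlers => { div => \&div_handler },
    );
    $twig->parse($xml);
    return $twig->sprint;    # serialize the whole document once, after parsing
}

# Same attribute logic as the posted div_handler, but nothing is purged,
# so every element is still in memory when sprint is called.
sub div_handler {
    my ($twig, $div) = @_;
    my $style = $div->att('class');
    return unless defined $style;      # leave <div> without a class alone
    $style =~ s/^P_//;
    (my $class = "P$style") =~ s/_//g;
    $div->set_tag('paragraph');
    $div->del_att('cstyID', 'pstyID');
    $div->set_att(class => $class, style => $style);
}

print convert_divs(
    '<indd_document>'
  . '<div class="P_FM_EPG_INTRO_FIRST" pstyID="u15d" cstyID="u62">first</div>'
  . '<div class="P_BODY">second</div>'
  . '</indd_document>'
), "\n";
```

This keeps the whole document in memory, which may be a problem for a very large file. In that case the usual alternative is to call flush (with an output filehandle) instead of purge inside the handlers, so that finished content is written out before it is released. To locate losses, comparing the element counts of the input and output files is a simple first check.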
Re: xml to xml with script process
by Jenda (Abbot) on Oct 06, 2009 at 15:54 UTC

    Sounds like a job for XML::Rules. There are quite a few examples either here or included with the module.

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.