ngbabu has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, My XML file is like below:
<reference> ... ... <title type="journal">Harv L R</title> .... .... </reference>

My INI file info is like below:

Harv L R<TAB>Harvard Law Review
MLR<TAB>Modern Law Review

The required output is as follows:

<reference> ... ... <title type="journal">Harvard Law Review</title> .... .... </reference>

The <title> element also appears in other locations. Only titles that are children of <reference> and have type="journal" should be modified.

My code is like below:

use XML::XPath;
use XML::XPath::XMLParser;

$xp = XML::XPath->new(filename => 'mlr_648.xml');
$nodeset = $xp->find('//reference/title[@type="journal"]');

foreach my $node ($nodeset->get_nodelist) {
    $line = XML::XPath::XMLParser::as_string($node);
    if ($line =~ m!<title type="journal">(.*)</title>!) {
        $jb = $1;
    }
    &rep($jb);
}

sub rep {
    $jabb = $jb;
    open(DAT, "mlr.dat");
    $/ = undef;
    $x = <DAT>;
    while ($x =~ m!(.*)\t(.*)!ig) {
        %data = ("$1" => "$2");
        $jname = $data{$jabb};
        if ($jname ne "") {
            print "<title type=\"journal\">$jname</title>\n";
        }
    }
}

I am getting the required output for the titles, but I am unable to print the entire XML, with the modifications, to a separate file. Please help with these two cases:

1. Print an error message if the corresponding full name is not found in the ini file

2. Print entire XML to a separate file.

Regards, Ganesh

Replies are listed 'Best First'.
Re: Modifying a Specific node
by ikegami (Patriarch) on Oct 01, 2008 at 05:20 UTC
    Sounds like a perfect job for XML::Twig.
    use strict;
    use warnings;

    use XML::Twig qw( );

    my $fqn_dat = 'mlr.dat';
    my $fqn_in  = 'mlr_648.xml';
    my $fqn_out = 'fixed_mlr_648.xml';

    my %translations;
    {
        open(my $fh, '<', $fqn_dat)
            or die("Can't open \"$fqn_dat\": $!\n");
        while (<$fh>) {
            /^(.*?)\t(.*)/ or next;
            $translations{$1} = $2;
        }
    }

    open(my $fh_in, '<', $fqn_in)
        or die("Can't open input file \"$fqn_in\": $!\n");
    open(my $fh_out, '>', $fqn_out)
        or die("Can't create output file \"$fqn_out\": $!\n");

    my $twig = XML::Twig->new(
        twig_handlers => {
            'reference/title[@type="journal"]' => sub {
                my ($twig, $elt) = @_;
                my $text = $elt->text();
                if ( exists( $translations{$text} ) ) {
                    $elt->set_text( $translations{$text} );
                } else {
                    warn("No translation found for \"$text\"\n");
                }
            },

            'reference' => sub {
                my ($twig, $elt) = @_;
                # Keep at most one <reference> in memory.
                $twig->flush($fh_out);
            },
        }
    );

    $twig->parse($fh_in);
    $twig->flush($fh_out);
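
The lookup-table step in the code above can be tried on its own, without any XML involved. The following is only a minimal sketch: the helper name load_translations, the in-memory sample data standing in for mlr.dat, and the 'Yale L J' test key are assumptions made for illustration and are not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: read "abbreviation<TAB>full name" lines into a hash.
sub load_translations {
    my ($fh) = @_;
    my %translations;
    while ( my $line = <$fh> ) {
        chomp $line;
        # Non-greedy first capture, so the line is split at the first tab only.
        $line =~ /^(.*?)\t(.*)/ or next;
        $translations{$1} = $2;
    }
    return \%translations;
}

# In-memory sample data standing in for mlr.dat (an assumption).
my $sample = "Harv L R\tHarvard Law Review\nMLR\tModern Law Review\n";
open( my $fh, '<', \$sample ) or die "Can't open in-memory file: $!\n";
my $translations = load_translations($fh);
close $fh;

# Look up one known and one unknown abbreviation.
for my $abbr ( 'Harv L R', 'Yale L J' ) {
    if ( exists $translations->{$abbr} ) {
        print "$abbr => $translations->{$abbr}\n";
    }
    else {
        warn "No translation found for \"$abbr\"\n";
    }
}
```

Reading the file once into a hash, rather than re-reading it for every title as the original question's rep() does, keeps each lookup to a single exists test, and the else branch covers the "error message if not found" case.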

      Hi All,

      The above code works perfectly except for one problem, and I do not know why it happens: &apos; is being changed to ' (the literal character), but I want &apos; kept in the output.

      Please help me keep the &apos; in the output.

      Regards

      Ganesh

        It doesn't really matter, since &apos; and ' mean the same thing in XML text, but if you insist, try the keep_encoding option.
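
A minimal sketch of what trying that option might look like, assuming XML::Twig is installed. The sample document and its title text are made up for illustration; only the keep_encoding constructor option comes from the reply above.

```perl
use strict;
use warnings;
use XML::Twig;

# Made-up sample input containing an &apos; entity (an assumption).
my $xml = q{<reference><title type="journal">Lawyer&apos;s Q</title></reference>};

# keep_encoding asks XML::Twig to keep the original strings from the
# parser, which is documented to preserve the original entities too.
my $twig = XML::Twig->new( keep_encoding => 1 );
$twig->parse($xml);

my $out = $twig->sprint;
print $out, "\n";
```

In the full script the option would go into the existing XML::Twig->new( ... ) call alongside twig_handlers. Because the text is then kept in its original form, it is worth checking how replacement text passed to set_text is written out before relying on this for titles that contain & or <.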