Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

The code below creates the XML file. I want to replace only the first <p> tag with <con><p> and the last </p> with </p></con>.
#!/usr/bin/perl
my $tag;
my $output;
my $fh;
while (<DATA>) {
    chomp;
    if (/^\.\.(.*):$/)    # match line
    {
        $fh = sub_output($output, $tag, $fh);
        $output = "";
        $tag = $1;
        print $tag;
    }
    else {    # not a {TAG} line
        next unless($tag);
        next if(/^\s*$/);
        $output .= ($output) ? " $_" : "<$tag>$_";
    }
}    # End of While Loop
$fh = sub_output($output, $tag, $fh);
if($fh) {
    print $fh "</root>\n";
    close($fh);
}
exit(0);

# Subroutine to open the file with the filename as DN
sub sub_output {
    my ($output, $tag, $fh) = @_;
    if($output) {
        if($output =~ m/<DN>(.*)/) {
            if($fh) {
                print $fh "</root>\n";
                close($fh);
            }
            open($fh, '>', "$1.xml") or die "$1.xml: $!";
            print $fh "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n";
        }
        $text = "$output</$tag>";
        $text =~ s/<DN>.*<\/DN>//;
        # Here i should substitute <con> for first <p> tag and <p></con>
        print $fh "$text>\n";
    }
    return($fh);
}    # End of subroutine
__DATA__
..DN:
1
..id:
000044119
..DD:
Friday, October 30, 2009
..p:
THis is param1
..p:
THis is param2
..p:
THis is param3
..DN:
2
..id:
000044119
..DD:
Friday, October 30, 2009
..p:
THis is param1
..p:
THis is param2
..p:
THis is param3
How can I get the following output in the XML file?
<id>000044119</id>
<DD>Friday, October 30, 2009</DD>
<con><p>THis is param1</p>
<p>THis is param2</p>
<p>THis is param3</p></con>

Re: replace the first tag
by Your Mother (Archbishop) on Nov 06, 2009 at 05:20 UTC

    ikegami will probably come along and embarrass me by doing this in 4 lines but here you go. You can easily skip the DN elements if you choose or do a separate XML doc for each, et cetera, with what I hope will become obvious tweaking. XML with regexes is a land of tears. Just say no. Reading: XML::LibXML. Update: also, ->toFile in XML::LibXML::Document to easily put the DN ids out to files though the necessary tweaks to the code to create a doc per DN are left to you. :)

    use strict;
    use warnings;
    use XML::LibXML;

    my $doc = XML::LibXML::Document->new( "1.0", "UTF-8" );
    my $root = $doc->createElement("root");
    $doc->setDocumentElement( $root );

    while ( my $ident = <DATA> )
    {
        $ident =~ s/\A\.\.|:.*//gs;
        chomp( my $value = <DATA> );
        my $node = $doc->createElement($ident);
        $node->addChild($doc->createTextNode($value));
        if ( $ident eq "p" )
        {
            my $last = $root->lastChild;
            if ( $last->nodeName eq "con" )
            {
                $last->addChild($node);
            }
            else
            {
                my $con = $doc->createElement("con");
                $root->addChild($con);
                $con->addChild($node)
            }
        }
        else
        {
            $root->addChild($node);
        }
    }

    print $doc->serialize(1);
    __DATA__
    ..DN:
    1
    ..id:
    000044119
    ..DD:
    Friday, October 30, 2009
    ..p:
    THis is param1
    ..p:
    THis is param2
    ..p:
    THis is param3
    ..DN:
    2
    ..id:
    000044119
    ..DD:
    Friday, October 30, 2009
    ..p:
    THis is param1
    ..p:
    THis is param2
    ..p:
    THis is param3

    Output-

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
      <DN>1</DN>
      <id>000044119</id>
      <DD>Friday, October 30, 2009</DD>
      <con>
        <p>THis is param1</p>
        <p>THis is param2</p>
        <p>THis is param3</p>
      </con>
      <DN>2</DN>
      <id>000044119</id>
      <DD>Friday, October 30, 2009</DD>
      <con>
        <p>THis is param1</p>
        <p>THis is param2</p>
        <p>THis is param3</p>
      </con>
    </root>
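
The per-DN tweak mentioned in the update might look roughly like the sketch below. It is not the poster's code: the helper name docs_per_dn is hypothetical, the input is assumed to be strictly alternating "..tag:" and value lines, and each record is assumed to need exactly one <con> wrapper, opened at its first <p>. It uses the same XML::LibXML calls as the reply (createElement, createTextNode, addChild) and prints each document instead of writing it.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: takes alternating "..tag:" and value lines and
# returns a list of [ $dn, $doc ] pairs, one XML::LibXML document per DN.
sub docs_per_dn {
    my @lines = @_;
    my ( @docs, $doc, $root, $con );
    while ( @lines >= 2 ) {
        my ( $ident, $value ) = splice( @lines, 0, 2 );
        $ident =~ s/\A\.\.|:.*//gs;
        if ( $ident eq "DN" ) {
            # A DN line starts a fresh document; its value names the file.
            $doc  = XML::LibXML::Document->new( "1.0", "UTF-8" );
            $root = $doc->createElement("root");
            $doc->setDocumentElement($root);
            $con = undef;
            push @docs, [ $value, $doc ];
            next;
        }
        next unless $doc;    # skip anything before the first DN
        my $node = $doc->createElement($ident);
        $node->addChild( $doc->createTextNode($value) );
        if ( $ident eq "p" ) {
            # Assumption: one <con> per record, created on the first <p>.
            unless ( defined $con ) {
                $con = $doc->createElement("con");
                $root->addChild($con);
            }
            $con->addChild($node);
        }
        else {
            $root->addChild($node);
        }
    }
    return @docs;
}

my @input = split /\n/, <<'END';
..DN:
1
..id:
000044119
..p:
THis is param1
..p:
THis is param2
..DN:
2
..id:
000044119
..p:
THis is param1
END

for my $pair ( docs_per_dn(@input) ) {
    my ( $dn, $doc ) = @$pair;
    print "--- $dn.xml ---\n", $doc->toString(1);
    # To write the files instead: $doc->toFile( "$dn.xml", 1 );
}
```

Swapping the print for the commented-out toFile call should produce 1.xml and 2.xml in the current directory. The DN element itself is left out of each document here because the original script strips it and uses it only as the file name.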
Re: replace the first tag
by Jenda (Abbot) on Nov 06, 2009 at 14:27 UTC

    Erm ... maybe your code could remember whether it already printed a <p> tag or not and print the "<con>" only the first time. And then print "</con>" at the very end of the file. Maybe?

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.
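
Jenda's suggestion of remembering whether a <p> has already been printed could be sketched as follows. This is only an illustration under stated assumptions, not code from the thread: the function name record_to_xml is hypothetical, each tag is assumed to have a single value line, the <p> elements are assumed to come last in a record, the DN value is skipped (the original script uses it only as a file name), and values are not XML-escaped.

```perl
use strict;
use warnings;

# Hypothetical helper: turns "..tag:" / value lines for one record into
# XML text, wrapping all <p> elements in a single <con>.
sub record_to_xml {
    my @lines  = @_;
    my $seen_p = 0;     # set once the first <p> has been emitted
    my $xml    = "";
    my $tag;
    for my $line (@lines) {
        if ( $line =~ /^\.\.(.*):$/ ) {    # a "..tag:" line
            $tag = $1;
            next;
        }
        next unless defined $tag;
        next if $line =~ /^\s*$/;
        next if $tag eq "DN";              # DN only names the output file
        if ( $tag eq "p" && !$seen_p ) {
            $xml .= "<con>";               # open the wrapper only once
            $seen_p = 1;
        }
        $xml .= "<$tag>$line</$tag>\n";
    }
    # Close the wrapper at the very end, right after the last </p>.
    $xml =~ s/\n\z/<\/con>\n/ if $seen_p;
    return $xml;
}

my @record = split /\n/, <<'END';
..DN:
1
..id:
000044119
..DD:
Friday, October 30, 2009
..p:
THis is param1
..p:
THis is param2
..p:
THis is param3
END

print record_to_xml(@record);
```

The flag keeps the change local to the place where elements are emitted, so the rest of the original loop would not need to change. It still builds XML by string concatenation, so the earlier warning about handling XML with plain string operations applies.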