Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks

I'm new to Perl and started learning it very recently. I'm working with XML files and was told that my best friend in this case would be XML::LibXML. I used it first to parse and play around with my XML files. It's pretty neat and easy.

However, what I need to do now is create a new XML file and then copy XML elements over from another XML file. I looked at a few examples online and found that creating an XML file can be done using

 XML::LibXML::Document->new('1.0', 'utf-8')

then I can use createAttribute, createElement, etc. to construct my file.
What I'm trying to achieve should look like the following example:

<Key="1234">
    ***XML ELEMENTS HERE FROM ANOTHER FILE0***
    <Status="in use"/>
    <Features>
        <File1>
            ***XML ELEMENTS HERE FROM ANOTHER FILE1***
        </File1>
        <File2>
            ***XML ELEMENTS HERE FROM ANOTHER FILE2***
        </File2>
        <File3>
            ***XML ELEMENTS HERE FROM ANOTHER FILE3***
        </File3>
    </Features>
    <Other_Status/>
</Key>

The approach I have in mind is to loop through all my source files, creating my new elements and copying the source elements over as I go, then move on to the next file, and so on. I just don't know how to copy an element itself. My XML files are hundreds of lines long; otherwise I could have used "findnodes" to get my values and recreated the elements from scratch in the new file. Maybe I can still use it here, I don't know, and that's what I'm looking for: what is the best Perl module to stick with for such a job, or what is the best approach to achieve this result? Some sample code would be appreciated.
My source files all look like this; they all have the same structure and share the element <ABC id=" ">:

<doc>
    <ABC id="1">
        <Feature>
            <Number>86839</Number>
            <Prefix>419</Prefix>
            <Alt>73924/Alt>
        </Feature>
    </ABC>
    <ABC id="2">
        <Feature>
            <Number>82783</Number>
            <Prefix>826</Prefix>
            <Alt>27800</Alt>
        </Feature>
    </ABC>
    <ABC id="3">
        <Feature>
            <Number>82783</Number>
            <Prefix>827</Prefix>
            <Alt>26433</Alt>
        </Feature>
    </ABC>
<doc/>

Replies are listed 'Best First'.
Re: copy XML elements from one file to another
by tangent (Parson) on Nov 21, 2013 at 02:13 UTC
    As anonymonk points out there is no need to create the parsed elements from scratch:
    # parsing File 1
    my @nodes_from_file1 = $parser->findnodes('//Feature'); # or whatever it is you need in each node

    # later
    my $new = XML::LibXML::Document->new('1.0', 'utf-8');
    my $features = $new->createElement( 'Features' );
    my $file1 = $new->createElement( 'File1' );
    for my $node (@nodes_from_file1) {
        $file1->addChild( $node );
    }
    $features->addChild( $file1 );
    # and so on
    Using your sample data (which has errors btw) this would produce:
    <Features>
      <File1>
        <Feature>
          <Number>86839</Number>
          <Prefix>419</Prefix>
          <Alt>73924</Alt>
        </Feature>
        <Feature>
          <Number>82783</Number>
          <Prefix>826</Prefix>
          <Alt>27800</Alt>
        </Feature>
        <Feature>
          <Number>82783</Number>
          <Prefix>827</Prefix>
          <Alt>26433</Alt>
        </Feature>
      </File1>
    </Features>

      Thanks for your help. I was trying your solution but I can't figure out the following:
      1. How can I view my output file? I don't see any generated output files.
      2. I assume "parser" in this case will be parsing the source file (where I'm getting my input from), right?

      I understand what's going on in your code, but I'm lost as to where I should see my results. I have tried multiple times but couldn't see any generated output file.
      I tried different examples using the same approach but still can't figure out how to get my output file.

        how I can view my output file? I don't see any generated output files
        You can print directly to a file using the toFile() method. Add the following lines to the example given above:
        $new->addChild($features);
        my $filename = 'new.xml';
        my $ok = $new->toFile($filename,2);
        I assume "parser" in this case will be parsing the source file (where im getting my input from) right?
        Yes, you would set up the parser like this, probably in a loop to go through each file:
        my $file = 'file-1.xml';
        my $parser = XML::LibXML->load_xml(location => $file);

      Here is my code:

      use XML::LibXML;

      my $parser = XML::LibXML->new();
      my $doc = $parser->parse_file("FILE1.xml");
      my @nodes_from_file1 = $doc->findnodes('//Feature');

      my $new = XML::LibXML::Document->new('1.0', 'utf-8');
      my $features = $new->createElement( 'Features' );
      #print to debug
      print $features->toString;

      my $file1 = $new->createElement( 'File1' );
      for my $node (@nodes_from_file1) {
          $file1->addChild( $node );
          #print to debug
          print $file1->toString;
      }
      $features->addChild( $file1 );

      print $new->toString;
      $new->toFile("output_new.xml");
        I can't see the line $new->addChild($features);

        You need to include the format argument 2 in $new->toFile("output_new.xml",2) and print $new->toString(2);
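Pulling the pieces of this subthread together, an end-to-end version might look roughly like the sketch below. It is only an illustration under a few assumptions: the helper name build_features_doc and the inline sample string are made up, and real code would load each file with load_xml(location => $path) instead of a string. It attaches the root with setDocumentElement (the step missing from the code above) and copies nodes with importNode, which leaves the source document untouched.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: takes a list of [element_name, xml_string] pairs and
# returns a new document with one child of <Features> per source.
sub build_features_doc {
    my @sources = @_;

    my $new      = XML::LibXML::Document->new('1.0', 'utf-8');
    my $features = $new->createElement('Features');
    $new->setDocumentElement($features);    # attach the root to the document

    for my $source (@sources) {
        my ($name, $xml) = @$source;

        # For real files use: XML::LibXML->load_xml(location => $path)
        my $doc       = XML::LibXML->load_xml(string => $xml);
        my $container = $new->createElement($name);

        for my $node ($doc->findnodes('//Feature')) {
            # importNode makes a deep copy owned by $new; $doc is not modified
            $container->appendChild($new->importNode($node));
        }
        $features->appendChild($container);
    }
    return $new;
}

# Made-up sample data, a cut-down version of the source files in the question
my $sample = '<doc><ABC id="1"><Feature><Number>86839</Number></Feature></ABC></doc>';
my $merged = build_features_doc([File1 => $sample], [File2 => $sample]);

print $merged->toString(2);          # formatted output on STDOUT
# $merged->toFile('new.xml', 2);     # or write it to a file instead
```

If nothing shows up in the output, the first thing to check is whether the root element was ever attached to the new document.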

Re: copy XML elements from one file to another
by kcott (Archbishop) on Nov 21, 2013 at 02:23 UTC

    If you're mostly just copying the contents of several files, perhaps consider not using a module at all. Something along the lines of this code (following some Notes) might do what you want.

    Notes:

    • You didn't specify exactly which parts of the source files were to be copied. I'll leave you to make changes depending on your requirements.
    • IDs should be unique. I've added code to do this. Again, modify how you want.
    • I've added consistent indentation. You may want something different: change to suit.
    • Assumed a typo: s{<doc/>}{</doc>}
    • I've used a cut-down version of your data for demo purposes.
    • I'll leave you to replace Inline::Files and \*STDOUT with more appropriate I/O.
    #!/usr/bin/env perl -l

    use strict;
    use warnings;

    use Inline::Files;

    my %doc_fh_for = (File0 => \*FILE0, File1 => \*FILE1, File2 => \*FILE2);
    my $out_fh = \*STDOUT;

    my $key_file = 'File0';
    my @feature_files = qw{File1 File2};

    print $out_fh '<Key="1234">';
    write_xml_content($key_file, $doc_fh_for{$key_file}, $out_fh, ' ' x 4);

    print $out_fh '    <Status="in use"/>';
    print $out_fh '    <Features>';

    for (@feature_files) {
        write_xml_content($_, $doc_fh_for{$_}, $out_fh, ' ' x 8);
    }

    print $out_fh '    </Features>';
    print $out_fh '    <Other_Status/>';
    print $out_fh '</Key>';

    sub write_xml_content {
        my ($file_id, $in_fh, $out_fh, $indent) = @_;

        while (<$in_fh>) {
            chomp;

            if (/^<\/?doc>$/) {
                # Rename the <doc> wrapper after the file it came from
                s/doc/$file_id/;
            }
            else {
                if (/^<ABC id="(\d+)">$/) {
                    # Make IDs unique by prefixing them with the file ID
                    my $id = $1;
                    s/$id/$file_id-$id/;
                }

                $_ = ' ' x 4 . $_;
            }

            print $out_fh $indent, $_;
        }
    }

    Inline::Files data:

    __FILE0__
    <doc>
    <ABC id="1">
        <Feature>
            <Number>86839</Number>
        </Feature>
    </ABC>
    <ABC id="2">
        <Feature>
            <Number>82783</Number>
        </Feature>
    </ABC>
    </doc>
    __FILE1__
    <doc>
    <ABC id="1">
        <Feature>
            <Number>86839</Number>
        </Feature>
    </ABC>
    <ABC id="2">
        <Feature>
            <Number>82783</Number>
        </Feature>
    </ABC>
    </doc>
    __FILE2__
    <doc>
    <ABC id="1">
        <Feature>
            <Number>86839</Number>
        </Feature>
    </ABC>
    <ABC id="2">
        <Feature>
            <Number>82783</Number>
        </Feature>
    </ABC>
    </doc>

    Output:

    -- Ken

      Well, you got exactly what I'm looking for. Looking at your code and the output example you provided, I can tell that you understand exactly what I am aiming at. The steps are easy to follow.
      I'm working on it right now and I hope I can generate my desired output as you explained.

      After looking into this solution I realized that Inline::Files is only meant for small virtual files, but in my case each of my files is hundreds of lines long.
      Is there any way this can be modified to read the 3 files at the same time and grab information by ID, instead of using inline files?
      Also, I'm having issues installing the Inline::Files module. Thanks for your help.

        The use of Inline::Files and \*STDOUT was simply for demo purposes; hence the "I'll leave you to replace Inline::Files and \*STDOUT with more appropriate I/O.".

        I'm not sure whether you've got a handle on this. If you have, you probably won't need the following; if not, here are some hints and tips that may prove useful for your real code. Ask if you need more information, further explanation, etc.

        I'd recommend the autodie pragma. This will save a lot of effort checking whether files could be opened, created, read, written and so on. Put it near the top of your code; I usually write this:

        ...
        use strict;
        use warnings;
        use autodie;
        ...

        Remove the 'use Inline::Files;' line.

        Use open to create your filehandles.

        The output filehandle ($out_fh) is very straightforward:

        open my $out_fh, '>', '/path/to/output_file';

        The input filehandles are a little more complicated but still fairly easy. I wouldn't recommend creating all the filehandles in advance; instead, open and close one at a time. The \*FILE0, \*FILE1, etc. in my original code are filehandles for the Inline::Files: you'll see I didn't need to explicitly create those. Perhaps change:

        my %doc_fh_for = (File0 => \*FILE0, File1 => \*FILE1, File2 => \*FILE2);

        to something like this

        my %source_file_for = (File0 => 'filename0', File1 => 'filename1', File2 => 'filename2');

        and then further down

        for (@feature_files) {
            open my $in_fh, '<', $source_file_for{$_};
            write_xml_content($_, $in_fh, $out_fh, ' ' x 8);
            close $in_fh;
        }

        I don't know whether you have a predetermined list of source files or if you have to find them in one or more directories; if the latter, you may find readdir useful for this. Also note that you may need to prefix the filenames with directory paths: while simple concatenation may suffice, I'd normally use File::Spec as it's portable (it's also a core module so there's no need to install it).
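As a rough illustration of the readdir and File::Spec suggestion, the sketch below collects the XML files in a directory. It is not part of the original code: the helper name xml_files_in and the assumption that source files end in .xml are made up for the example.

```perl
use strict;
use warnings;
use autodie;
use File::Spec;

# Hypothetical helper: returns the full paths of all *.xml files in $dir,
# sorted by filename.
sub xml_files_in {
    my ($dir) = @_;

    opendir my $dh, $dir;                        # autodie reports any failure
    my @names = sort grep { /\.xml\z/i } readdir $dh;
    closedir $dh;

    # catfile joins the directory and filename portably;
    # the -f test drops anything that is not a plain file
    return grep { -f } map { File::Spec->catfile($dir, $_) } @names;
}

my $dir = shift @ARGV // '.';   # directory to scan; defaults to the current one
print "$_\n" for xml_files_in($dir);
```

Each returned path could then be passed to open in the loop shown above, in place of the hard-coded filenames.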

        -- Ken

Re: copy XML elements from one file to another ( XML::LibXML::Node#cloneNode )
by Anonymous Monk on Nov 21, 2013 at 01:45 UTC
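The cloneNode method named in the title might be used roughly as follows. This is a minimal sketch with made-up sample data, not code from the thread: cloneNode(1) makes a deep copy, so the source document keeps its own nodes while the copy is appended to the new document.

```perl
use strict;
use warnings;
use XML::LibXML;

# Made-up sample data, a cut-down version of the source files in the question
my $source = XML::LibXML->load_xml(
    string => '<doc><ABC id="1"><Feature><Number>86839</Number></Feature></ABC></doc>'
);

my $target = XML::LibXML::Document->new('1.0', 'utf-8');
my $file1  = $target->createElement('File1');
$target->setDocumentElement($file1);

for my $abc ($source->findnodes('/doc/ABC')) {
    my $copy = $abc->cloneNode(1);   # 1 = deep copy, children and attributes included
    $file1->appendChild($copy);      # the copy, not the original, goes into $target
}

print $target->toString(1);

# Count what is left in the source; only the clones were appended elsewhere
my $remaining = () = $source->findnodes('/doc/ABC');
print "ABC elements left in source: $remaining\n";
```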
Re: copy XML elements from one file to another
by Anonymous Monk on Nov 21, 2013 at 14:14 UTC

    Hey... Thanks all for the help. I will work on this and see what I get. What you have all provided is very helpful.