FIrst, I don't understand why the <?xml ...?> lines from all the input files are being included in the single output file -- when I use my code as posted, it removes those from each input. Either you're running something different from what I posted, or else there's something odd about the <?xml... lines in your data files.

As for what you really want, which is one <shiporder ...> element containing all the content of all the files (that is, combining the "shipto" elements from all the input files into one "shiporder"), that's a different plan from what I was suggesting, and it would be best to use a parser for that.

In fact, it seems like the OP code is really pretty close to what you want. Here's my version, with Digest::MD5 thrown in to eliminate duplicate "shipto" content:

#!/usr/lib/perl use strict; use warnings; use Carp; use File::Find; use File::Spec::Functions qw( canonpath ); use XML::LibXML::Reader; use Digest::MD5 'md5'; if ( @ARGV == 0 ) { push @ARGV, "C:/file/dir"; warn "Using default path $ARGV[0]\n Usage: $0 path ...\n"; } # open an output file whose name won't be found by File::Find open( my $allxml, '>', "all_shiporders.xml.combined" ) or die "can't open output xml file for writing: $!\n"; print $allxml '<?xml version="1.0" encoding="UTF-8"?>', "\n<shiporder xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instanc +e\">\n"; my %shipto_md5; find( sub { return unless ( /[.]xml\z/i and -f ); extract_information(); return; }, @ARGV ); print $allxml "</shiporder>\n"; sub extract_information { my $path = $_; if ( my $reader = XML::LibXML::Reader->new( location => $path )) { while ( $reader->nextElement( 'shipto' )) { my $elem = $reader->readOuterXml(); my $md5 = md5( $elem ); print $allxml $reader->readOuterXml() unless ( $shipto_md5 +{$md5}++ ); } } return; }
That seems to work on a set of files such as the following, leaving out "j4.xml" because it's identical to "j2.xml":
==> j1.xml <== <?xml version="1.0" encoding="UTF-8"?> <shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <shipto> <name>johan</name> <address>Langgt 23</address> </shipto> </shiporder> ==> j2.xml <== <?xml version="1.0" encoding="UTF-8"?> <shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <shipto> <name>benny</name> <address>galve 23</address> </shipto> </shiporder> ==> j3.xml <== <?xml version="1.0" encoding="UTF-8"?> <shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <shipto> <name>kent</name> <address>vadrss 25</address> </shipto> </shiporder> ==> j4.xml <== <?xml version="1.0" encoding="UTF-8"?> <shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <shipto> <name>benny</name> <address>galve 23</address> </shipto> </shiporder> ==> j5.xml <== <?xml version="1.0" encoding="UTF-8"?> <shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <shipto> <name>stewart</name> <address>vadrss 25</address> </shipto> </shiporder>
Here's the output -- the only difference between this and what you wanted is the absence vs. presence of extra line-feeds around the "shipto" tags, which is just a matter of cosmetics:
<?xml version="1.0" encoding="UTF-8"?> <shiporder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <shipto> <name>johan</name> <address>Langgt 23</address> </shipto><shipto> <name>benny</name> <address>galve 23</address> </shipto><shipto> <name>kent</name> <address>vadrss 25</address> </shipto><shipto> <name>stewart</name> <address>vadrss 25</address> </shipto></shiporder>

In reply to Re^3: Multiple XML files from Directory to One XML file using perl. by graff
in thread Multiple XML files from Directory to One XML file using perl. by jyo

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.