snowch has asked for the wisdom of the Perl Monks concerning the following question:

I have about 30 XML files (customers, projects, tasks, etc.) that I create when integrating customer systems, and I would like to split them into smaller chunks. Here is an example of the Customer.xml file:

<NikuDataBus>
  <Header xyz />
  <Customers>
    <customer x />
    <customer y />
    <customer .. />
  </Customers>
  </Header>
</NikuDataBus>

I would like to split the files so they have the following structure:

Customer_x.xml

<NikuDataBus>
  <Header xyz />
  <Customers>
    <customer x />
  </Customers>
  </Header>
</NikuDataBus>

Can anyone recommend how I should do this?

Re: splitting large xml documents
by TedPride (Priest) on Sep 06, 2005 at 09:09 UTC
    What you need is an XML parser to create a structure from your original file, which you can then modify and output as multiple XML files. XML::Simple or XML::Parser will probably do the trick.
      XML::Simple reads the whole XML structure into memory, and so does XML::Parser when used in its tree style. If the files are large, this might be a problem. XML::Twig will handle this better, since it can process a document one subtree at a time.
        Am I on the right track with this? It seems to work OK.

        Many thanks...

        use XML::Twig;

        my $t = XML::Twig->new();
        $t->parsefile( $Globals::input_xml_dir . "TimesheetLoad_1.xml" );

        my $rt  = $t->root;                          # NikuDataBus
        my $tp  = $rt->first_child( 'Customers' );   # stays in place as the wrapper
        my @tp1 = $tp->children( 'Customer' );       # element names are case-sensitive
        $_->cut for @tp1;                            # empty the wrapper

        my $count = 0;
        foreach my $tp1 (@tp1) {
            $tp1->paste( last_child => $tp );        # put one customer back
            my $file = "$Globals::input_xml_dir\\twigs\\CustomerLoad_$count.xml";
            open( my $out, '>', $file ) or die "Cannot write $file: $!";
            $rt->print( $out );
            close $out;
            $tp1->cut;                               # remove it again for the next pass
            $count++;
        }
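
        The loop above still parses the whole file before splitting it. For larger files, a handler-based variant can write each customer as soon as it has been parsed and then release the memory. What follows is only a minimal sketch under stated assumptions: the sample input, the attribute names and the temporary output directory are hypothetical, and Header and Customers are assumed to be siblings under the root, as in the code above.

```perl
use strict;
use warnings;
use XML::Twig;
use File::Temp qw(tempdir);

# Hypothetical sample input; Header and Customers are siblings under the root.
my $xml = <<'XML';
<NikuDataBus>
  <Header name="xyz"/>
  <Customers>
    <Customer name="x"/>
    <Customer name="y"/>
  </Customers>
</NikuDataBus>
XML

my $out_dir = tempdir( CLEANUP => 1 );   # stand-in for the real output directory
my $header  = '';                        # serialized Header, reused in every file
my @written;                             # paths of the files created so far

my $t = XML::Twig->new(
    twig_handlers => {
        # Called once the Header element is complete; keep its serialized form.
        Header => sub { $header = $_[1]->sprint; },

        # Called once for each completed Customer element.
        'Customers/Customer' => sub {
            my ( $twig, $cust ) = @_;
            my $file = "$out_dir/CustomerLoad_" . scalar(@written) . ".xml";
            open( my $out, '>', $file ) or die "Cannot write $file: $!";
            print {$out} "<NikuDataBus>$header<Customers>",
                $cust->sprint, "</Customers></NikuDataBus>\n";
            close $out or die "Cannot close $file: $!";
            push @written, $file;
            $twig->purge;    # free everything that has been fully parsed
        },
    },
);
$t->parse($xml);    # for a file on disk, call parsefile($path) instead

print scalar(@written), " files written\n";
```

        Because the wrapper elements are written out by hand here, any attributes on NikuDataBus or Customers would have to be added to those strings. The trade-off is that only one customer is held in memory at a time.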
Re: splitting large xml documents
by derby (Abbot) on Sep 06, 2005 at 13:28 UTC

    You could always just use XSLT. Suppose your original XML is more along the lines of:

    <NikuDataBus>
      <Header name="xyz">
        <Customers>
          <customer name="x">
            <address>1313 Mocking Bird Lane</address>
          </customer>
          <customer name="y" />
          <customer name="z" />
        </Customers>
      </Header>
    </NikuDataBus>

    then an XSLT stylesheet such as this would do the trick:

    <xsl:stylesheet version="1.1"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes" />

      <xsl:template match="customer">
        <xsl:variable name="outFile" select="concat('customer.', @name, '.xml' )" />
        <xsl:document method="xml" indent="yes" href="{$outFile}">
          <NikuDataBus>
            <Header name="xyz">
              <Customers>
                <xsl:copy>
                  <xsl:copy-of select="*|@*"/>
                </xsl:copy>
              </Customers>
            </Header>
          </NikuDataBus>
        </xsl:document>
      </xsl:template>

      <xsl:template match="NikuDataBus">
        <xsl:apply-templates select="Header/Customers/customer" />
      </xsl:template>
    </xsl:stylesheet>

    -derby
Re: splitting large xml documents
by grantm (Parson) on Sep 09, 2005 at 00:16 UTC