in reply to Convert CSV file to XML file using Perl?

Also, consider all of your possibilities ... including, “writing no(!) computer program at all.”

A CSV to XML Converter in XSLT 2.0.

It can go the other way, too:   XML to CSV Using XSLT.

In both cases, as you can see, the transformation is being accomplished using a declarative feature of industry-standard XML libraries such as libxslt, which of course are fully supported by Perl libraries such as XML::LibXML, but also by command-line tools such as Saxon.   You specify what transformations you want, using an XML-based format to do so.   It is the responsibility of the XSLT processor (and there are several of these) to obey those instructions.

There are widespread applications for this idea.   For example, many documentation files are written in XML-based “semantic markup” languages such as DocBook, and then translated ... using XSLT ... into several different output formats.   (All of the O’Reilly books ... the ones with animals on the cover ... are done that way.)   No custom programming is involved.

I’ll leave it to others ... and to you ... to decide what is the overall best way to accomplish this goal for your particular project.   With Perl, you have many, many choices.

Replies are listed 'Best First'.
Re^2: Convert CSV file to XML file using Perl?
by Anonymous Monk on Jul 04, 2015 at 19:21 UTC
    All of the O’Reilly books ... the ones with animals on the cover ... are done that way.

    Nope.

      Thanks all. Here is the SQL which generates the XML file. However it takes days to generate the XML file for 20 million records. I do have proper indexes on the tables. Could you please provide your ideas to generate this XML. Which one is the fast method to build this XML?. 1. Extracting the data to CSV file takes about 15 minutes using ETL and generate the XML using Perl. 2. Otherwise writing a SQL by joining all the normalized tables and generate the XML file by calling the SQL within the perl.

      select xmlserialize(document        XMLElement("transactiondetails",                   XMLElement("rltp_id", rltp_id),                   XMLElement("rltp_name", rltp_name),                      (SELECT XMLAGG(XMLElement("product",                                    XMLElement("prod_id", p.prod_id),                                    XMLElement("prod_name", p.prod_name) +,                          (SELECT XMLAGG(XMLElement("customer",                                        XMLElement("cust_id", cust_id),                                        XMLElement("cust_name", cust_nam +e),                                   (SELECT XMLAGG(XMLElement("account",                                              XMLElement("acc_id", acc_i +d),                                              XMLElement("acc_name", acc +_name),                                              XMLElement("acc_balance", +acc_balance),                                             nvl(                                               (                                                select xmlconcat(                                                         xmlelement("ac +c_type", at.acc_type),                                                         xmlagg(xmlelem +ent("acc_code", at.acc_code))                                                       )                                                from acctype at                                                where at.rltp_id = a.rl +tp_id                                                and at.prod_id = a.prod +_id                                                and at.cust_id = a.cust +_id                                                and at.acc_id = a.acc_i +d                                                group by at.acc_type                                                )                                              , xmlconcat(xmlelement("a +cc_type"),xmlelement("acc_code"))                                                ) ,                                         (SELECT XMLAGG(XMLElement("tra +nsaction",                                                 XMLElement("txntrack",                                                 XMLElement("txn_id", t. +txn_id),                                                 XMLElement("txn_amt", t +.txn_amt),                                                (                                                 select nvl(                                                          xmlagg(xmlele +ment("txn_code",tt.txn_code))                                                        , xmlelement("t +xn_code")                                                        )                                                 from txntype tt                                                 where tt.rltp_id = t.r +ltp_id                                                 and tt.prod_id = t.pro +d_id                                                 and tt.cust_id = t.cus +t_id                                                 and tt.acc_id = t.acc_ +id                                                 and tt.txn_id = t.txn_ +id                                                )                                                ))                                                ORDER BY t.txn_id                                                )                                          FROM transaction t                                          WHERE t.rltp_id  =a.rltp_id                                          AND t.prod_id   =a.prod_id                                          AND t.cust_id   =a.cust_id                                          AND t.acc_id=a.acc_id))                                         ORDER BY a.acc_id                                         )                              FROM account a                              WHERE c.rltp_id=a.rltp_id                              AND c.prod_id=a.prod_id                              AND c.cust_id=a.cust_id))                             ORDER BY c.cust_id                             )                          FROM cust c                          WHERE p.rltp_id = c.rltp_id                          AND p.prod_id = c.prod_id))                         ORDER BY p.prod_id                         )                       FROM product p                       WHERE p.rltp_id = r.rltp_id )                             ) indent) AS xml  FROM    rltp_mngr r    WHERE rltp_id='10';

        Assuming that your dump to CSV is ordered like you showed in the original post, I'd probably go with dump to CSV and then write some perl to read the CSV line by line and build up the XML using XML::LibXML. Perl is pretty quick at that sort of stuff. I have some code that's all perl calling MySQL and for stuff that's a few thousand records it generates a bunch of HTML files (not too different from building up XML) plus separate text audit files by the time I've lifted my finger off the return key. For stuff that's reading a few thousand lines of CSV and doing inserts into a database of millions of lines (including some lookups to decide how to cross reference things) it takes a few minutes.