Dear Monks,
I have been tasked with converting CSV to XML and have turned to XML::CSV, which has not been updated in 15 years. It has a number of issues: lack of support for accented characters and poor error handling are two of them. What would you recommend for converting CSV into a custom, possibly rather complex XML format?
use warnings;
use strict;
use Test::More;
use File::Spec::Functions;
use XML::XPath;
use lib catdir qw ( lib );

plan tests => 8;

use_ok q{XML::CSV};
my $base    = q{input};
my $csvfile = catdir q(csv), qq{$base.txt};
my $xmlfile = catdir q(xml), qq{$base.xml};

my $default_obj_xs = Text::CSV_XS->new({sep_char => ",", quote_char => '"', encoding => "utf8"});
my $csv_obj = XML::CSV->new( { csv_xs => $default_obj_xs, error_out => 1 });

# convert csv to xml, print to file
my @arr_of_headings = map { "Col$_" } (1..9);
$csv_obj->{column_headings} = \@arr_of_headings;
$csv_obj->parse_doc($csvfile);
$csv_obj->declare_xml({version => '1.0', encoding => 'UTF-8', standalone => 'yes'});
$csv_obj->print_xml($xmlfile, {format => " ", file_tag => "Import", record_tag => "Row"});

# test xml output
my $xp = XML::XPath->new(filename => $xmlfile);
my @nodes = $xp->findnodes(q{/Import/record});
cmp_ok(scalar(@nodes), q[==], 3, q[We have 3 nodes]);

@nodes = $xp->findnodes(q{/Import/record[1]/Col2});
cmp_ok(scalar(@nodes), q[==], 1, q[We have 1 match on line 1]);
ok(exists($nodes[0]));
cmp_ok($nodes[0]->string_value(), q{eq}, q{AB12345});

@nodes = $xp->findnodes(q{/Import/record[3]/Col2});
cmp_ok(scalar(@nodes), q[==], 1, q[We have 1 match on line 3]);
ok(exists($nodes[0]));
cmp_ok($nodes[0]->string_value(), q{eq}, q{EF12345});
__END__

C:\IT\Temp\>prove t\02_poc.t
t\02_poc.t .. 1/8
#   Failed test at t\02_poc.t line 40.
#          got: ''
#     expected: 'EF12345'
# Looks like you failed 1 test of 8.
t\02_poc.t .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/8 subtests

Test Summary Report
-------------------
t\02_poc.t (Wstat: 256 Tests: 8 Failed: 1)
  Failed test:  8
  Non-zero exit status: 1
Files=1, Tests=8,  1 wallclock secs ( 0.06 usr +  0.00 sys =  0.06 CPU)
Result: FAIL
C:\IT\Temp\>

C:\IT\Temp\>type "csv\input.txt"
1,AB12345,03.04.2016 15:43:14,-76775.70,Toll road INC,Bridge 55,19.8,04.04.2016 06:55:41
2,CD12345,01.04.2016 16:39:15,-76775.70,Toll road INC,River Kwai,8.1,04.04.2016 06:27:36
3,EF12345,01.04.2016 16:39:15,-76775.70,Toll road INC,Champs-Élysées,8.1,04.04.2016 06:27:36

C:\IT\Temp\>type "xml\input.xml"
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Import>
 <record>
  <Col1>1</Col1>
  <Col2>AB12345</Col2>
  <Col3>03.04.2016 15:43:14</Col3>
  <Col4>-76775.70</Col4>
  <Col5>Toll road INC</Col5>
  <Col6>Bridge 55</Col6>
  <Col7>19.8</Col7>
  <Col8>04.04.2016 06:55:41</Col8>
  <Col9></Col9>
 </record>
 <record>
  <Col1>2</Col1>
  <Col2>CD12345</Col2>
  <Col3>01.04.2016 16:39:15</Col3>
  <Col4>-76775.70</Col4>
  <Col5>Toll road INC</Col5>
  <Col6>River Kwai</Col6>
  <Col7>8.1</Col7>
  <Col8>04.04.2016 06:27:36</Col8>
  <Col9></Col9>
 </record>
 <record>
  <Col1></Col1>
  <Col2></Col2>
  <Col3></Col3>
  <Col4></Col4>
  <Col5></Col5>
  <Col6></Col6>
  <Col7></Col7>
  <Col8></Col8>
  <Col9></Col9>
 </record>
</Import>
C:\IT\Temp\>
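For comparison, here is a minimal sketch of the same conversion done directly with Text::CSV_XS and XML::Writer, as one possible direction rather than a definitive answer. The helper name csv_to_xml is hypothetical, the input is assumed to be UTF-8, and XML::Writer is assumed to be installed in a version that supports OUTPUT => 'self' and to_string. The parser is created with binary => 1 because Text::CSV_XS otherwise rejects non-ASCII characters in fields; that this explains the empty third record above is an assumption, not something the test output confirms.

```perl
use strict;
use warnings;
use Text::CSV_XS;
use XML::Writer;

# Hypothetical helper (not part of XML::CSV): read a UTF-8 CSV file and
# return the resulting XML document as a string of characters.
sub csv_to_xml {
    my ($csv_path) = @_;

    # binary => 1 allows non-ASCII characters such as "É" in fields;
    # auto_diag => 1 warns about parse errors instead of staying silent.
    my $csv = Text::CSV_XS->new({ binary => 1, auto_diag => 1 })
        or die "Cannot create CSV parser: " . Text::CSV_XS->error_diag;

    open my $in, '<:encoding(UTF-8)', $csv_path
        or die "Cannot open $csv_path: $!";

    my $writer = XML::Writer->new(
        OUTPUT      => 'self',  # collect the output inside the writer object
        DATA_MODE   => 1,       # put each element on its own line
        DATA_INDENT => 1,       # indent nested elements by one space
    );
    $writer->xmlDecl('UTF-8', 'yes');
    $writer->startTag('Import');

    while (my $row = $csv->getline($in)) {
        $writer->startTag('Row');
        for my $i (0 .. $#$row) {
            # dataElement escapes &, < and > in the field value
            $writer->dataElement('Col' . ($i + 1), $row->[$i]);
        }
        $writer->endTag('Row');
    }
    close $in;

    $writer->endTag('Import');
    $writer->end;
    return $writer->to_string;
}
```

Calling csv_to_xml on csv/input.txt and printing the result to a filehandle opened with the :encoding(UTF-8) layer should give one Row element per CSV line. Because the element names are chosen in plain Perl code, a more complex custom XML layout would only need changes to the startTag and dataElement calls.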
--
No matter how great and destructive your problems may seem now, remember, you've probably only seen the tip of them. [1]

In reply to Convert CSV to XML by andreas1234567
