fmcroft92 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I’m very new to Perl and to programming generally. I have a real-world problem: I want to convert a CSV file into an XML file. The structure of the XML file I need to create can be found here: https://www.gov.uk/government/publications/co-ordinated-admissions-2021-series-17-files

Would anyone be able to give me any guidance on how I could go about this and where to start?

Thank you.

Replies are listed 'Best First'.
Re: CSV to XML
by kcott (Archbishop) on May 08, 2021 at 00:57 UTC

    G'day fmcroft92,

    Welcome to the Monastery.

    [I appreciate this is your first post here; however, there are some issues with what you've presented. You should provide a small, representative example of your input and the output you would expect from that. See "How do I post a question effectively?" for more details about that. URLs should be provided as links, not plain text — "What shortcuts can I use for linking to other information?" has details on how to achieve that. I followed the URL you provided but only found links to a variety of examples and you had not shown any indication which of these would be appropriate: I wasn't prepared to hunt around and make guesses — it would have been better to use the sample output for this or, if there was detailed information, a direct link to whatever you're working with. The whole idea is for you to do sufficient up-front work so we aren't left making assumptions about what you want; the end result being that you get better answers, and get them a lot more quickly. Please understand that this is not a rebuke but purely information: just keep it in mind for any future posts.]

    I put together the following script to give you an example of the type of code you may need.

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use autodie;

    use Text::CSV;
    use XML::LibXML;

    my $csv_file = 'pm_11132234_input.csv';
    my $xml_file = 'pm_11132234_output.xml';

    my $xml = XML::LibXML::Document::->new();
    my %element_for_header = qw{Name name Example value};

    {
        my $csv = Text::CSV::->new();
        open my $csv_fh, '<', $csv_file;

        my @elements = map $element_for_header{$_}, @{$csv->getline($csv_fh)};

        my $csv_element = $xml->createElement('csv');

        while (my $row = $csv->getline($csv_fh)) {
            my $row_element = $xml->createElement('row');

            for my $i (0 .. $#elements) {
                my $node = $xml->createElement($elements[$i]);
                $node->appendText($row->[$i]);
                $row_element->addChild($node);
            }

            $csv_element->addChild($row_element);
        }

        $xml->addChild($csv_element);
    }

    $xml->toFile($xml_file, 1);

    # Just for testing
    print "*** $csv_file ***\n";
    system cat => $csv_file;
    print "*** $xml_file ***\n";
    system cat => $xml_file;

    Output:

    *** pm_11132234_input.csv ***
    Name,Example
    plain,abc
    ampersand,x&y
    *** pm_11132234_output.xml ***
    <?xml version="1.0"?>
    <csv>
      <row>
        <name>plain</name>
        <value>abc</value>
      </row>
      <row>
        <name>ampersand</name>
        <value>x&amp;y</value>
      </row>
    </csv>

    Notes:

    • Use Text::CSV to read your input. It will handle the plethora of tricky issues that can occur when parsing CSV files. If you also have Text::CSV_XS installed, it will be used and the code will run faster.
    • Note how XML::LibXML does much of the work for you: <?xml ... created automatically; tags are closed; characters like & are converted to &amp;; output is formatted. Don't make a rod for your back and attempt to code all of this by hand.
    • See autodie and consider using it for general I/O exception handling. It's a lexically scoped pragma so you can turn it off temporarily if there are places where it's not wanted.
    • Note the mapping of CSV column names to XML element names. This would be important if column headers included spaces or other characters that were inappropriate for element names; for example, you might have mappings like "'First Name' => 'first_name'" or "'Capacity (%)' => 'capacity_pc'".
    • See open if you're unfamiliar with my usage. Always aim to use the 3-argument form of open along with lexical filehandles.
    • The code after "# Just for testing" is platform-dependent and only included for my demonstration purposes. It may not work on your operating system.
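
The header-to-element mapping described in the notes above could be sketched roughly as follows, using core Perl only. The helper name element_name_for and its fallback rule (lowercase the header, turn runs of other characters into underscores) are assumptions made for illustration and are not part of the script above; mappings such as 'Capacity (%)' => 'capacity_pc' cannot be derived mechanically and still need an explicit entry.

```perl
use strict;
use warnings;

# Explicit mappings for headers that cannot be derived mechanically.
# (This entry is illustrative only.)
my %element_for_header = (
    'Capacity (%)' => 'capacity_pc',
);

# Hypothetical helper: return an XML element name for a CSV column header.
# Only ASCII letters and digits are kept; anything else becomes an underscore.
sub element_name_for {
    my ($header) = @_;

    # Prefer an explicit mapping when one exists.
    return $element_for_header{$header}
        if exists $element_for_header{$header};

    my $name = lc $header;
    $name =~ s/[^a-z0-9]+/_/g;    # runs of other characters become one underscore
    $name =~ s/^_+|_+$//g;        # trim leading and trailing underscores

    # An element name may not start with a digit (or be empty),
    # so prefix an underscore in those cases.
    $name = "_$name" if $name !~ /^[a-z]/;

    return $name;
}

print element_name_for($_), "\n"
    for 'First Name', 'Capacity (%)', '2021 Intake';
```

In the script above, a helper like this would replace the direct %element_for_header lookup inside the map that builds @elements.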

    — Ken

Re: CSV to XML
by 1nickt (Canon) on May 07, 2021 at 20:27 UTC

    Hi fmcroft92, welcome to the Monastery and to Perl, the One True Religion.

    Your task is common and very simple using existing Perl tools. There will be a learning curve since you are new to the language and programming in general as you said. I suggest you break up your task into chunks and work on them individually rather than trying to get the whole thing done at once.

    The anonymonk already pointed you to the two standard toolkits you can use. Why not begin by reading your CSV file into a Perl data structure that you'll later convert to XML? You'll need to install Text::CSV_XS and write code like:

    use strict;
    use warnings;
    use Text::CSV_XS 'csv';
    use Data::Dumper;
    use feature 'say';

    my $array_of_hashes = csv( in => '/path/to/file.csv', headers => 'auto');
    say Dumper $array_of_hashes;

    # Now loop through each hash and convert to XML
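
The step left open by the final comment in the code above might look something like the sketch below. It assumes XML::LibXML is installed, uses a small made-up array of hashes in place of the csv() result, and uses placeholder element names ('rows', 'row'); it also assumes the hash keys are already valid XML element names.

```perl
use strict;
use warnings;
use XML::LibXML;

# Stand-in for the array of hashes returned by csv(); the values are made up.
my $array_of_hashes = [
    { name => 'plain',     value => 'abc' },
    { name => 'ampersand', value => 'x&y' },
];

my $doc  = XML::LibXML::Document->new('1.0', 'UTF-8');
my $root = $doc->createElement('rows');    # placeholder root element name
$doc->setDocumentElement($root);

for my $record (@$array_of_hashes) {
    my $row = $doc->createElement('row');  # placeholder row element name

    # Hash keys are unordered, so sort them to get repeatable output.
    for my $key (sort keys %$record) {
        my $field = $doc->createElement($key);
        $field->appendText($record->{$key});   # special characters are escaped for us
        $row->appendChild($field);
    }

    $root->appendChild($row);
}

# Passing 1 asks for indented output.
print $doc->toString(1);
```

Because a hash does not remember the column order of the CSV file, a real script would probably keep its own list of column names rather than sorting the keys.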

    Hope this helps!


    The way forward always starts with a minimal test.
Re: CSV to XML
by Anonymous Monk on May 07, 2021 at 19:59 UTC
Re: CSV to XML
by tangent (Parson) on May 07, 2021 at 20:25 UTC
    I have done something similar many times, and my experience has been that while you should use the Text::CSV module to parse the CSV input, the XML output is best produced within your script rather than by trying to use one of the XML modules to write it.

    If you can show us some of the CSV file you will be using I can give you an example script. Please ensure that you obscure any personal information.
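
For anyone weighing this approach, here is a minimal sketch, in core Perl only, of what producing the XML within the script involves. The helper name xml_escape, the element names, and the sample rows are all hypothetical. The main point it illustrates is that every text value has to be escaped by the script itself.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that are special in XML text
# and attribute values. All five are replaced in a single pass, so an
# ampersand introduced by one replacement is never escaped again.
sub xml_escape {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Made-up rows standing in for parsed CSV data.
my @rows = (['plain', 'abc'], ['ampersand', 'x&y']);

my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\n<csv>\n};
for my $row (@rows) {
    my ($name, $value) = map { xml_escape($_) } @$row;
    $xml .= "  <row>\n"
          . "    <name>$name</name>\n"
          . "    <value>$value</value>\n"
          . "  </row>\n";
}
$xml .= "</csv>\n";

print $xml;
```

This covers escaping only. Closing tags correctly, keeping element names valid, and writing the file in the declared encoding all remain the script's responsibility.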

      the XML output is best produced within your script rather than trying to use one of the XML modules to write it.

      No.