in reply to XML to CSV

It would be much easier to tell whether the XML file could even become a table if it were formatted more readably. However, the document looks a little too complex for CSV, especially if this apparent header information is important:
<ProtocolIDs>
  <PrimaryID><IDString>EORTC-06011</IDString></PrimaryID>
  <OtherID><IDType>Alternate</IDType><IDString>SUPERGEN-EORTC-06011</IDString></OtherID>
  <OtherID><IDType>Alternate</IDType><IDString>GMDSG-EORTC-06011</IDString></OtherID>
  <OtherID><IDType>ClinicalTrials.gov ID</IDType><IDString>NCT00043134</IDString></OtherID>
  <OtherID><IDType>Alternate</IDType><IDString>EudraCT-2005-002830</IDString></OtherID>
</ProtocolIDs>
Also, the records seem to carry a large amount of metadata. This XML document really doesn't look easy to flatten.

Re^2: XML to CSV
by Blue_eyed_son (Sexton) on Jan 29, 2007 at 22:49 UTC
    Hi Trizor--Thanks for your response. When I use XML::Simple, it successfully reads it in. Is there an easy way to transform the object I get from
    $data = $xml->XMLin("CDR256224.xml");

    into a tab-delimited or CSV file?
      No, it's not easy, because the difficulty is in the nature of the data. The record doesn't break down cleanly into simple columns: several of its elements seem to be able to occur a variable number of times. CSV files don't work well with a variable number of data points per record.
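To make the variable-count problem concrete, here is a minimal sketch of the usual workaround: pad every row out to the width of the widest record. The record layout, the column names, the second record's IDs, and the csv_line helper are all made up for illustration; they are not taken from your file.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical records: one primary ID each, plus any number of other IDs.
my @records = (
    { primary => 'EORTC-06011',  other => [ 'SUPERGEN-EORTC-06011', 'NCT00043134' ] },
    { primary => 'EXAMPLE-0002', other => [ 'ALT-0002' ] },
);

# The widest record decides how many "other" columns the table needs.
my $width = max( map { scalar @{ $_->{other} } } @records );

# Header row first, then one padded row per record.
my @rows = ( [ 'primary', map { sprintf 'other_%d', $_ } 1 .. $width ] );
for my $rec (@records) {
    my @other = @{ $rec->{other} };
    push @other, ('') x ( $width - @other );    # pad short records with empty cells
    push @rows, [ $rec->{primary}, @other ];
}

# Hypothetical helper: quote a field only if it holds a comma, quote or newline.
sub csv_line {
    return join ',', map {
        my $f = $_;
        if ( $f =~ /[",\n]/ ) { $f =~ s/"/""/g; $f = qq{"$f"} }
        $f;
    } @_;
}

print csv_line(@$_), "\n" for @rows;
```

The second record comes out with a trailing empty cell, which is exactly the awkwardness: every record pays for the widest one, and the column count changes whenever a wider record shows up.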

      To turn your object into a CSV you would first have to strip the header component of the document, then work out for each record which fields are variable, find which record has the greatest number of each, and size your CSV columns accordingly. If you only need the header data then I recommend something like XML::XPath to select the header nodes and extract their values, then assemble an AoA (array of arrays) and dump that to your CSV.
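As a rough sketch of the header-only route, assuming the header looks like the ProtocolIDs fragment quoted at the top of the thread: select the ID nodes with XML::XPath, collect them into an AoA, and write that out. The inline sample is trimmed to three IDs, the column names are my own, and using Text::CSV for the output is my choice rather than anything the data requires.

```perl
use strict;
use warnings;
use XML::XPath;
use Text::CSV;

# Trimmed sample of the header fragment; a real run would pass
# filename => 'CDR256224.xml' to XML::XPath->new instead.
my $xml = <<'XML';
<ProtocolIDs>
  <PrimaryID><IDString>EORTC-06011</IDString></PrimaryID>
  <OtherID><IDType>Alternate</IDType><IDString>SUPERGEN-EORTC-06011</IDString></OtherID>
  <OtherID><IDType>ClinicalTrials.gov ID</IDType><IDString>NCT00043134</IDString></OtherID>
</ProtocolIDs>
XML

my $xp = XML::XPath->new( xml => $xml );

# One row per ID: header row, then the primary ID, then each OtherID.
my @aoa = ( [ 'IDType', 'IDString' ] );
push @aoa, [ 'Primary', $xp->findvalue('//ProtocolIDs/PrimaryID/IDString')->value ];

for my $node ( $xp->findnodes('//ProtocolIDs/OtherID') ) {
    # Relative paths are evaluated against the current OtherID node.
    push @aoa, [
        $xp->findvalue( 'IDType',   $node )->value,
        $xp->findvalue( 'IDString', $node )->value,
    ];
}

# Dump the AoA; Text::CSV takes care of quoting.
my $csv = Text::CSV->new( { binary => 1, eol => "\n" } )
    or die 'Cannot use Text::CSV: ' . Text::CSV->error_diag;
$csv->print( \*STDOUT, $_ ) for @aoa;
```

Writing one ID per row sidesteps the variable-width problem for the header, at the cost of no longer having one row per protocol. Whether that trade is acceptable depends on what will read the CSV.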

      Basically, the data is too complex for a CSV to handle easily, and you'll need to make some representation and implementation choices.