perlParse has asked for the wisdom of the Perl Monks concerning the following question:

I have a few XML files generated by running a Perl script on a document. These XML files list the properties of the document. For example, the properties of a Word document, such as author, title, creation date, modification date and size, are listed in the XML. Now I have to compare this XML with an Excel file in which I have manually written the document properties. The purpose is to test whether the Perl script generates the proper XML file, by comparing it with the Excel file, which has the property names in columns and the values in rows. For this I am thinking of converting the Excel file to XML and then comparing the two XML files. Is my approach correct, and how do I code this in Perl?

Replies are listed 'Best First'.
Re: excel to xml conversion and then comparison with another XML
by mscharrer (Hermit) on Apr 24, 2008 at 08:57 UTC
    In my opinion Perl should be used to test if MS Excel is producing valid XML, not the other way around.

    If you would like to test your script output, which is a noble thing, then I would code the reference XML either by hand or with a more standard tool than Excel.

    You can compare both XML files like you said. I would use XML::Simple to read both XML files into separate hashes, which can then be compared. Test::More gives you the function is_deeply to do this, but I'm not sure if you can easily use it in a non-test Perl script. On the other hand, your script is basically a test.
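
    A minimal sketch of that idea, assuming XML::Simple is installed (it is not a core module). The two XML strings and their element names are made up for illustration; in practice you would pass file names to XMLin instead.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);
use Test::More;

# Hypothetical sample documents (not from the original post).
my $generated_xml = '<document><author>Jane</author><title>Report</title><size>1024</size></document>';
my $reference_xml = '<document><title>Report</title><size>1024</size><author>Jane</author></document>';

# Read each document into a plain hash reference keyed by element name.
my %opts = (ForceArray => 0, KeyAttr => []);
my $generated = XMLin($generated_xml, %opts);
my $reference = XMLin($reference_xml, %opts);

# is_deeply walks both structures and reports the first difference it finds.
my $same = is_deeply($generated, $reference, 'generated XML matches the reference');

done_testing();
```

    Test::More works in an ordinary script as long as done_testing is called at the end. Because the elements end up as hash keys, this comparison ignores the order in which the elements appear.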

      Do you mean I should convert the XML file which is generated as the output of the Perl script to Excel, and then compare both spreadsheets, i.e. the spreadsheet which I created manually with the one converted from XML? Would this be more convenient than comparing two XML files?
        No, comparing the two XML files would be better. My point was that you are taking the XML generated by Excel as the reference, i.e. you assume that Excel will produce correct, valid XML, and that any difference from the Perl XML would mean that the Perl script made an error. I personally trust the Perl XML generator more than Excel, so I would rather take the Perl XML output as the reference for checking whether Excel is generating correct XML.

        But don't start a Linux/Windows or Perl/MS Office flame war here ... you can use the modules I mentioned in my last post to compare the two XML files, you just have to interpret the results critically.

Re: excel to xml conversion and then comparison with another XML
by mirod (Canon) on Apr 24, 2008 at 09:35 UTC

    It's a bit of a weird thing you're doing. Why create the control data in Excel instead of in a more programmer-friendly format?

    One way to do what you want would be to use DBD::AnyData to read both the XML and the Excel file (exported as CSV, maybe using Spreadsheet::ParseExcel), and compare them.

Re: excel to xml conversion and then comparison with another XML
by Jenda (Abbot) on Apr 24, 2008 at 12:18 UTC

    Compare the data, not the XML! You know what differences are meaningful, not some generic XML comparison tool/function. For example are the following two snippets equivalent?

    <foo> <bar>1</bar> <baz>2</baz> </foo>
    <foo> <baz>2</baz> <bar>1</bar> </foo>

    Well, that depends. The order of the tags may matter, or it may be completely irrelevant.

    Parse the XML files you need to verify, parse the Excel file (save it as CSV, that's gonna be easier to work with) and compare the data that should be the same.
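
    A rough sketch of such a data-level comparison, using only core Perl. The CSV layout (one header row of property names, one row of values, no quoted or embedded commas), the sample values and the helper name compare_properties are all assumptions for illustration; a real CSV export is better read with a proper CSV module.

```perl
use strict;
use warnings;

# Hypothetical CSV export of the spreadsheet: a header row of property
# names followed by one row of values. Assumes no quoted or embedded commas.
my $csv = "author,title,size\nJane,Report,1024\n";

# Hypothetical properties already extracted from the generated XML.
my %from_xml = (author => 'Jane', title => 'Report', size => '2048');

# Turn the two CSV rows into a property => value hash.
my ($header, $row) = split /\n/, $csv;
my @names  = split /,/, $header;
my @values = split /,/, $row, -1;    # -1 keeps trailing empty fields
my %expected;
@expected{@names} = @values;

# Compare property by property and collect readable differences.
sub compare_properties {
    my ($expected, $got) = @_;
    my %seen = map { $_ => 1 } keys %$expected, keys %$got;
    my @diffs;
    for my $name (sort keys %seen) {
        if (!exists $got->{$name}) {
            push @diffs, "$name: missing from XML";
        }
        elsif (!exists $expected->{$name}) {
            push @diffs, "$name: not listed in spreadsheet";
        }
        elsif ($expected->{$name} ne $got->{$name}) {
            push @diffs, "$name: expected '$expected->{$name}', got '$got->{$name}'";
        }
    }
    return @diffs;
}

my @diffs = compare_properties(\%expected, \%from_xml);
print "$_\n" for @diffs;    # prints: size: expected '1024', got '2048'
```

    Since you write the comparison yourself, you decide which differences count: here values are compared as strings and order is ignored, but a date or size field could just as well get its own rule.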