in reply to Read Excel cell comments?

If you have an old Excel and you want to make XML, you can use OpenOffice instead. OpenOffice is a free download.

See my page (external link) Perl and OpenOffice. This page links to a tutorial and various example programs. I just checked, and the OpenOffice format does include the comments in the XML. Here is the synopsis: save the Excel file as a .ods file, unzip it, and parse content.xml.
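The unzip step of that synopsis can be sketched in a few lines. This is only a sketch under assumptions: it uses the core module IO::Uncompress::Unzip rather than anything from the tutorial, and read_content_xml is a name made up for illustration.

```perl
use strict;
use warnings;
use IO::Uncompress::Unzip qw(unzip $UnzipError);

# Hypothetical helper: return the content.xml member of an .ods file
# as a single string. An .ods file is an ordinary zip archive, and the
# spreadsheet data (including cell comments) is stored in content.xml.
sub read_content_xml {
    my ($ods_path) = @_;
    my $xml;
    # Name selects one member of the archive instead of the first one.
    unzip($ods_path => \$xml, Name => 'content.xml')
        or die "Cannot read content.xml from $ods_path: $UnzipError\n";
    return $xml;
}
```

Calling read_content_xml('book.ods') (file name assumed) returns the raw XML, ready to hand to whichever XML parser you prefer.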

It should work perfectly the first time! - toma

Re^2: Read Excel cell comments?
by AlwaysLearning (Sexton) on Apr 04, 2007 at 08:46 UTC
    Indeed!

    I had recently installed OpenOffice figuring to try it out sometime, so this became the time...

    It took two minutes to convert from .xls to .xml (the spreadsheet is about 4MB), and it is a manual step. So of the 9 minutes of remaining run time, reading the .xml could save at most 7, because the .xls has to be converted to .xml first.

    Is there anything around that is similar to Spreadsheet::ParseExcel, for the .xml files, or would this be a brand new project starting from scratch (and the XML parser)?

    Might it not be easier to add comment parsing to Spreadsheet::ParseExcel? I haven't looked at its internals, and I currently know nothing of the Excel file format except that it seems to be documented somewhere, using a format called BIFF (which suggests a possible relationship to something called TIFF, but that may be coincidental)... so that might be like a brand new project starting from scratch too...

      You can convince OpenOffice to read the file and save the XML from the command line. You will have to write a few lines of code in the OpenOffice scripting language to do this. An example is in the paper on the web page that I referenced above, 'Using Perl to Read and Write OpenOffice Documents'. See Section #2 of the paper, 'Generating Web Banners', for example code and references for automating OpenOffice tasks.

      For XML parsing I used XML::Twig. There are examples of how to do this in the paper. It is not hard, but XML is a large topic. You could easily use one of the many simpler XML modules for this task.
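      As a rough illustration of the XML::Twig approach (not the code from the paper), the sketch below pulls the comments out of a content.xml string. It assumes the comments are stored as office:annotation elements with a dc:creator child and one text:p child per paragraph, which is how the OpenDocument format lays them out as far as I know. The name cell_comments is made up for this example.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: collect the cell comments found in an
# OpenDocument content.xml string. Returns a list of hashrefs of the
# form { author => ..., text => ... }.
# Assumption: comments are <office:annotation> elements holding a
# <dc:creator> child and one <text:p> child per paragraph.
sub cell_comments {
    my ($xml) = @_;
    my @comments;
    my $twig = XML::Twig->new(
        twig_handlers => {
            # Called once for each completely parsed annotation element.
            'office:annotation' => sub {
                my ($t, $note) = @_;
                my $author = $note->first_child_text('dc:creator');
                my $text   = join "\n",
                             map { $_->text } $note->children('text:p');
                push @comments, { author => $author, text => $text };
            },
        },
    );
    $twig->parse($xml);
    return @comments;
}
```

      Each returned hashref can then be printed or stored as needed. The sketch does not report which cell a comment belongs to; that needs extra bookkeeping for repeated rows and columns.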

      What you want to do is easy with this approach. For example, I wrote a program to extract the speaker notes from a PowerPoint presentation. See Section #1 of the paper for details. It took 20 minutes to implement the whole thing. Learning how to use this approach took much longer, but my paper should have enough to get you started.

      It should work perfectly the first time! - toma