novas007 has asked for the wisdom of the Perl Monks concerning the following question:

I'm looking for a good method for translating an Excel database in this format:

Lastname | Firstname | valuea | valueb | valuec
Lastname | Firstname | valuea | valueb | valuec
    |
   V

(with 4 sheets) to this flat-file format:

lastname,firstname,valuea,valueb,valuec\n
lastname,firstname,valuea,valueb,valuec\n

I've looked at the program xlhtml (with the -xml flag) along with XML::Parser and XML::Simple, but I still don't think XML is the best way to go, mainly because I'm on a deadline and xlhtml's output is really hard to parse the way I want.

Any clues/hints/hits_to_the_head would be appreciated. Thanks in advance.

Replies are listed 'Best First'.
Re: Excel - usable format
by {NULE} (Hermit) on Nov 06, 2001 at 03:57 UTC
    Hi,

    Your question is a little ambiguous, but if you are dealing with Excel there are several excellent modules that can help you. The only question I have is whether you actually need Perl for this task. Is this something you are automating, or can you just save it as a CSV file?

    Spreadsheet::ParseExcel is reported to do a good job of reading Excel documents. I hear it copes well with things like newlines embedded in cells, which are otherwise hard to code for.
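
    A minimal sketch of that approach, not a tested solution for the original file: it assumes a reasonably recent Spreadsheet::ParseExcel (the parse/worksheets interface) plus Text::CSV, both from CPAN, and a binary .xls workbook. The sub name workbook_to_csv and the file name in the usage note are made up for illustration.

```perl
use strict;
use warnings;

# Write every row of every sheet in $xls_path to $out_fh as CSV.
# Assumption: Spreadsheet::ParseExcel and Text::CSV are installed from CPAN.
# They are loaded inside the sub so this file still compiles without them.
sub workbook_to_csv {
    my ($xls_path, $out_fh) = @_;
    require Spreadsheet::ParseExcel;
    require Text::CSV;

    my $parser   = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->parse($xls_path)
        or die "Cannot parse $xls_path: ", $parser->error(), "\n";

    # binary => 1 allows embedded newlines; eol appends "\n" to each record.
    my $csv = Text::CSV->new({ binary => 1, eol => "\n" })
        or die "Cannot create Text::CSV object\n";

    for my $sheet ($workbook->worksheets()) {
        my ($row_min, $row_max) = $sheet->row_range();
        my ($col_min, $col_max) = $sheet->col_range();
        for my $row ($row_min .. $row_max) {
            my @fields = map {
                my $cell = $sheet->get_cell($row, $_);
                defined $cell ? $cell->value() : '';    # empty cells come back undef
            } $col_min .. $col_max;
            $csv->print($out_fh, \@fields);
        }
    }
    return;
}
```

    Calling workbook_to_csv('names.xls', \*STDOUT) would then print all sheets as one CSV stream; Text::CSV handles the quoting of commas, quotes and embedded newlines.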

    I can personally vouch for Spreadsheet::WriteExcel as being a very reliable tool for creating Excel (and StarOffice) readable spreadsheets. My co-workers looked at me bug-eyed in awe when I showed them that Perl could output Excel. "Yeah, Perl can do that," is my motto these days. :)

    Good luck,
    {NULE}
    --
    http://www.nule.org

Re: Excel - usable format
by dws (Chancellor) on Nov 06, 2001 at 03:54 UTC
    I'm looking for a good method for translating an excel database in this format: ...

    If this is a one-shot conversion, you can manually save each sheet as a CSV (comma-separated value) file, then process them either directly or via File::CSV.

    If this is something you'll be doing frequently, consult OLE - Getting all rows from Excel for some example code that'll get you 95% of the way there.

Re: Excel - usable format
by Fastolfe (Vicar) on Nov 06, 2001 at 03:46 UTC
    I apologize if this is somewhat of a stupid question, but is opening up this file in Excel and clicking "File | Save as... | Save as type: CSV" not an option?

    I believe there are a few Excel:: modules that will allow you to read and write Excel data. You might be able to export this.

    If you literally mean you have input data with the format you mention above (thus making it pipe-delimited ASCII data, not an "Excel database"), it should be just a matter of splitting on the |'s and re-writing with commas, yes? There's some escaping to do (which Text::CSV can probably help with), but it should be fairly straightforward.
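
    If the input really is pipe-delimited text, the split-and-rewrite step might look like the sketch below. It uses only core Perl and does its own minimal CSV quoting; the sub names csv_field and pipe_to_csv are made up for illustration, and Text::CSV would be the more robust choice for the escaping.

```perl
use strict;
use warnings;

# Quote one field for CSV: wrap it in double quotes (doubling any embedded
# quotes) only when it contains a comma, a quote, or a line break.
sub csv_field {
    my ($field) = @_;
    if ($field =~ /[",\r\n]/) {
        $field =~ s/"/""/g;
        return qq{"$field"};
    }
    return $field;
}

# Turn one pipe-delimited line into one CSV line (no trailing newline).
sub pipe_to_csv {
    my ($line) = @_;
    chomp $line;
    # The -1 limit keeps trailing empty fields, which split drops by default.
    my @fields = split /\|/, $line, -1;
    s/^\s+|\s+$//g for @fields;    # trim the padding around each "|"
    return join ',', map { csv_field($_) } @fields;
}

# Sample row in the format shown in the question.
print pipe_to_csv('Lastname | Firstname | valuea | valueb | valuec'), "\n";
```

    To convert a whole file, the same sub can be applied line by line, for example print pipe_to_csv($_), "\n" while <STDIN>;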

    I apologize if I misunderstood your question.

      You can also create an HTML representation of an Excel document that Excel can load. The advantage of this approach over the CSV (Text::CSV) format is that you get more control over the appearance of the spreadsheet: column widths and fonts are controllable, and you can include formulas. The reason for doing this is to create Excel output from a program that is not Excel.

      Try this:

      1. Using Excel, create a simple version of the spreadsheet you want to produce and SaveAs 'html'. If you open the result with Notepad you will see a text format that a program can produce.
      2. Rename the file with an .xls extension. When you open it, Excel opens it just as if it were an Excel file.
      3. Create your own file using the format you saw in step 1, substituting your own values.

      Problems: if you have data in more than one worksheet in your Excel file, SaveAs 'html' produces a file plus a folder with a bunch of other files, which makes things much more complicated. In my case I am trying to make email attachments, and the multi-file version is not good for that.

      Conclusion: if you are creating something with just one worksheet, or have a more robust way of distributing your result, the HTML format works well.
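
      A rough sketch of the single-worksheet case in core Perl follows. It emits only a bare HTML table, not the richer markup that Excel's own SaveAs produces, so widths, fonts and formulas would still have to be copied from a file saved by Excel. The sub names are made up for illustration, and some Excel versions may warn when a file's extension does not match its content.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build a one-table HTML page from a list of rows (array refs of cell values).
sub rows_to_html_table {
    my (@rows) = @_;
    my $html = "<html><body><table>\n";
    for my $row (@rows) {
        my @cells = map { '<td>' . html_escape($_) . '</td>' } @$row;
        $html .= '<tr>' . join('', @cells) . "</tr>\n";
    }
    return $html . "</table></body></html>\n";
}

# Sample rows; redirect the output to a file to try opening it in Excel.
print rows_to_html_table(['Lastname', 'Firstname'], ['Smith', 'John']);
```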
Re: Excel - usable format
by cacharbe (Curate) on Nov 06, 2001 at 06:46 UTC
    I'm going to ask a couple questions here, for clarification.

    • What platform are you on?
    • How many files are there?

    If you're on Win32, try Win32::OLE... for the challenge, if nothing else *grin*.

    Otherwise the two Spreadsheet:: modules listed above will help greatly. (I'm also going to take this opportunity to suggest that you take a look at my Scratch Pad, where I've been putting together a bit of a FAQ/How-To regarding Win32::OLE and Excel - I'll post it under Tutorials when it's done, comments welcome).

    If it's just one file, why automate? Go through each sheet, SaveAs CSV, append them together, and move on.
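
    The "append them together" step is simple enough to do by hand, but as a sketch in core Perl it could look like this. The sub name append_csv_files is made up for illustration, and the code assumes none of the per-sheet files has a header row that needs skipping.

```perl
use strict;
use warnings;

# Append the contents of each input file to $out_path, in the order given.
# Assumes no file has a header row; every line is copied as data.
sub append_csv_files {
    my ($out_path, @in_paths) = @_;
    open my $out, '>', $out_path or die "Cannot write $out_path: $!\n";
    for my $in_path (@in_paths) {
        open my $in, '<', $in_path or die "Cannot read $in_path: $!\n";
        while (my $line = <$in>) {
            $line =~ s/\r?\n\z//;        # normalise DOS or Unix line endings
            print {$out} $line, "\n";    # and guarantee a final newline
        }
        close $in;
    }
    close $out or die "Cannot close $out_path: $!\n";
    return;
}
```

    With four sheets saved as, say, sheet1.csv through sheet4.csv (hypothetical names), append_csv_files('all.csv', map { "sheet$_.csv" } 1 .. 4) would produce the combined flat file.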

    C-.

Re: Excel - usable format
by novas007 (Acolyte) on Nov 07, 2001 at 01:03 UTC
    In response to the Excel->CSV suggestions: there's a slight possibility that the file could be saved in CSV format, but frankly, I don't trust the inputters to follow directions (you'd have to see them to believe me). I'll look into the modules mentioned and try them out. Thanks :)