in reply to Waiting ...

While Spreadsheet::ParseExcel is one option, there are others, such as opening the input data in Excel, saving the original spreadsheet as a .csv, and parsing that.

A text editor can help with simple changes, such as removing extra heading rows and superfluous rows at the bottom. But you can use the power of Perl to really munge the data; Perl is fast at reading large text files, building hashes, and so on.
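As a minimal sketch of that approach, here is how a Perl script might read exported .csv rows into a hash keyed on the first column. The route/platform data is made up for illustration, and this naive `split` handles only simple CSV with no embedded commas or quotes; for real data a module like Text::CSV would be safer.

```perl
use strict;
use warnings;

# Stand-in for the exported .csv file; in practice you would
# open the file on disk instead of this in-memory handle.
my $csv = <<'END';
route1,09:30,platform 4
route2,10:15,platform 2
END

my %rows;
open my $fh, '<', \$csv or die "open: $!";
while (my $line = <$fh>) {
    chomp $line;
    my @fields = split /,/, $line;       # naive: no quoted fields
    $rows{ $fields[0] } = \@fields;      # index rows by first column
}
close $fh;

print $rows{route2}[1], "\n";            # prints "10:15"
```

Once the rows are in a hash like this, cross-referencing between sheets becomes a matter of hash lookups rather than repeated file scans.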

I hope I haven't missed the point of what you are doing.

--
Misquoting Janis Joplin:


Oh Lord, won’t you burn me a Knoppix CD ?
My friends all rate Windows, I must disagree.
Your powers of persuasion will set them all free,
So oh Lord, won’t you burn me a Knoppix CD ?

Replies are listed 'Best First'.
Re^2: Waiting ...
by northwind (Hermit) on Jun 01, 2005 at 11:05 UTC

    Splitting out to .csv files is a good idea, but it does have some issues.  First, Excel can only save one sheet at a time as a .csv; manually saving out 200+ sheets would be mind-numbingly painful (hopefully VB would/should be able to automate this task).  Second, the resulting 200+ files come close to the maximum number of open file handles a 32-bit Perl can handle (depending on how many derivative files are created, it could be uncomfortably close).  The number of file handles becomes an issue because it sounds like the OP is doing some massive cross-referencing between sheets (meaning most, if not all, of the .csv files would need to be open at once).  Seconding a previous post, this file should have been entered into a database long ago...
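    On the automation point: rather than VB, the per-sheet export can be scripted from Perl itself with Win32::OLE. This is only a sketch, assuming Windows with Excel installed, and the workbook path and output directory below are placeholder names; it is not tested against the OP's actual file.

    ```perl
    use strict;
    use warnings;
    use Win32::OLE qw(in);
    use Win32::OLE::Const 'Microsoft Excel';   # imports xlCSV, etc.

    my $excel = Win32::OLE->new('Excel.Application')
        or die "Cannot start Excel: ", Win32::OLE->LastError;
    $excel->{DisplayAlerts} = 0;               # suppress overwrite prompts

    # Hypothetical paths for illustration.
    my $book = $excel->Workbooks->Open('C:\\data\\workbook.xls');

    # Save each worksheet out as its own .csv file.
    for my $sheet (in $book->Worksheets) {
        my $name = $sheet->{Name};
        $sheet->SaveAs("C:\\data\\csv\\$name.csv", xlCSV);
    }

    $book->Close(0);                           # close without saving changes
    $excel->Quit;
    ```

    Note that the file-handle concern only bites if all 200+ .csv files are held open simultaneously; reading each one fully into a hash and closing it before opening the next sidesteps the limit.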

      That is exactly what I did. All sheets had one of two distinct formats, and I read them and joined them into two .csv files.
      Doing this manually, however, was not an option, as that kind of manual data processing is exactly what is so mind-numbing and what I hate.

      Cheers,
      PerlingTheUK
Re^2: Waiting ...
by PerlingTheUK (Hermit) on Jun 01, 2005 at 22:06 UTC
    You got the point exactly. I had this Excel spreadsheet and had to do some clever data matching. Just reading Excel is too slow, so I wanted to get it into a .csv format. But 2 GB of memory is not sufficient to open a 66 MB Excel file in Spreadsheet::ParseExcel, so I did this directly via Win32::OLE, which opens Excel and communicates with it directly. However, communicating and formatting 12 MB (the actual text data) took two long hours to parse.
    It just strikes me that this is quite slow; as a CSV transfer rate it equals about 13.6 kbps. Why do I think Office is a big pile of s+$%^?

    Cheers,
    PerlingTheUK
      It just strikes me that this is quite slow; as a CSV transfer rate it equals about 13.6 kbps. Why do I think Office is a big pile of s+$%^?

      I wonder if OpenOffice Calc would fare any better, as it can read and write .xls format.
