Re^3: Out of Memory

Here is how I see it.

Given that all you need is to get the XLSX data (probably some kind of DB values), just do not use this CPAN module; extract your data yourself.
Be your-application-centric.

But before doing that, drop a line to the CPAN author; he may be responsive and provide you with a solution soon.
Otherwise, just reuse his code for "unzipping" the content, and then apply your own regular expression.

But better than that: feed your resulting XML string, together with a properly constructed XPath expression, to XML::LibXML. It is very efficient with XPath expressions, but other modules dealing with XPath will also suffice.
(Personally, I have had good experience with the one mentioned, and TIMTOWTDI.)
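
A minimal sketch of the XPath idea, in case it helps. The inline XML is a made-up stand-in for an unzipped worksheet, and the %cell hash is just an illustrative name; only the XML::LibXML calls and the SpreadsheetML namespace URI are real. Note that load_xml builds the whole DOM in memory, so for a very large sheet this alone may not cure an "Out of Memory".

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Made-up sample standing in for the XML of one unzipped worksheet.
my $xml = <<'XML';
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1"><c r="A1"><v>10</v></c><c r="B1"><v>20</v></c></row>
    <row r="2"><c r="A2"><v>30</v></c></row>
  </sheetData>
</worksheet>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# The worksheet lives in a default namespace, so XPath needs a prefix for it.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(x => 'http://schemas.openxmlformats.org/spreadsheetml/2006/main');

# Map each cell reference (A1, B1, ...) to the text of its <v> child.
my %cell;
for my $c ($xpc->findnodes('//x:sheetData/x:row/x:c')) {
    $cell{ $c->getAttribute('r') } = $xpc->findvalue('x:v', $c);
}

print "$_ = $cell{$_}\n" for sort keys %cell;
```

Cells that refer to shared strings hold only an index in <v>, so a real extractor would also have to look those up; that part is left out here.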

This would be the best way out of this situation, as I see it.

Is 40MB the size of the zipped XLSX, or is it the size after unpacking?

Re^4: Out of Memory by ETLTCHFIG (Novice) on May 05, 2011 at 19:45 UTC
40MB is the zipped XLSX file size. Thanks for the guidance, vkon! I am trying to take the code from the new method of the XLSX CPAN module and put my functionality in there, so I don't have to store anything in memory.
Re^4: Out of Memory by ETLTCHFIG (Novice) on May 05, 2011 at 20:31 UTC
OK, I think I found the problem. In XLSX.pm the method new issues the following read:

    foreach ($member_sheet -> contents =~ /(\<.?\/?\>\|.?(?=\<))/g) {

This tries to cache the whole worksheet at once; one of the worksheets in my XLSX file, the one throwing the "Out of Memory" error, has a million rows. Is there a way I can change this to read line by line?
Re^5: Out of Memory by vkon (Curate) on May 06, 2011 at 06:17 UTC
Optimizing that kind of construct is one of the ways out of your situation. The answer to your question is 'yes'.
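
One possible shape for that, as a rough sketch only: read the worksheet XML from a filehandle in fixed-size chunks and hand each complete <row> element to a callback, so that only one chunk plus one unfinished row sits in memory. The helper name each_row, its chunk-size parameter and the sample data are all made up for illustration, and this is not the module's own code. It assumes you can get a filehandle on the sheet XML, for instance by extracting that member to a temporary file first.

```perl
use strict;
use warnings;

# Hypothetical helper: calls $on_row->($row_xml) for every complete <row>
# element found in $fh, reading $chunk_size bytes at a time.
sub each_row {
    my ($fh, $on_row, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;
    my $buf = '';
    while (read($fh, my $chunk, $chunk_size)) {
        $buf .= $chunk;
        # Cut every finished row (self-closing or <row>...</row>) off the
        # front of the buffer; an unfinished tail stays for the next chunk.
        while ($buf =~ s{\A.*?(<row\b[^>]*?(?:/>|>.*?</row>))}{}s) {
            $on_row->($1);
        }
    }
}

# Demo on an in-memory handle, with a tiny chunk size to force rows to
# straddle chunk boundaries.
my $xml = '<sheetData>'
        . '<row r="1"><c r="A1"><v>10</v></c><c r="B1"><v>20</v></c></row>'
        . '<row r="2"/>'
        . '<row r="3"><c r="A3"><v>30</v></c></row>'
        . '</sheetData>';
open my $fh, '<', \$xml or die "cannot open in-memory handle: $!";

my @rows;
each_row($fh, sub { push @rows, [ $_[0] =~ m{<v>(.*?)</v>}g ] }, 7);
close $fh;

print join(',', @$_), "\n" for @rows;
```

The demo collects the rows into @rows only to show the result; for a million-row sheet the callback would process or write out each row straight away instead of storing it.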