You are running out of memory because your program requires more virtual memory to complete, than your computer is capable of supplying!
At byte offset 113864 of your equation (the original, fully expanded one with numeric symbols):
...,15618,+,15619,+,15620,15621,15622,+,x,+,x,+,x,+,x,15623,15624,1562 +5,... 113864..............................................^
Where (using my best yet compression technique described below), your equation calls for the multiplication of two additive (+) terms containing 22665 & 186 subterms respectively resulting in a subterm containing 3.8 million terms.
This subterm is next used at offset 113952:
...,x,15630,x,15631,x,15632,x,15633,x,+,x,+,x 113952..................................^
where it is multiplied by another with 6 terms to produce a resultant containing 22,846,320 (+)terms. But, at the point where this caclulation starts, even with compression and token cache substitutions, the process is already consuming 1.7GB!
I estimate that it would require at least 6GB to complete that expansion. And there are two more to go! Obviously, with 32-bit machines being limited to (at most) 3GB of virtual memory per process, this is not going to work.
My DB/SQL skills are more dormant than real at this stage, but I do not see any easy (or hard) way of using a DB for this. Maybe if you re-described the operation of your postfix expression evaluator, someone with greater DB foo than me could see a way to approach this. I do not expect it would be very fast though.
One possibility is to move your application to a well specified 64-bit machine with say 8GB of ram.
Another it to start caching the intermediate terms to disk. Besides being very slow, the more I think about this, the harder it gets. Essentially you would have to load the smaller of the next two terms from a disk based stack(??How to implement??), into memory, break it to its subterms, and then read the larger in, subterm by subterm, and output the result back to disk on the fly. Distinctly non-trivial.
The final possibility that I see, and have been pursuing as time permitted, is to find a more aggressive compression strategy than I have been using till now.
Any subterm of of a subequation like 1,2,3,4,5,6+1,2,3,4,5,6,7,8,9+7,8,9 can be reduced (in size) by caching the common subsequences of multiplicative terms and subtituting a symbol. So the above caches 1,2,3,4,5,6 and assigns symbol A; caches 7,8,9 as B; and the result becomes A+A,B+B. And the multiplicative terms A,B can in turn be cached and C substituted.
Once the calculation is complete, these substitutions can be (recursively) expanded on-the-fly as the result is written to disk.
The problem is, even with this symbol caching, the equation as the point outlined above consisting of purely additive terms. Eg. PQRS+HIJK+ABCD...(22,million) and it is still breaking the memory bank. And you cannot do symbol substitution on sequences of additive terms because multiplication needs access to the individual elements.
However, the largest cache token I've seen in use is only four characters (eg.'PQRS'), which means there must (at the point I've managed to get to so far), be less than 456,976 (26^4) unique terms in the cache. And as the largest subequation (so far) contains 3.8 million terms, there must be many duplicates. Eg.
ABCD+IJKL+ABCD+PQRS+ABCD+IJKL...
Which could be reduced to:
(ABCD*3)+(IJKL*2)+(PQRS*1)...
and those subterms can safely be used in subsequent calculations for a further considerable reduction in memory usage. They can then be expanded on output as above.
I was still trying to a) convince myself this was correct; b) work out how to code it; when I saw your latest post.
I also wonder quite what you are going to do with the results of this? I don't see that the human brain is going to derive much information from a several gigabytes of symbolic mush.
I think I recall you mentioning Excel? I seriously doubt Excel's capability to even load this much data on a 32-bit machine, never mind do anything useful with it?
In reply to Re: Out of Memory 2.
by BrowserUk
in thread Out of Memory 2.
by dneedles
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |