in reply to How to resolve this memory issue

Three possible approaches:

  1. Buy another 8GB DIMM for your server.

    At ~£70/$100, this is quick, simple and very cheap.

    Datasets only ever seem to get bigger, so this would somewhat future-proof you.

  2. Get cleverer about the way you build your indexes. (You currently use hashes for this.)

    Depending upon your data, there may be less memory-intensive ways of building your indexes.

    A few (real or realistic) examples of the data, showing the datatype (string/real/integer) of the keys and the size and nature of the values, would be far more useful than your script, which I think you've adequately described.

    This should result in equally fast (or possibly faster) processing; but it requires 'cooperative data', so may not be applicable; and it requires some rework of your script, though the basic structure would remain the same. (See the first sketch below.)

  3. Process each of your fields in a separate pass of the script, producing an intermediate output file per field, and then use a final pass over those intermediate files to merge them.

    Slow. Quite a lot of work. (See the second sketch below.)

A fourth approach would be to use a database, but I'll let others tell you about that.
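
Since you mention hashes, I'm assuming Perl. Here's a minimal sketch of one 'cooperative data' possibility for option 2: if your keys are (or can be mapped to) non-negative integers and the values are file offsets, you can pack the whole index into a single string and binary-search it -- roughly 8 bytes per entry instead of the 100+ bytes each hash entry costs. The data.tsv filename and the id-first record layout are made up for illustration:

    #!/usr/bin/perl
    # Sketch only: maps non-negative integer IDs to file offsets using one
    # packed string instead of a hash. 'data.tsv' and its "id<TAB>rest"
    # layout are hypothetical. Use 'Q' instead of 'N' for files > 4GB.
    use strict;
    use warnings;

    my $idx = '';                                # (id, offset) pairs, 8 bytes each
    open my $fh, '<', 'data.tsv' or die "data.tsv: $!";
    while( my $line = <$fh> ) {
        my $offset = tell( $fh ) - length $line; # byte where this record starts
        my( $id ) = $line =~ /^(\d+)\t/ or next;
        $idx .= pack 'NN', $id, $offset;
    }
    close $fh;

    # Sort the fixed-width entries by id so they can be binary searched.
    $idx = pack '(NN)*',
           map  { @$_ }
           sort { $a->[0] <=> $b->[0] }
           map  { [ unpack 'NN', $_ ] } unpack '(a8)*', $idx;

    # Binary search: return the file offset for $want, or undef if absent.
    sub lookup {
        my( $want ) = @_;
        my( $lo, $hi ) = ( 0, length( $idx ) / 8 - 1 );
        while( $lo <= $hi ) {
            my $mid = int( ( $lo + $hi ) / 2 );
            my( $id, $off ) = unpack 'NN', substr $idx, $mid * 8, 8;
            return $off if $id == $want;
            $id < $want ? ( $lo = $mid + 1 ) : ( $hi = $mid - 1 );
        }
        return;
    }

Lookups stay O(log n); and if your keys are strings rather than integers, the same trick works with fixed-width (padded) keys and a string compare in the search.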
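
And a sketch of option 3: one pass per field, each writing an intermediate file, then a final pass to stitch them back together. Because every pass reads the input in order, line N of each intermediate file belongs to record N, so the merge is just a parallel read. The field names, extract_field() and the filenames are all hypothetical stand-ins for whatever your script actually does:

    #!/usr/bin/perl
    # Sketch only: holds one field's data in memory at a time.
    use strict;
    use warnings;

    my @fields = qw( name date amount );        # hypothetical field list

    # One pass per field, each producing one intermediate file.
    for my $field ( @fields ) {
        open my $in,  '<', 'data.tsv'       or die "data.tsv: $!";
        open my $out, '>', "tmp.$field.txt" or die "tmp.$field.txt: $!";
        while( <$in> ) {
            chomp;
            print { $out } extract_field( $field, $_ ), "\n";
        }
        close $_ for $in, $out;
    }

    # Final pass: record order was preserved, so merge line-by-line.
    my @ins = map {
        open my $fh, '<', "tmp.$_.txt" or die "tmp.$_.txt: $!";
        $fh;
    } @fields;

    open my $merged, '>', 'merged.tsv' or die "merged.tsv: $!";
    while( defined( my $first = readline $ins[0] ) ) {
        chomp( my @parts = ( $first, map { scalar readline $_ } @ins[ 1 .. $#ins ] ) );
        print { $merged } join( "\t", @parts ), "\n";
    }
    close $_ for @ins, $merged;

    sub extract_field {    # stand-in for whatever per-field work you do
        my( $field, $line ) = @_;
        my %pos = ( name => 0, date => 1, amount => 2 );
        return ( split /\t/, $line )[ $pos{ $field } ] // '';
    }

If a pass can't preserve record order (say, because it sorts), write a record key into each intermediate file and merge on that instead -- or let your system's sort(1) do the heavy lifting between passes.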


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.