in reply to Re: Out of memory issue
in thread Out of memory issue

Hi planetscape, the script walks through thousands of directories, finds any images, and stores their paths in an array.
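
In outline, this first step does something like the following (a simplified sketch, not the actual code; File::Find, the archive path, and the extension list here are stand-ins):

    use strict;
    use warnings;
    use File::Find;

    my @image_paths;
    find(
        sub {
            # inside the callback, $_ is the bare filename and
            # $File::Find::name is the full path from the start directory
            push @image_paths, $File::Find::name
                if -f && /\.(?:jpe?g|tiff?|png)$/i;
        },
        '/path/to/archive',
    );
    print scalar(@image_paths), " image paths collected\n";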

It then iterates over this huge array and, for each image, extracts the IPTC information and writes it to a text file (one line per image).
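
The second step then looks roughly like this (again a sketch; I'm assuming an IPTC module such as Image::IPTCInfo and a tab-separated output line, which may not match the real script):

    use strict;
    use warnings;
    use Image::IPTCInfo;

    my @image_paths = @ARGV;   # or the array built by the first step above

    open my $out, '>', 'iptc_dump.txt' or die "open: $!";
    for my $path (@image_paths) {
        # new() returns undef if the file can't be read or has no IPTC data
        my $info = Image::IPTCInfo->new($path) or next;
        my $caption = $info->Attribute('caption/abstract') // '';
        print {$out} "$path\t$caption\n";
    }
    close $out or die "close: $!";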

The script breaks while performing this second step; by the time it dies, the text file it is writing has reached 151 MB.

Nothing in the script itself has changed, but the image archive (the directory tree it walks) has grown in size.

Re^3: Out of memory issue
by jethro (Monsignor) on Mar 19, 2010 at 01:11 UTC
    How about not storing all the paths in this huge array, but processing each path immediately, as in the sketch below?

    If this is not possible because you need to do some sorting, how about presorting and writing the paths into (for example) 5-10 files, then reading each file separately?
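
    Something along these lines, perhaps (an untested sketch; Image::IPTCInfo, the paths, and the output format are just examples):

        use strict;
        use warnings;
        use File::Find;
        use Image::IPTCInfo;

        open my $out, '>', 'iptc_dump.txt' or die "open: $!";
        find(
            sub {
                return unless -f && /\.(?:jpe?g|tiff?|png)$/i;
                # process each image right here instead of queueing its path,
                # so memory use stays flat however large the archive grows
                my $info = Image::IPTCInfo->new($File::Find::name) or return;
                my $caption = $info->Attribute('caption/abstract') // '';
                print {$out} "$File::Find::name\t$caption\n";
            },
            '/path/to/archive',
        );
        close $out or die "close: $!";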

      Thanks for your reply. I have now increased the kernel datasize to 1 GB to see if this works (the script is currently running again).

      However, I'll also be optimizing the code to reduce the amount of memory it uses. I'll keep you updated.

Re^3: Out of memory issue
by SuicideJunkie (Vicar) on Mar 19, 2010 at 13:35 UTC

    Some questions:

    Why are you storing the paths in an array? Why not process the files as you come across them, or at least use a hash to avoid tons of data repetition?
    Also, how much memory are you burning to load the image files? Are some of your images in the hundreds-of-megabytes range? Could you seek instead of slurping (see the sketch below)?
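
    On the seek-vs-slurp point, something like this keeps memory bounded per file (an untested sketch; the 64 KB cap and the helper name read_prefix are made up for illustration):

        use strict;
        use warnings;

        # Read at most $limit bytes from the front of a file instead of
        # slurping the whole thing; in JPEGs the metadata segments
        # (including the APP13 block that carries IPTC) normally sit
        # near the start of the file.
        sub read_prefix {
            my ($path, $limit) = @_;
            $limit //= 64 * 1024;   # arbitrary illustrative cap
            open my $fh, '<:raw', $path or die "open $path: $!";
            read $fh, my $buf, $limit;
            close $fh;
            return $buf;
        }

        my $head = read_prefix('/path/to/huge_image.jpg');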

    If you can show some code, it would pre-answer a lot of questions like these.