in reply to Re^2: problems with garbage collection (_exit)
in thread problems with garbage collection

Hello, thank you all for the help. The POSIX::_exit(0) solution definitely helps and makes this part of the software MUCH faster, and it no longer starts swapping where it did before. I guess I would never have found that solution on my own. I still have to test how far we can go with the current hardware before we max out, and whether I can find some ways to optimize the RAM usage.
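
For the record, here is roughly the pattern that works for us now; the hash contents, the worker count and the child's inner loop are only placeholders for the real code:

    use strict;
    use warnings;
    use POSIX ();

    # stand-in for the real fingerprint hash the parent loads before forking
    my %fingerprints = map { "id$_" => "fp$_" } 1 .. 1000;

    my $workers = 4;            # in the real setup: one child per core
    my @pids;

    for my $worker (0 .. $workers - 1) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # child: works on its share of the data, only reading the hash
            for my $id (keys %fingerprints) {
                # ... compare against the new CSV data, write exports ...
            }
            # leave without running global destruction; a plain exit()
            # would first walk and free the whole hash copy
            POSIX::_exit(0);
        }
        push @pids, $pid;       # parent remembers the child
    }

    waitpid($_, 0) for @pids;   # parent waits for all children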

Sorry, I don't know how to make a small testcase. Maybe it is helpful if I describe the software a bit. I currently have to parallelize some old software to reduce the total time needed to work through a certain amount of datasets and to make better use of the current hardware.

We have lots of datasets in several databases. Now we get a new version of the datasets in CSV files and want to compare them with the old ones (via fingerprint) to find changes. If something changed, we write export files for other software and update the values in the databases.
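
The check per dataset is basically just a hash lookup, something along these lines (the column layout, the MD5 choice and the sample data are made up for the example, not our real format):

    use strict;
    use warnings;
    use Digest::MD5 qw(md5_hex);

    # placeholders: real fingerprints come from the databases,
    # real lines come from the new CSV files
    my %old_fingerprints = ( id1 => md5_hex('id1;old value') );
    my @csv_lines        = ( 'id1;new value', 'id2;some value' );

    for my $line (@csv_lines) {
        my ($id) = split /;/, $line, 2;
        my $fp   = md5_hex($line);          # fingerprint of the new record

        if ( !exists $old_fingerprints{$id} or $old_fingerprints{$id} ne $fp ) {
            # new or changed dataset: write export file, update database
            print "changed: $id\n";
        }
    }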

For this, the parent process loads the fingerprint values (and other data) into a huge hash. Then we fork as many processes as we have processors. Copy-on-write should let the children share that memory, but in Perl even just reading the hash touches the pages (the reference counts get updated), so each process ends up with its own local copy and the amount of needed RAM explodes... :(
The current hardware is a 16-core system with 24GB of RAM and 8GB of swap (don't kill me, I am not the sysadmin behind the swap size decision ;)).
The largest examples we have go up to about 3.5 million datasets; before the _exit() solution we already ran into problems with about 1.5 million datasets.

I now have to find out how high we can go and whether there are still ways to optimize the software a bit, so we can use it even for the largest amounts of datasets. But that's work for tomorrow :)
