in reply to Re^2: problems with garbage collection (_exit)
in thread problems with garbage collection
Sorry, I don't know how to make a small test case; maybe it helps if I describe the software a bit. I currently have to parallelize some old software to reduce the wall-clock time needed to work through a certain number of datasets and to make better use of the current hardware.
We have lots of datasets in several databases. Now we get a new version of the datasets in CSV files and want to compare them with the old ones (via fingerprint) to find changes. If something changed, we write export files for other software and update the values in the databases.
For this, the parent process loads the fingerprint values (and other data) into a huge hash. Then we fork as many processes as we have processors.
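A minimal sketch of that layout, assuming the workers split the keys among themselves; the hash contents, worker count, and key scheme here are made-up placeholders, not our real data:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(_exit);

# Stand-in for the huge fingerprint hash loaded by the parent.
my %fingerprint = map { ( "id$_" => "fp$_" ) } 1 .. 100;

my $workers = 4;    # e.g. one worker per processor
my @pids;
for my $w ( 0 .. $workers - 1 ) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        # Child: process its share of the keys, read-only.
        for my $key ( keys %fingerprint ) {
            my ($n) = $key =~ /(\d+)$/;
            next unless defined $n && $n % $workers == $w;
            # ... compare the new CSV row against $fingerprint{$key} here ...
        }
        _exit(0);    # leave without Perl's global destruction
    }
    push @pids, $pid;
}
waitpid $_, 0 for @pids;
```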
Because of how Perl interacts with the OS's copy-on-write, each process effectively ends up with a local copy even when it only reads the original hash (Perl updates reference counts on access, which dirties the shared pages), so the amount of needed RAM explodes... :(
The current hardware is a 16-core system with 24 GB of RAM and 8 GB of swap (don't kill me, I am not the sysadmin behind the swap size decision ;)).
The largest examples we have run to about 3.5 million datasets; before the _exit() solution we already got into trouble at about 1.5 million datasets.
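For reference, the _exit() trick looks roughly like this; the hash below is just a small stand-in for the real data. The point is that a plain exit() in the child runs END blocks and global destruction, decrementing the refcount of every inherited value and so dirtying the copy-on-write pages right at the end, while POSIX::_exit() terminates immediately and leaves them shared:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(_exit);

# Small stand-in for the inherited fingerprint hash.
my %huge = map { ( $_ => 'x' x 10 ) } 1 .. 10_000;

my $pid = fork;
die "fork failed: $!" unless defined $pid;
if ( $pid == 0 ) {
    # Child reads the inherited hash...
    my $seen = grep { defined $huge{$_} } keys %huge;
    # ...and leaves via _exit(), skipping global destruction so the
    # shared CoW pages are never written to during teardown.
    _exit( $seen == 10_000 ? 0 : 1 );
}
waitpid $pid, 0;
my $child_status = $? >> 8;
print "child exited with status $child_status\n";
```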
I now have to find out how high we can go and whether there are still ways to optimize the software a bit, so we can use it even for the largest number of datasets. But that's work for tomorrow :)