in reply to child process dies soon after fork() call

I would be looking for ways to make the child processes use less memory. For example:

  • Are input files being slurped when they could be processed one record at a time?
  • Is each child making unnecessary copies of its input data, e.g. by reading a whole file into a scalar and then splitting it into an array?
  • Are there complex data structures where simpler storage would do?
  • Would it make sense to use disk-based resources instead of in-memory data structures, e.g. dbm files or other database(-like) storage?
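As a rough sketch of the "one record at a time" idea (the file name, record format, and process_record() routine are placeholders invented for the example, not anything from the original post):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = 'input.dat';    # hypothetical input file

    # Memory-hungry pattern: the whole file ends up in a scalar and then
    # again in an array -- roughly two in-memory copies of the data.
    #
    #   open my $fh, '<', $file or die "open $file: $!";
    #   my $slurp   = do { local $/; <$fh> };
    #   my @records = split /\n/, $slurp;

    # Leaner pattern: hold only one record in memory at a time.
    open my $fh, '<', $file or die "open $file: $!";
    while (my $line = <$fh>) {
        chomp $line;
        process_record($line);    # stand-in for whatever the child really does
    }
    close $fh;

    sub process_record {
        my ($record) = @_;
        # ... real per-record work here ...
    }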

Failing that, I'd be checking whether it's really necessary to have four children running at once. What does that quantity get you that you don't get with two consecutive jobs with two children per job?
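One way to cap the number of simultaneous children, assuming the work can be split into independent jobs (the job names here are invented for the sketch), is to fork them in batches and wait for each batch to finish before starting the next:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical job list; each element is a code ref for one child's work.
    my @jobs = ( \&job_a, \&job_b, \&job_c, \&job_d );

    my $max_children = 2;    # instead of all four at once

    while (my @batch = splice @jobs, 0, $max_children) {
        my @pids;
        for my $job (@batch) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {    # child: do the work, then exit
                $job->();
                exit 0;
            }
            push @pids, $pid;   # parent: remember the child
        }
        waitpid $_, 0 for @pids;    # finish this batch before starting the next
    }

    sub job_a { }    # placeholders for the real work
    sub job_b { }
    sub job_c { }
    sub job_d { }

(Parallel::ForkManager on CPAN handles this sort of bookkeeping for you, if a module dependency is acceptable.)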

If there is just one factor that makes the difference between "it works" and "it fails", and that factor is the size of the input files, and those files are only ever going to get bigger, then you have a scaling problem, which is a kind of design problem. Anything that doesn't address the design problem is just a stop-gap fix with a limited lifespan.

Solving the design problem means figuring out how to complete the task within a fixed amount of RAM, so that the process runs with a stable footprint no matter how large the input data gets.
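As a sketch of what that can look like, here is the dbm suggestion from above applied to a count table, so the table lives on disk instead of growing in RAM along with the input. The file names, the tab-separated record format, and the choice of DB_File are all assumptions for the example; SDBM_File or any other tie-able dbm module works the same way:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Fcntl;
    use DB_File;    # assumes Berkeley DB is available; swap in another dbm if not

    # The hash is tied to a file, so its contents live on disk, not in memory.
    my %count;
    tie %count, 'DB_File', 'counts.db', O_CREAT | O_RDWR, 0644, $DB_HASH
        or die "tie counts.db: $!";

    open my $fh, '<', 'input.dat' or die "open input.dat: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($key) = split /\t/, $line;    # assumed tab-separated records
        $count{$key}++;                   # stored in the dbm file, not in RAM
    }
    close $fh;

    untie %count;

The footprint stays roughly constant whether the input is a megabyte or a hundred gigabytes; the trade-off is disk I/O on every update.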
