in reply to CGI daily 'cleanup' task

How did you arrive at the 33% figure? Why do you think a probabilistic solution is superior to a scheduled or a metric-based cleanup?

You don't say exactly what you want to clean up, so I can only give general advice. It will make a difference whether or not you are running under mod_perl.

Use a system scheduler if you are recovering time-expired resources, like session files whose cookies are now invalid. For scheduled one-shot jobs, there is 'at' and its cousins. Reserve 'cron' for regularly scheduled jobs. I'd be suspicious of having many processes rewriting crontab on the fly.

For a metric-based solution, measure what resources may need to be released, and fork a cleanup only if the measurement is above some threshold.
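
A rough sketch of the metric-based idea, assuming the metric is simply the number of files in one directory. The names count_files, needs_cleanup and maybe_cleanup are made up for illustration, and the cleanup itself is passed in as a code ref because I don't know what you need to remove.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical metric: how many plain files are in $dir?
sub count_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or return 0;
    my $n = grep { -f "$dir/$_" } readdir($dh);   # grep in scalar context counts
    closedir($dh);
    return $n;
}

# True when the measurement exceeds the threshold.
sub needs_cleanup {
    my ($dir, $threshold) = @_;
    return count_files($dir) > $threshold;
}

# Fork a child to run $cleanup->($dir) only when the threshold is exceeded.
# Returns the child's pid in the parent, or 0 if nothing was started.
sub maybe_cleanup {
    my ($dir, $threshold, $cleanup) = @_;
    return 0 unless needs_cleanup($dir, $threshold);

    my $pid = fork();
    return 0 unless defined $pid;   # fork failed; a later request can retry

    if ($pid == 0) {
        # Child: do the work, then leave without running the parent's END blocks.
        $cleanup->($dir);
        POSIX::_exit(0);
    }
    return $pid;                    # Parent carries on with the request.
}
```

In a plain CGI the parent exits soon after, so the child is inherited by init. Under mod_perl the parent is long-lived, so you would need to reap the child yourself (waitpid, or a SIGCHLD handler) to avoid zombies.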

For your proposed solution, ..manymonk..'s suggestion of using a lock file is good. Use sysopen and its mode flags to control locking behavior.
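
Here is a minimal sketch of the lock file approach, assuming the lock is nothing more than a file that exists while a cleanup is running. The helper names acquire_lock, release_lock and run_exclusively are hypothetical. The relevant part is the O_CREAT | O_EXCL flag pair, which makes sysopen fail if the file is already there, so only one process can win.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Try to create the lock file. Returns 1 on success, false if it already exists.
sub acquire_lock {
    my ($lockfile) = @_;
    # O_CREAT | O_EXCL: create the file, but fail if it is already present.
    sysopen(my $fh, $lockfile, O_WRONLY | O_CREAT | O_EXCL, 0644) or return;
    print {$fh} "$$\n";   # record our pid, handy when hunting stale locks
    close($fh);
    return 1;
}

# Remove the lock file so the next cleanup can run.
sub release_lock {
    my ($lockfile) = @_;
    return unlink($lockfile);
}

# Run $job only if we got the lock; always release it afterwards.
# Returns 1 if the job ran, 0 if another process held the lock.
sub run_exclusively {
    my ($lockfile, $job) = @_;
    acquire_lock($lockfile) or return 0;
    my $ok  = eval { $job->(); 1 };
    my $err = $@;
    release_lock($lockfile);
    die $err unless $ok;
    return 1;
}
```

One caveat with this sketch: if the process is killed before release_lock runs, the lock file stays behind and blocks every later cleanup until something removes it.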

After Compline,
Zaxo