Running it in mod_perl or PPerl with no forking should speed things up a lot. Even just running it all from one script would be better than all that forking. Reading all your data into RAM can be very fast, but it limits how far you can scale as more data gets added. Putting the data into a format like a dbm file, where you can efficiently access individual records, works well for some things and doesn't require reading the whole thing into memory. A dbm file is also somewhat faster than MySQL for this kind of lookup.
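To make the dbm suggestion concrete, here is a minimal sketch of tying a hash to a dbm file and fetching a single record. It assumes SDBM_File (which ships with Perl, but limits each key/value pair to roughly 1 KB); DB_File or GDBM_File are tied in much the same way. The file name and the "user:N" keys are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;                  # core module; assumption: records are small
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/records";      # hypothetical dbm file name

# Build the dbm file once (for example, from a nightly job).
tie(my %db, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666)
    or die "Cannot create $path: $!";
$db{"user:$_"} = "record number $_" for 1 .. 1000;
untie %db;

# Later, in the script that serves requests: open read-only and
# fetch one record. Only the pages needed for this key are read
# from disk, not the whole data set.
tie(my %lookup, 'SDBM_File', $path, O_RDONLY, 0666)
    or die "Cannot open $path: $!";
my $record = $lookup{'user:42'};
print "$record\n";
untie %lookup;
```

Under mod_perl you would typically tie the file once per process and reuse the tied hash across requests, rather than reopening it every time.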
Your understanding of copy-on-write shared memory is correct, but forking is not always useful. Forking is most effective when your processes spend a lot of time waiting on I/O.
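As a rough illustration of where forking does pay off, the sketch below loads some data once in the parent and then forks a few children that each spend most of their time waiting. The sleep call is only a stand-in for real I/O waiting (a slow socket, a remote query), and the job count and data are invented for the example. The children read the parent's hash through copy-on-write pages without copying it, as long as they don't modify it.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Loaded once in the parent; forked children share these pages
# copy-on-write as long as they only read them.
my %data = map { $_ => $_ * 2 } 1 .. 1000;

my $start = time;
my @pids;
for my $job (1 .. 4) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: simulate waiting on I/O, then use the shared data.
        sleep 0.5;
        exit($data{$job} == $job * 2 ? 0 : 1);
    }
    push @pids, $pid;           # parent keeps track of its children
}

# Parent: reap every child and count non-zero exit statuses.
my $failures = 0;
for my $pid (@pids) {
    waitpid($pid, 0);
    $failures++ if $? != 0;
}
my $elapsed = time - $start;
printf "4 jobs finished in %.2fs with %d failures\n", $elapsed, $failures;
```

Because the waits overlap, the total time should be close to one wait rather than four. If the children were doing CPU-bound work instead, you would not see that gain beyond the number of available cores, and the fork overhead would start to dominate.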