Re: Profiling a forking program

Replies are listed 'Best First'.
Re: Re: Profiling a forking program by Tardis (Pilgrim) on Oct 15, 2002 at 11:50 UTC
The main part of the program (that serves an actual request) has 'issues'. It's a web-based frontend to a DBM database. Thus, concurrent write access demands a single process locks the DBM file, and gives up it's lock after. While the locking is working, performance of the software for multiple users is many many times worse than one would expect, given performance when there is no contention is excellent. I suspect there may be some sort of lock thrashing or similar. I need to profile this main part of the program to determine what's going on. The profilers I've seen all want to write to a single file, which is useless for the forked program, since it will be stomped by subsequent invocations.	[reply]
Re: Re: Re: Profiling a forking program by robartes (Priest) on Oct 15, 2002 at 12:00 UTC
Ah, so I assume you want to profile your program to see how much time it spends waiting for a lock? One way you could have a go at this, given that your profiling tools only reliably work on one process, is to run two instances of your server, but accessing the same DBM file. Of course, this assumes the locking is done externally from the program (for example with a lock file instead of a semaphore in shared memory or some such beastie). Assuming this is the case, than you run one instance of your server and put load on it until it starts to slow down. You then run another instance, on a different port presumably, and limit that one to one connection only (either by connecting only once or putting limitations in the forking code, heck, even by not forking at all). You can then reliably profile that server's execution. As long as you generate load on the other server instance, and hence generate contention for the lock, you should get a reliable answer from this as to whether your program spends most of its time waiting for the lock. CU Robartes-	[reply]
Re: Re: Re: Profiling a forking program by kal (Hermit) on Oct 15, 2002 at 18:17 UTC
Locking a DBM file in that manner is going to be problematic. You're always going to run into problems - starvation, for example, where a process ends up waiting a very long time for a lock to be freed, because you have no queue ordering. Perhaps a solution to look up would be to have one 'thread' (fork, whatever ;) access the DBM file on behalf of the other processes: it could just lock the file, and then access it for the other threads, so immediately you gain from removing the startup cost of tieing the DBM. Another gain can then be made by 'queueing' requests - you could have an in-memory shared queue object, or perhaps a file FIFO. Obviously, you still need to lock access to this object, but since you'll be in and out of that object reasonably quickly it won't affect you as badly as something like a DBM file, and you also rule out problems like starvation. Obviously, I'm simply outlining something which is actually fairly complicated, but generally the fork on request model doesn't work very well in terms of scaling. You more often see the helper-thread model, which tends to scale a bit better. Given only one process can have access to the file at a time, it makes much more sense to only have one process access it :) Having that process write the answer back to the client is fairly easy, and the helper threads would make sure all requests are queued in a timely fashion. The other answer is to move to an RDBMS :o)	[reply]
Re: Re: Re: Re: Profiling a forking program by Tardis (Pilgrim) on Oct 15, 2002 at 22:39 UTC
I hear you :-) An RDBMS is in the works, but it's a long way off. I'm really looking for band-aid fixes - I'm sure that some optimization of the current backend is possible. However your other ideas are intruiging and I may ponder them for a while.	[reply]
Re: Re: Re: Profiling a forking program by graff (Chancellor) on Oct 17, 2002 at 05:29 UTC
If your web front-end is triggering writes to the DBM file, and if the quantity of additions/updates is significant, then the problem may be in the DBM module. You could try benchmarking just that part of the application, with or without multiple threads but making sure to simulate a reasonably heavy load of data to be absorbed. (E.g. I know that GDBM really crawls once you start adding data beyond a certain threshold, I think because it has to re-write its entire index at intervals.) But on the other hand, maybe moving to an RDBMS needn't be so far off as you seem to think -- MySql won't be that hard to install, and getting it working within your current perl/web framework might be easier than you expect. That's worth looking at, seriously.	[reply]
Re^4: Profiling a forking program by Aristotle (Chancellor) on Oct 17, 2002 at 13:31 UTC
And you don't even need to install mySQL to begin with. DBD::SQLite uses the brillant SQLite embeddable SQL engine. And Tie::DBI lets you work with much the same code you already have working with a DBM library. I'm seriously considering forgoing DBM entirely whenever I want to use one in favour of these modules (minus Tie::DBI possibly ie writing my own SQL, depending on mood). Makeshifts last the longest.	[reply]