in reply to Apache::Session problems under high load

I'm a developer working with Anonymous Monk on this problem. Here's a more detailed outline of the issue, or at least a specific test that seems to expose an issue we think is affecting our web app under load:

I have a script, test-sessions.pl, which uses Parallel::ForkManager to send N simultaneous requests to a cgi script, session.pl. It makes one request first to get a sessionID, then uses that sessionID in each of the forked requests. The cgi script, session.pl, writes a unique value into the session representing the incremental request number. If session locking really works, the session should contain all of those unique values at the end, right?

Here's a snippet from test-sessions.pl:

    my $loop = 8;
    my $pm = Parallel::ForkManager->new($loop);
    my $url = "http://localhost/cgi-bin/session.pl";
    my %sessionopts; # must contain same config as apache has
    my $sid;
    # get sessionID with initial request to session.pl
    # ...
    for (my $i = 0; $i < $loop; $i++) {
        $pm->start and next;
        my $wait = int((rand)*3000000);
        # pass loop number, random wait time and sessionID in to script
        my $content = "count=$i&wait=$wait&sid=$sid";
        my $requrl = $url.'?'.$content;
        print "sending req $i [$requrl] at ".join(':',gettimeofday)."\n";
        my $req = HTTP::Request->new('GET', $requrl);
        my $resp2 = $ua->request($req);
        $pm->finish;
    }
    sleep 10; # be sure last req has finished
    print "now what is in the session for [$sid]?\n";
    my %session;
    tie %session, 'Apache::Session::Flex', $sid, { %sessionopts };
    print "got session data: ".Dumper(\%session)."\n";

And here's a bit from session.pl, the CGI script:
    # ...
    # extract cgi args from query string
    # open session, either using Apache::SessionManager or Apache::Session directly
    # ...
    usleep($wait); # wait for a few microseconds to simulate real server doing stuff
    $session->{'time'} = join('+',gettimeofday);
    $session->{'querystring'} = $query;
    $session->{'count'} = $count;
    my $now = time();
    $session->{"$now:$count"} = $count;
    warn "got session ".Dumper($session);


So what this does is insert a unique key for each request, based on its time and loop number. Looking at the apache logs, I can see the values getting put into the sessions.

If locking worked correctly, you would expect to see a value for every request once they have all been processed. What I see is something like this:

    now what is in the session for [8f13e7d0f1d140f6a1100510c1f82f6a]?
    got session data: $VAR1 = {
              'count' => '6',
              '1098930095:0' => '0',
              'time' => '1098930095+468658',
              '_session_start' => 1098930095,
              '1098930095:6' => '6',
              '1098930095:3' => '3',
              '_session_id' => '8f13e7d0f1d140f6a1100510c1f82f6a',
              'current_uri' => '/cgi-bin/session.pl',
              'querystring' => 'count=6&wait=173631&sid=8f13e7d0f1d140f6a1100510c1f82f6a',
              'previous_uri' => '/cgi-bin/session.pl'
            };


Some of the requests are in there and some aren't.
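
To make the gap concrete, here is a minimal sketch that lists which request numbers never made it into the session. The "<epoch>:<count>" keys are copied from the dump above; the variable names are only illustrative, and the framework keys are left out because they do not affect the result.

```perl
use strict;
use warnings;

# Keys copied from the Data::Dumper output above; framework keys such as
# _session_id and current_uri are omitted because they do not matter here.
my %session = (
    'count'        => '6',
    '1098930095:0' => '0',
    '1098930095:6' => '6',
    '1098930095:3' => '3',
);
my $loop = 8;    # number of forked requests in test-sessions.pl

# Collect the request numbers that did leave an "<epoch>:<count>" key behind.
my %seen;
for my $key ( keys %session ) {
    $seen{$1} = 1 if $key =~ /^\d+:(\d+)$/;
}

# Anything in 0 .. $loop-1 that was not seen is a lost write.
my @missing = grep { !$seen{$_} } 0 .. $loop - 1;
print "missing requests: @missing\n";    # prints "missing requests: 1 2 4 5 7"
```

So in this run five of the eight writes were lost.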

If I comment out the usleep in session.pl, it mostly works, but not always. With it in, often only the data from the first and last of the forked requests is in the session once they're all over. This says to me that without the usleep, the session.pl request can be dealt with in as much time as it takes to fork a new perl process, so they mostly happen in serial. With the usleep, the requests overlap and some of them miss out on writing to the session.
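
One way to picture why overlapping requests drop data is the sketch below. It assumes each request loads the whole session into a private copy, changes that copy, and writes the whole copy back at the end. The in-memory %store and the load_session/save_session helpers are hypothetical stand-ins, not part of Apache::Session.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the session store (file, MySQL, ...).
my %store = ( _session_id => 'abc' );

# Each "request" gets its own private copy of the whole session ...
sub load_session { return { %store } }

# ... and writes the whole copy back when it finishes.
sub save_session { my ($copy) = @_; %store = %$copy; return }

# Two overlapping requests: both load before either one saves.
my $req0 = load_session();
my $req1 = load_session();
$req0->{'t:0'} = 0;
$req1->{'t:1'} = 1;
save_session($req0);
save_session($req1);    # overwrites the store with a copy that never saw 't:0'

print join( ', ', sort keys %store ), "\n";    # prints "_session_id, t:1"
```

Locking each individual read and write keeps the stored data from being corrupted, but it does not stop the second save from discarding the first. Only a lock held across the whole load-modify-save cycle does that.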

I've tested this with Apache::SessionManager and without, using File, MySQL and Semaphore locking, on Fedora Core 1 and RedHat 9. The locking is definitely being done, as I can see lockfiles being created, SQL statements being issued, etc. But somehow it doesn't seem to be doing what I expect. Am I expecting the wrong thing?

Re^2: Apache::Session problems under high load
by perrin (Chancellor) on Oct 28, 2004 at 04:55 UTC
    First of all, you really need to tell us a specific set of Apache::Session::Flex args or a specific subclass that you want to use. The behavior is different for different ones. For Apache::Session::File, you need to pass the "Transaction => 1" argument, as borisz said, to make it do exclusive locking the way you are imagining it. With Apache::Session::MySQL, you do not need that, although it shouldn't hurt anything.
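
As a rough sketch of the "Transaction => 1" suggestion with Apache::Session::File: the temporary directories and the "req:0" key below are assumptions for illustration only, and a real setup would reuse the directories from the existing Apache configuration. The require is guarded so the sketch still runs where Apache::Session is not installed.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumed options for illustration; a real setup would point at the same
# directories the web server uses.
my %args = (
    Directory     => tempdir( CLEANUP => 1 ),    # where session files live
    LockDirectory => tempdir( CLEANUP => 1 ),    # where lock files live
    Transaction   => 1,                          # ask for the write lock at tie time
);

# Guarded so the sketch still runs where Apache::Session is not installed.
if ( eval { require Apache::Session::File; 1 } ) {

    # Tying with undef creates a new session and generates an ID.
    tie my %new, 'Apache::Session::File', undef, \%args;
    my $sid = $new{_session_id};
    untie %new;

    # Re-opening an existing ID is where Transaction => 1 matters: the
    # exclusive lock should be held from this tie until the untie.
    tie my %session, 'Apache::Session::File', $sid, \%args;
    $session{'req:0'} = 0;
    untie %session;    # saves the change and releases the lock
    print "updated session $sid\n";
}
```

With Apache::Session::Flex the same argument would go into the options hash passed to tie, next to the Store, Lock, Generate and Serialize settings.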

    The other thing is that you should make sure your application truly needs this. Hardly any web apps do. Typical data to store in a session is stuff like a user ID or part of a multi-page form. These are things where "last save wins" is just fine. If you're storing things where you really need exclusive locking, you should probably consider putting them in a database instead.

      Thanks. 'Last save wins' would be fine as long as we could guarantee that the last save was actually the last request the user made.

      The fundamental problem we're having is that intermittently, under relatively heavy load, sometimes the app just does not get the session data written on the previous request. I came across a situation recently where opening a page in a javascript window set up a kind of race condition with session data ... the request still being processed in the parent window seemed to prevent the request in the child from writing to the session. So we hypothesised that this might be what is happening (especially as Apache::SessionManager doesn't untie the session until the very end of the request, after content has been sent back to the client) and investigated the locking process to see if it was possible, because we thought locking should prevent that.

      The most frustrating aspect to this session problem is that it affects the real users of the system much more than it does us the developers. We've only encountered the error once in testing, whereas they're getting it quite frequently. We're geographically very distant from the server whereas the users are quite close, so this sort of led me down the path of speculation that it's like a race condition between requests (because theirs would get back to the server much faster than ours). We've disabled javascript and redirects that send further content and this has improved things, but not fixed them. If we can't solve it, maybe we do need to look at storing the session data some other way.
        Apache::Session can do the kind of exclusive locking you are after. If you're not using the MySQL locking, make sure you're passing "Transaction => 1". However, it would be better to move this kind of work to direct database actions so that you can get better control over the concurrency and locking behavior.

        Sessions are good for things that are mostly read-only and don't change often (e.g. user's name). They're not so great for things that need carefully controlled locking behavior. Part of the issue is that when you use exclusive locking, a person opening a second window on your site will not get any content until the first window completes sending to the browser. That's the nature of exclusive locks.
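
The blocking behaviour can be sketched with plain flock on a lock file. This is only an illustration under assumptions: the temporary file stands in for one session's lock, the two handles stand in for two browser windows, and it relies on a platform such as Linux where Perl's flock treats separately opened handles as independent lock holders.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical lock file standing in for one session's lock.
my ( $tmp_fh, $lockfile ) = tempfile( UNLINK => 1 );
close $tmp_fh;

# "First window": opens the lock file and takes the exclusive lock.
open my $first, '>>', $lockfile or die "open: $!";
flock $first, LOCK_EX or die "flock: $!";

# "Second window": a separate handle on the same file. LOCK_NB makes the
# attempt fail immediately instead of blocking, so the effect is visible.
open my $second, '>>', $lockfile or die "open: $!";
my $got_while_held = flock( $second, LOCK_EX | LOCK_NB ) ? 1 : 0;

# Once the first request finishes and releases, the second can proceed.
flock $first, LOCK_UN;
my $got_after_release = flock( $second, LOCK_EX | LOCK_NB ) ? 1 : 0;

print "while held: $got_while_held, after release: $got_after_release\n";
```

A real request would not pass LOCK_NB, so instead of failing it would simply wait, which is why the second window gets no content until the first request is done.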

        Your problem is one of design (you've designed a race condition). Session management is not a magic cure, and neither is testing. Rework your logic.