
++, although "Time Served" makes me think the OP wants the average time "viewed" per URL or something like it. In that case the problem gets a little harder: HTTP is a stateless protocol, so the log cannot really tell you that a user stayed on the same page between the last hit and this hit. With that accepted, you can still define what you consider a page view duration. You just have to define it for yourself, and it does not really "mean" anything in the real world.

So to compute a view duration you need a session of some type, either inferred from key fields in the log (client IP, timestamp and others) or, better yet, a real session ID in the log that tells you for sure this is the same session and not someone else behind the same proxy server. Once you have that defined, you need to suck in the log and correlate sessionID_last_hit with sessionID_next_hit. The time that elapses between those two is the "viewed" time for the sessionID_last_hit page. Note that you will want tests for outlying hits (what if the user started on index.html at 1pm yesterday, then hit page2.html at 1pm today: is that a 24-hour view?) and some sanity cleaning for those cases. For an example of how some people have implemented this, take a look at awstats.
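
A rough sketch of the correlation described above, in Python for illustration. It makes several assumptions that are not from the thread: the log is in Common Log Format, the client IP alone stands in for a session (so users behind one proxy get merged), and any gap longer than 30 minutes is treated as an outlier and dropped. The names parse_line, view_durations and average_view_seconds are made up for this example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Common Log Format prefix: host ident user [time] "method path protocol" ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "\S+ (\S+)[^"]*"')

# Assumed cutoff: longer gaps are treated as "the user went away".
MAX_VIEW = timedelta(minutes=30)


def parse_line(line):
    """Return (host, timestamp, path), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    host, stamp, path = m.groups()
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return host, when, path


def view_durations(lines, max_view=MAX_VIEW):
    """Map each path to a list of "viewed" times in seconds."""
    # Group hits per pseudo-session; here the session key is just the host.
    hits = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            host, when, path = parsed
            hits[host].append((when, path))

    durations = defaultdict(list)
    for session_hits in hits.values():
        session_hits.sort(key=lambda hit: hit[0])
        # Pair each hit with the next hit of the same session.
        for (t0, path), (t1, _next_path) in zip(session_hits, session_hits[1:]):
            gap = t1 - t0
            if gap <= max_view:  # sanity cleaning for outlying hits
                durations[path].append(gap.total_seconds())
    return durations


def average_view_seconds(lines, max_view=MAX_VIEW):
    """Average "viewed" time per path, in seconds."""
    return {
        path: sum(times) / len(times)
        for path, times in view_durations(lines, max_view).items()
    }
```

The last hit of every session has no next hit, so it never gets a duration. With a real session ID in the log you would swap the host for that ID as the grouping key.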


-Waswas

Re: Re: Re: apache log parsing
by dragonchild (Archbishop) on Mar 23, 2004 at 13:24 UTC
    Actually, what you're calculating is load-time-from-last-hit + wait-time-before-next-hit + latency-for-next-hit. What I have seen done is to have a hook into the Apache preprocessing create an entry in some sessioning database, using some session ID. Then, a hook in the Apache postprocessing will write the time for that session. The difference between those two times is the processing time. In fact, I'm a little surprised there isn't a hook in mod_session for this ...

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose