in reply to viewer eyeball counter?

As you've already heard, this "can't be done due to the statelessness of the HTTP protocol". However, another way to approach this is to analyze the web server log and infer page eyeball seconds from that information.

The beauty of this approach is that you don't need to retrofit any of your CGIs -- just look through the log file that probably already exists.
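
As a rough sketch of the first step, here is one way to pull the client address, timestamp, and requested path out of each log line. It assumes the server writes Common Log Format; the `CLF` pattern and the `parse_line` helper are names made up for this illustration, and a server with a custom log format would need a different pattern.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
#   host ident authuser [date] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def parse_line(line):
    """Return (host, timestamp, path) for one log line, or None if it does not match."""
    m = CLF.match(line)
    if m is None:
        return None
    # Timestamps look like 08/Dec/2003:22:29:00 +0000
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("host"), when, m.group("path")

if __name__ == "__main__":
    sample = '10.0.0.1 - - [08/Dec/2003:22:29:00 +0000] "GET /index.html HTTP/1.0" 200 1043'
    print(parse_line(sample))
```

Lines that do not match (truncated requests, for instance) come back as `None`, so the caller can skip them instead of guessing.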

--t. alex
Life is short: get busy!

Re: Re: viewer eyeball counter?
by waswas-fng (Curate) on Dec 08, 2003 at 22:29 UTC
    The fault with this is that unless you use unique session IDs to track visitors on your website and store them in the log file, you have to make assumptions:
    If I see the client IP is a.b.c.d and the next hit is from a.b.c.d, then it must be the same visitor. (Oops, what about proxy servers? AOL? MSN? Woopsie.)

    If I see a delay of more than X minutes between hits, then I am going to count this client IP as a new visitor; otherwise it is the same session, and the session length is <time of last hit> - <time of first hit in session>.

    Depending on the level of accuracy you are looking for (and the percentage of your hits that come from proxies), looking at the log files for this information can simply give you faulty info.


    -Waswas

      Well, I guess I could have put in the standard disclaimer about how none of this makes sense if you're dealing with client IP addresses that change (which they will for anyone behind an AOL firewall), session IDs that change, multiple servers (in which case you have to merge the log files then do your parsing), and so forth. But I presented my (simple) solution based on the trivial situation: stable IPs, a single server, no weirdness with the URL changing.

      I'm guessing (or hoping, anyway) that anyone in a non-trivial situation is going to know about how to work around their situation to get the right answers.
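
For the multiple-server case mentioned above, a minimal sketch of the merge step might look like this. It assumes each server's hits are already parsed into `(timestamp, ip, path)` tuples, that each stream is in time order, and that the servers' clocks agree; `merge_hits` is a hypothetical helper name, not part of any log tool.

```python
import heapq

def merge_hits(*streams):
    """Lazily merge several time-sorted streams of (timestamp, ip, path) tuples.

    Assumption: every input stream is already sorted by timestamp.
    heapq.merge does not re-sort its inputs, it only interleaves them.
    """
    return heapq.merge(*streams, key=lambda hit: hit[0])

if __name__ == "__main__":
    server_a = [(1, "10.0.0.1", "/"), (5, "10.0.0.1", "/x")]
    server_b = [(2, "10.0.0.2", "/"), (3, "10.0.0.2", "/y")]
    for hit in merge_hits(server_a, server_b):
        print(hit)
```

Because the merge is lazy, it never needs to hold whole log files in memory, which matters once the logs get large.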

      --t. alex
      Life is short: get busy!
        Check out how awstats does it. The algorithm is basically this: pick a maximum duration of inactivity, in seconds, that is considered the end of a session. Then, for each source IP, if the next hit is within that range, the visit duration is increased by the difference in timestamps; if the gap is longer than the timeout, it is a new session, and the duration is reset with that hit's timestamp as the start point. Again, I have to reiterate that without true session accounting these duration stats will be suspect at best, even with a simple site -- you are making guesses about inactivity vs. sessions. If these numbers mean something to you even after that realization, kudos.
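
A minimal sketch of that timeout heuristic, written from the description above rather than from the awstats source. The `visit_durations` name and the 1800-second default are assumptions picked for illustration, and the input is assumed to be `(ip, epoch_seconds)` pairs already sorted by time.

```python
from collections import defaultdict

def visit_durations(hits, timeout=1800):
    """Group hits into visits per IP and return {ip: [duration_in_seconds, ...]}.

    hits    -- iterable of (ip, epoch_seconds), assumed sorted by time
    timeout -- inactivity gap (seconds) that ends a visit; 1800 is an arbitrary choice
    """
    last_seen = {}
    visits = defaultdict(list)
    for ip, ts in hits:
        prev = last_seen.get(ip)
        if prev is not None and ts - prev <= timeout:
            # Same visit: extend its duration by the gap since the previous hit.
            visits[ip][-1] += ts - prev
        else:
            # First hit from this IP, or the gap exceeded the timeout: new visit.
            visits[ip].append(0)
        last_seen[ip] = ts
    return dict(visits)

if __name__ == "__main__":
    hits = [("a", 0), ("a", 60), ("b", 100), ("a", 200), ("a", 5000)]
    print(visit_durations(hits))
```

Note that a visit with a single hit gets a duration of 0, since the log says nothing about how long the last page stayed on screen. That is the same guesswork about inactivity, just made explicit.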


        -Waswas