in reply to Website Statistics

Hmm. I guess what you could do would be to write a CGI script that detects the client's browser, OS and IP, and then uses this data to write to a statistics file/db, first checking that the IP hasn't already been recorded.

This could probably work well for a small site, with a smallish number of hits - remember that you'd have to maintain a list of every single IP that visits your site, and check against it every single time the index got loaded. That could get very slow very fast, especially with a high-volume site.

Your other problem with using the IP to check uniqueness is that an IP doesn't identify a single visitor. The vast majority of the internet (from a client point of view) uses dynamically assigned IP addresses, not static ones, so even if the same IP visits twice you can't guarantee it's the same person - it could be someone completely different on the same ISP.
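
If you do want to roll your own, a rough sketch of that kind of CGI might look something like the following - flat-file based, and the file name, layout and paths are just made up for illustration:

    #!/usr/bin/perl
    # Rough sketch only: log each previously unseen IP together with its
    # user-agent string (which carries the browser and OS information).
    use strict;
    use warnings;
    use CGI;
    use Fcntl qw(:flock);

    my $q       = CGI->new;
    my $ip      = $q->remote_addr || 'unknown';
    my $agent   = $q->user_agent  || 'unknown';
    my $logfile = 'unique_ips.txt';              # hypothetical stats file

    # Read the IPs we have already seen.
    my %seen;
    if (open my $in, '<', $logfile) {
        while (<$in>) {
            my ($logged_ip) = split /\t/;
            $seen{$logged_ip} = 1;
        }
        close $in;
    }

    # Only record IPs we have not seen before.
    unless ($seen{$ip}) {
        open my $out, '>>', $logfile or die "can't append to $logfile: $!";
        flock $out, LOCK_EX;
        print {$out} join("\t", $ip, $agent, scalar localtime), "\n";
        close $out;
    }

    print $q->header('text/html'), "<p>Thanks for visiting.</p>\n";

Note that every request re-reads the whole file, which is exactly the "could get very slow" problem mentioned above.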

Maybe the easiest way to do something like this would be to use one of the logging tools that are already available, and provide nightly/delayed stats for browser/OS use. Webcounters are trivial to write, if highly inaccurate, as there's no real way to count a "unique" hit.
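
For instance, a bare-bones counter really is only a dozen or so lines of Perl (counter file name picked arbitrarily here) - it simply counts every request, which is exactly why the numbers don't mean much:

    #!/usr/bin/perl
    # Trivial hit counter: bump a number in a file on every request.
    use strict;
    use warnings;
    use Fcntl qw(:flock);

    my $file = 'counter.txt';                    # hypothetical counter file
    my $fh;
    open $fh, '+<', $file
        or open $fh, '+>', $file
        or die "can't open $file: $!";
    flock $fh, LOCK_EX;

    chomp( my $count = <$fh> || 0 );
    $count++;

    seek $fh, 0, 0;
    truncate $fh, 0;
    print {$fh} "$count\n";
    close $fh;

    print "Content-type: text/plain\n\n";
    print "You are visitor number $count\n";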

As far as Apache stats tools go, I'd recommend something like Analog, which is a fairly powerful and customisable analysis tool.

Hope that helps a little ..

-- Foxcub
#include www.liquidfusion.org.uk

Re: Re: Website Statistics
by debiandude (Scribe) on Jun 02, 2003 at 13:43 UTC
    I see what you mean that checking against all IPs could get really slow. I doubt I would run into that problem since my site is small. However, perhaps a compromise would be to store the last 20 or so IPs, just so I don't count people hitting refresh as extra hits.
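
    A rough sketch of that compromise might look like this (file name and window size are just placeholders):

        #!/usr/bin/perl
        # Keep only the last N visitor IPs, so a quick refresh by the same
        # person isn't counted again. Names here are placeholders.
        use strict;
        use warnings;
        use CGI;

        my $q      = CGI->new;
        my $ip     = $q->remote_addr || 'unknown';
        my $recent = 'recent_ips.txt';   # hypothetical file, one IP per line
        my $window = 20;

        my @last;
        if (open my $in, '<', $recent) {
            chomp(@last = <$in>);
            close $in;
        }

        unless (grep { $_ eq $ip } @last) {
            # Not recently seen: count the hit and remember the IP,
            # dropping the oldest entries if we're over the limit.
            push @last, $ip;
            shift @last while @last > $window;
            open my $out, '>', $recent or die "can't write $recent: $!";
            print {$out} "$_\n" for @last;
            close $out;
            # ... update the real hit count / stats here ...
        }

        print $q->header('text/plain'), "ok\n";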

      I don't think that checking the IP of the current visitor against all IPs which have already visited your site would get really slow.

      The only way to get a count of the unique IPs which have visited your site is to store each visiting IP number in a database.

      Something like MySQL is probably the fastest solution, although you could do it just as easily with any type of database which has a DBI/DBD interface.

      Checking and updating your database will be very fast as you can index on the field containing the IP. If you run it under Apache and mod_perl, you can even have persistent database connections so you save on the connect/disconnect overhead and the script stays "compiled" in between hits.
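
      A rough sketch of that setup (the database, table and column names are made up here, and the ON DUPLICATE KEY UPDATE trick is MySQL-specific):

          #!/usr/bin/perl
          # Sketch only: one row per IP, with a unique index on the ip column.
          # Assumed table:
          #   CREATE TABLE visitors (
          #       ip         VARCHAR(15) NOT NULL,
          #       first_seen DATETIME    NOT NULL,
          #       hits       INT         NOT NULL DEFAULT 1,
          #       UNIQUE KEY (ip)
          #   );
          use strict;
          use warnings;
          use DBI;

          my $dbh = DBI->connect(
              'dbi:mysql:database=stats;host=localhost',
              'statsuser', 'secret',
              { RaiseError => 1, AutoCommit => 1 },
          );

          my $ip = $ENV{REMOTE_ADDR} || 'unknown';

          # Insert a new row, or bump the hit count if the IP is already there.
          $dbh->do(
              'INSERT INTO visitors (ip, first_seen) VALUES (?, NOW())
               ON DUPLICATE KEY UPDATE hits = hits + 1',
              undef, $ip,
          );

          my ($unique) = $dbh->selectrow_array('SELECT COUNT(*) FROM visitors');
          print "Content-type: text/plain\n\n";
          print "$unique unique IPs seen so far\n";

          $dbh->disconnect;

      Under mod_perl you would load Apache::DBI in the server configuration, and the very same DBI->connect call transparently reuses a persistent connection.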

      Whether the values you obtain have any meaning is another issue altogether, as there is no sure-fire way to link IPs to people.

      CountZero

      "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law