PerlMonks  

not quite right (was: Unique visits - Webserver log parser)

by legLess (Hermit)
on Feb 28, 2002 at 05:33 UTC ( [id://148142]=note )


in reply to •Re: Unique visits - Webserver log parser
in thread Unique visits - Webserver log parser

ciryon, I'd argue that if you're regularly generating multi-gig log files you need a more high-powered solution than simple IP address analysis. Consider using a service like WebTrends, or rolling your own. If you can create a fast, secure and accurate WebTrends clone for your local site, you'll have done something impressive (a little futile, perhaps, since WebTrends is cheap, but it will be fun).

Merlyn, while I can't argue with you about code (and your enum example below is nice), I think you're exaggerating Alan Flavell's views as he expressed them. He didn't say (in that message or anything else Google could find for me) that "there are no visitors" or that "IPs are meaningless."

Your assertion that "there are no visitors, only hits" is wrong on its face. The vast majority of web users accept 3rd-party cookies, and services like WebTrends do a spectacular job of tracking first-time, returning and unique visitors.

Can you determine exact unique visitors from log files using IP addresses only? No. Should you use IP addresses to identify users or sessions, or as part of a security process? No. These tasks are either futile, dangerous, or both.

But can you use IP addresses to get a "pretty good" idea of first-time, returning and unique visitors? Yes. There are better methods, but they're much more complex. As long as you know your results won't be very accurate, munging a log file with Perl can be a good, cheap solution. Plus it can be a good exercise, especially for a self-described newbie. So what if AOL users are proxied? They don't all use the same proxy at the same time; you can time sessions out after X minutes and improve your accuracy a bit.
"You might as well use Perl's rand function instead."
I know this is just hyperbole on your part, but I think it's a disservice to ciryon. He's a novice asking for advice, and I think we owe him honesty.
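
For what it's worth, the session time-out idea above can be sketched in a few lines. This is only a rough illustration, not a tested tool: it assumes the log is in Common Log Format with the client IP as the first field, it picks 30 minutes as the "X minutes", and the names count_visits and clf_to_epoch are made up for this example. It counts a new visit whenever an IP address has been silent for longer than the time-out.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw(timegm);

my $TIMEOUT = 30 * 60;   # assumed session time-out, in seconds

my %MONTH;
@MONTH{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Convert a Common Log Format timestamp such as
# "28/Feb/2002:05:33:12 +0000" to epoch seconds (undef if malformed).
sub clf_to_epoch {
    my ($stamp) = @_;
    my ($d, $mon, $y, $h, $m, $s, $sign, $oh, $om) =
        $stamp =~ m!^(\d\d)/(\w{3})/(\d{4}):(\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)$!
        or return;
    return unless exists $MONTH{$mon};
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    return timegm($s, $m, $h, $d, $MONTH{$mon}, $y) - $offset;
}

# Read log lines from a filehandle. A hit starts a new visit when its IP
# is unseen or has been quiet for more than $TIMEOUT seconds.
# Returns (number of visits, number of distinct IP addresses).
sub count_visits {
    my ($fh) = @_;
    my %last_seen;
    my $visits = 0;
    while (my $line = <$fh>) {
        my ($ip, $stamp) = $line =~ /^(\S+) \S+ \S+ \[([^\]]+)\]/ or next;
        my $t = clf_to_epoch($stamp);
        next unless defined $t;
        $visits++
            if !exists $last_seen{$ip} || $t - $last_seen{$ip} > $TIMEOUT;
        $last_seen{$ip} = $t;
    }
    return ($visits, scalar keys %last_seen);
}

# Usage: perl visits.pl access.log
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my ($visits, $ips) = count_visits($fh);
    print "$visits visits from $ips distinct IP addresses\n";
}
```

The numbers this produces are estimates only: proxies merge many people into one address, and dial-up users change addresses between sessions. It also keeps one hash entry per IP address, so memory use grows with the number of distinct addresses rather than with the size of the log.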
--
man with no legs, inc.