stonecolddevin has asked for the wisdom of the Perl Monks concerning the following question:

I haven't done much research on this, so this is probably going to be a weak node, but I thought I'd ask anyway...

I'm wondering how one might develop a small, simple system to monitor the amount of data transferred on a site (megabytes per day/month/hour, etc.).
I had a few ideas, like getting the size of each file by "crawling" the directories, putting their names in a database, and having each link on the web site point back to a script which would then increment a "download count" column in the DB table; the end result would be some calculations to work out how much data was transferred in a given period.
Would this work? And if so, what would be the best way to do it?
Thanks in advance monks.
meh.
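
For reference, a minimal sketch of the redirect-and-count idea described above might look something like the CGI script below; the database name, credentials, table and column names, and file layout are all hypothetical.

#!/usr/bin/perl
# Hypothetical download-counting script: each site link points here,
# the script bumps a per-file counter, then redirects to the real file.
use strict;
use warnings;
use CGI qw(param);
use DBI;

my $file = param('file') or die "no file requested\n";
die "bad file name\n" if $file =~ /\.\./ or $file !~ m{^[\w./-]+$};

# Hypothetical schema: downloads(path VARCHAR, count INT)
my $dbh = DBI->connect( 'dbi:mysql:site', 'user', 'pass',
                        { RaiseError => 1 } );
$dbh->do( 'UPDATE downloads SET count = count + 1 WHERE path = ?',
          undef, $file );

# Let the web server deliver the file itself.
print "Location: /files/$file\n\n";

That said, a scheme like this only sees requests that go through the script, which is one reason the log-based suggestions in the replies are simpler.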

2005-12-02 Retitled by planetscape, as per Monastery guidelines
Original title: 'Bandwidth control/monitering'

Replies are listed 'Best First'.
Re: Bandwidth control/monitoring
by davido (Cardinal) on Dec 01, 2005 at 16:58 UTC

    Apache allows you to configure its access logs to record the size of each document served, minus the HTTP headers. So just parse the log file, adding up the sizes, and inflate a little to account for the HTTP headers.
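
    A rough sketch of that approach, assuming the Common Log Format and a typical log path (both of which may differ on your system):

    #!/usr/bin/perl
    # Sum the response-size field from a Common Log Format access log.
    use strict;
    use warnings;

    my $log   = '/var/log/apache2/access.log';   # assumption: adjust to your setup
    my $total = 0;

    open my $fh, '<', $log or die "open $log: $!";
    while (<$fh>) {
        # CLF lines end with: ... "request" status bytes
        my ($bytes) = /(\d+|-)\s*$/;
        next if !defined $bytes or $bytes eq '-';
        $total += $bytes;
    }
    close $fh;

    printf "%.2f MB transferred (excluding headers)\n", $total / ( 1024 * 1024 );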


    Dave

Re: Bandwidth control/monitoring
by marto (Cardinal) on Dec 01, 2005 at 16:52 UTC
    Hi dhoss,

    If you have access to the HTTP logs, would you not be better off using that data?
    Have you seen AWStats? If it does not do everything you need 'out of the box' you could always customise it.

    Hope this helps.

    Martin
      Ohhh, very good point, marto (and whoever else pointed this out). I should just parse the access logs; that would be quickest and easiest, wouldn't it?
      meh.
Re: Bandwidth control/monitoring
by McDarren (Abbot) on Dec 01, 2005 at 18:26 UTC
    As already mentioned, MRTG is an ideal tool for this type of job.

    One limitation of MRTG is, however, that it only allows you to display two sets of data in one graph. Something else that you may wish to look at is RRDTool (written by the same person as MRTG). RRDTool is much more flexible than MRTG, and there is a handy Perl module available.

    I have worked quite a bit with both, although my preference is RRDTool because of its greater flexibility. For monitoring network traffic, my approach was to examine the values in /proc/net/dev on a regular basis (on an RH Linux box - well, actually around 230 Linux boxes ;). I've also used it for other stuff such as monitoring the number of users on a network, monitoring latency, etc. I'd thoroughly recommend it.
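
    For what it's worth, a minimal sketch of that approach using the RRDs module that ships with RRDTool might look like the following. The RRD file name and interface are assumptions, and the RRD itself is assumed to already exist with two COUNTER data sources.

    #!/usr/bin/perl
    # Read cumulative byte counters for one interface from /proc/net/dev
    # and feed them to an existing RRD (run from cron every few minutes).
    use strict;
    use warnings;
    use RRDs;

    my $rrd   = '/var/lib/rrd/eth0.rrd';   # assumption
    my $iface = 'eth0';                    # assumption

    open my $fh, '<', '/proc/net/dev' or die "open /proc/net/dev: $!";
    my ( $rx, $tx );
    while (<$fh>) {
        next unless /^\s*\Q$iface\E:\s*(.+)/;
        my @f = split ' ', $1;
        ( $rx, $tx ) = @f[ 0, 8 ];   # receive bytes, transmit bytes
        last;
    }
    close $fh;
    die "interface $iface not found\n" unless defined $rx;

    # COUNTER data sources turn the raw counters into per-second rates.
    RRDs::update( $rrd, "N:$rx:$tx" );
    my $err = RRDs::error;
    die "RRDs::update: $err\n" if $err;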

    cheers,
    Darren :)

Re: Bandwidth control/monitoring
by zentara (Cardinal) on Dec 01, 2005 at 17:05 UTC
    This isn't a pure Perl answer, but I suggest you look at tcpick.

    It is like a minimalist's Ethereal, and it will monitor and/or log all traffic on an interface such as eth0 or ppp0. You could probably run it from Perl, accumulate size data on the packets, and do some long-term logging into a db.


    I'm not really a human, but I play one on earth. flash japh
Re: Bandwidth control/monitoring
by andyford (Curate) on Dec 01, 2005 at 17:54 UTC
    Depending on your needs, you might want to get into full-blown network monitoring. This might seem like overkill, but it's really not as bad as it sounds:
    find out whether your system is running, or could run, an SNMP daemon, and then use MRTG for network data collection and graphing. If you have no SNMP daemon on the server, you might also be able to query the network router or switch via SNMP instead of the server, to get the traffic in and out of the server's interface.
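
    If you go the SNMP route and want to experiment from Perl directly, a small sketch with Net::SNMP might look like this (the host name, community string and interface index are all assumptions; MRTG itself will normally do this polling for you):

    #!/usr/bin/perl
    # Poll the standard IF-MIB byte counters for one interface via SNMP.
    use strict;
    use warnings;
    use Net::SNMP;

    my $if_index = 2;                                 # assumption
    my $in_oid   = "1.3.6.1.2.1.2.2.1.10.$if_index";  # ifInOctets
    my $out_oid  = "1.3.6.1.2.1.2.2.1.16.$if_index";  # ifOutOctets

    my ( $session, $error ) = Net::SNMP->session(
        -hostname  => 'router.example.com',           # assumption
        -community => 'public',                       # assumption
    );
    die "SNMP session error: $error\n" unless defined $session;

    my $result = $session->get_request( -varbindlist => [ $in_oid, $out_oid ] );
    die 'SNMP request error: ' . $session->error . "\n" unless defined $result;

    printf "in: %s bytes, out: %s bytes (cumulative)\n",
        $result->{$in_oid}, $result->{$out_oid};
    $session->close;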
      You might also want to consider RTG. It's similar to MRTG in what it does - monitor bandwidth, etc. - but it uses MySQL instead of RRD for data storage. The result is that old data isn't averaged. I've been using it for a couple of years now and it's great.

      Jack

Re: Bandwidth control/monitoring
by neilwatson (Priest) on Dec 01, 2005 at 18:07 UTC