in reply to Web Site Monitoring

Netsaint is the fully formed, well-functioning wheel you do not need to reinvent. It is a complete framework for monitoring systems and websites. It handles scheduling and notification, and it ships with a very comprehensive set of tests right out of the box.

It is also completely modular, and you can write your own tests in Perl (I have written several).

grep
grep> rm -f /bin/laden

Web Site Monitoring, a few tips you might like.
by Dog and Pony (Priest) on Feb 15, 2002 at 09:34 UTC
    I agree about Netsaint; there is hardly any point in reinventing the wheel unless you have a reason to (such as "I want to" :) ).

    Anyway, if you do decide to roll your own, here are a few pointers you might find useful - been there, done that.

    • Be very careful when defining what is "up" and what is "down". The script should not report that everything is down just because the server is a bit strained. Give it a second chance.
    • Be very, very careful when defining what is "up" and what is "down" if this decision will make your script take any automatic action, such as restarting the application. On one hand, a restart might freshen everything up; on the other hand, in some designs it could cause customers to lose their sessions, and their carts with them!
    • Consider making a special "stats" page for your script to access, so that the script can fetch some data as well: perhaps the load on the server, or how many sessions are deemed active. XML is nice for this. Make it protected, though.
    • Consider not using a special page for the script. At the very least, make sure you don't test only "index.html". Make the script test several pages and, if possible, simulate a flow over several pages (yes, this takes some coding).
    • Email is your friend. Mail yourself warnings when certain criteria are met. Maybe mail yourself when things start to look good again too, so you will know about the recovery as well.
    • Email is your enemy. This pertains to the first points, about defining what "down" really is. If you get several mails a day, and especially if you are getting false alarms, you will very soon start to ignore the mails. Do not send mail unnecessarily.
    • Log as much as possible. Anything you can think of might help later.
    • Log as little as possible. You don't want to sift through an Apache-access-sized log file to get the facts you need. Make sure you can easily find the facts you need in the logs, via timestamps and such. Also use a special User-Agent header for your surfing script, so its requests stand out in the normal web log.
    • Let the script surf from someplace else, outside your firewall, preferably from some totally different location. Overseas would be great. :) Otherwise, something besides your site might be down, and you wouldn't know.
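The two email points above can be reconciled by mailing only on state changes rather than on every failed check. Here is a minimal sketch of that idea, written in Python purely for illustration (the same logic carries over to a Perl script). The `Notifier` class, the `fail_threshold` parameter, and the message texts are hypothetical names made up for this example, and `send` stands in for whatever actually delivers your mail.

```python
from typing import Callable


class Notifier:
    """Send one alert when the site goes down and one when it recovers.

    Hypothetical helper: `send` is any callable that delivers a message
    (for example a function that sends an email).
    """

    def __init__(self, send: Callable[[str], None], fail_threshold: int = 2):
        self.send = send
        self.fail_threshold = fail_threshold  # consecutive failures before alerting
        self.failures = 0                     # current run of failed checks
        self.alerted = False                  # True once a DOWN alert has gone out

    def record(self, up: bool) -> None:
        """Feed in the result of one check and notify only on a state change."""
        if up:
            if self.alerted:
                self.send("RECOVERED: site is responding again")
            self.failures = 0
            self.alerted = False
            return
        self.failures += 1
        if self.failures >= self.fail_threshold and not self.alerted:
            self.send("DOWN: %d consecutive checks failed" % self.failures)
            self.alerted = True
```

With `fail_threshold=2`, a single failed check followed by a success sends nothing, while a longer outage produces exactly one "DOWN" mail and one "RECOVERED" mail, however long it lasts.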

    Of course, there are tons and tons more things, but these are the points I could think of right away, and I know several of them would have helped me had someone told me. :) As you can see, most of the points contradict one or several of the others. This is intentional: both sides are correct to some extent, and the idea is to find the balance. For instance, a surfing type of script that times out after 10 seconds and reports that the site is dead is most likely a very bad idea, but so is a script that waits 10 minutes. Maybe a one-minute timeout with a double check would be appropriate? Only you can answer that.
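As a rough sketch of the "timeout with a double check" idea, again in Python for illustration only: the function names `is_up` and `http_check`, the default values, and the `site-monitor/0.1` User-Agent string are all assumptions for this example, not anything prescribed in the thread. The check is passed in as a callable so the retry logic can be tried out without touching the network.

```python
import time
import urllib.request
from typing import Callable


def http_check(url: str, timeout: float = 60.0,
               user_agent: str = "site-monitor/0.1") -> bool:
    """Fetch `url` once; True on a 2xx response within `timeout` seconds.

    urlopen raises an exception on timeouts, connection errors and
    4xx/5xx responses; the caller treats those as a failed attempt.
    """
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return 200 <= resp.status < 300


def is_up(check: Callable[[], bool], retries: int = 1, delay: float = 5.0,
          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Report "down" only if the first attempt and every retry fail."""
    for attempt in range(retries + 1):
        try:
            if check():
                return True
        except Exception:
            pass  # timeout, refused connection, HTTP error: a failed attempt
        if attempt < retries:
            sleep(delay)  # give a strained server a moment before rechecking
    return False
```

A call such as `is_up(lambda: http_check("http://www.example.com/"))` would then give the site one second chance before it is declared down; the right timeout, delay, and number of retries depend entirely on your own application.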

    I hope these ideas gave you some hints on how to go about it. :)

      OK, let's try this again. Thanks to all of you for your replies. Lots of good pointers and items I had already considered. Since my replies don't seem to be staying with the right subthread, I figured I would write this one.

      • Netsaint sounds good, and may get me up and running quickly
      • I would still like to write something using LWP or HTTP::WebTest
      • I appreciate all of the tips, even the contradictory ones. I recognize that is how the real world works and that your providing these tips is a dangerous thing to do, since you don't know my apps or environment.

      To all who replied, thanks. If I end up building something, I'll post it so you can "rip it to shreds" :> -Rich

        You can combine Netsaint with the power of Perl and LWP or HTTP::WebTest. As grep has said, you can write Netsaint plugins in Perl.

        It is quite a good approach. Netsaint gives you a powerful monitoring framework that provides messaging, a web interface, ready-to-use plugins for many services, etc., and LWP or <shameless plug>even better HTTP::WebTest</shameless plug> gives you the ability to write very complex tests that can cover all the functionality of your web applications.

        BTW, one user of HTTP::WebTest has sent me a script: a plugin for Netsaint that uses HTTP::WebTest to test websites. I don't feel it is generic enough to make public, but if you want I can email it to you. Drop me an email at ilya@martynov.org if you need it.

        --
        Ilya Martynov (http://martynov.org/)

Re: Re: Web Site Monitoring
by rah (Monk) on Feb 16, 2002 at 17:09 UTC
    Thanks for pointing me towards NetSaint. It may be just what I need to get going quickly.
Re: Re: Web Site Monitoring
by rah (Monk) on Feb 16, 2002 at 16:54 UTC
    Thanks for the tip. I have been searching for something off and on for a year and hadn't come across this. I will definitely look into it.