saetaes has asked for the wisdom of the Perl Monks concerning the following question:

Hopefully this will be easy for someone; at the moment it's stumping me. I need to make a distributed URL monitor; distributed meaning that there will be multiple monitor servers checking a given URL. In the event that a URL goes down, the monitor servers need to check with each other to make sure it wasn't just a fluke glitch, and then, if all monitor servers see the error, send out an e-mail alert. The thing that's getting me is that I can't figure out which distributed method would be easiest/best to use. I've been toying with a P2P model (not unlike MoleSter - http://ansuz.sooke.bc.ca/software/molester/), but don't know if it would be the best approach. Does anyone have any recommendations? Since the monitoring code is already in Perl, I would like to keep it there. Thanks!

Replies are listed 'Best First'.
Re: Distributed URL monitor
by redhotpenguin (Deacon) on Jan 28, 2005 at 03:15 UTC
    The Spread::Session module is a Perl API to the Spread toolkit. It might be worth taking a look at these tools to add message-passing functionality to your existing monitoring codebase; a rough sketch follows below.

    Update: reworded slightly to make a bit more sense.
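
    For illustration, here is a minimal sketch of how monitors could share status messages over Spread, using the lower-level Spread module that Spread::Session is layered over. The daemon address, group name, and message format are all assumptions for the example, and a real monitor would obviously loop and act on what it receives:

    use strict;
    use Spread qw(:SP :MESS);

    # connect to a local spread daemon (assumed to listen on port 4803)
    my ($mbox, $private_group) = Spread::connect({
        spread_name  => '4803@localhost',
        private_name => 'monitor1',
    });

    # every monitor joins the same group
    Spread::join($mbox, 'url_monitors');

    # announce a suspected outage to the other monitors
    Spread::multicast($mbox, SAFE_MESS, 'url_monitors', 0,
                      'SUSPECT http://www.example.com');

    # wait up to 10 seconds for another monitor's report
    my ($service_type, $sender, $groups, $mess_type, $endian, $message)
        = Spread::receive($mbox, 10);
    print "got '$message' from $sender\n" if defined $message;

    Spread::disconnect($mbox);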

Re: Distributed URL monitor
by bart (Canon) on Jan 28, 2005 at 09:26 UTC
    Just an idea: you could borrow a trick from the Sub7Server trojan and build something on top of IRC (though of course not on top of a trojan/zombie!).

    See the article "The Attacks on GRC.COM", in particular the bottom quarter of the article, for how it might work: run an IRC client on every machine, connect them all to a central IRC server, and you can send commands over the shared IRC channel, have every client run the test, and have each one report back over the same channel; a rough sketch follows below.

    There's no reason why an idea that's so easily abused for evil purposes couldn't be put to good use.
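
    Very roughly, each monitor's IRC client could look something like this sketch using POE::Component::IRC; the server name, channel, and the "CHECK <url>" command convention are all invented for the example:

    use strict;
    use POE qw(Component::IRC);
    use LWP::Simple;

    # connect this monitor to a central IRC server (names are placeholders)
    my $irc = POE::Component::IRC->spawn(
        nick   => "monitor$$",
        server => 'irc.internal.example.com',
        port   => 6667,
    );

    POE::Session->create(
        inline_states => {
            _start => sub {
                $irc->yield(register => 'all');
                $irc->yield(connect  => {});
            },
            irc_001 => sub {              # connected: join the shared channel
                $irc->yield(join => '#urlmon');
            },
            irc_public => sub {           # a command arrived on the channel
                my ($who, $where, $what) = @_[ARG0 .. ARG2];
                if (my ($url) = $what =~ /^CHECK\s+(\S+)/) {
                    my $status = defined get($url) ? 'UP' : 'DOWN';
                    $irc->yield(privmsg => $where->[0], "$status $url");
                }
            },
        },
    );

    POE::Kernel->run();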

Re: Distributed URL monitor
by saintmike (Vicar) on Jan 28, 2005 at 06:16 UTC
    I wonder what benefit distributed monitors provide over, say, running a single monitor and accumulating consecutive errors until a certain threshold is reached, before firing off an email to alert whoever's watching.

    Let's say there's a network glitch between Monitor1 and the target, but not between Monitor2 and the target. It's quite likely that the problem also affects the network between Monitor1 and the watcher. So Monitor1 sees the error but can't get through to the watcher to report it, while Monitor2 sees no problem at all. What do you do in that case? How do you decide whether it's the monitor's fault, the network's, or the target's?
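
    For comparison, a single-monitor threshold check can be as simple as something like this (a hypothetical sketch; the URL, interval, threshold, and alert mechanism are all placeholders):

    use strict;
    use LWP::Simple;

    my $url       = 'http://www.example.com';
    my $threshold = 3;                     # consecutive failures before alerting
    my $failures  = 0;

    while (1) {
        if (defined get($url)) {
            $failures = 0;                 # a success resets the counter
        }
        elsif (++$failures >= $threshold) {
            send_alert("$url unreachable after $failures checks");
            $failures = 0;
        }
        sleep 60;                          # check once a minute
    }

    sub send_alert {
        my ($msg) = @_;
        warn "ALERT: $msg\n";              # stub; replace with real mail code
    }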

      I think this is a good solution. It's simpler, and simpler usually means better. You could still set up multiple monitors, but they would run independently of each other. Each could send out an email saying, "Alert: Target 1 Unreachable", and if you receive more than one such email then you can be pretty sure there is a problem.

      You probably already know this, but the simplest way to retrieve information from a URL is like this:

      use strict;
      use LWP::Simple;

      my $url  = 'http://www.google.com';
      my $page = get($url);   # returns undef if the fetch fails

      -----------------------------------
      Washizu
      Odd Man In: Guns and Game Theory

Re: Distributed URL monitor
by Qiang (Friar) on Jan 28, 2005 at 07:45 UTC
    The easiest way I can think of is to use email to send messages about problems from one monitor to another. That way you don't have to implement anything extra beyond a script that parses the email and picks out anything useful.
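
    For instance, sending the notification from one monitor to another could be as little as this (mail host and addresses are placeholders), with a small script or procmail rule on the receiving side to parse the subject line:

    use strict;
    use Net::SMTP;

    my $smtp = Net::SMTP->new('mail.example.com')
        or die "can't connect to mail host";
    $smtp->mail('monitor1@example.com');
    $smtp->to('monitor2@example.com');
    $smtp->data();
    $smtp->datasend("To: monitor2\@example.com\n");
    $smtp->datasend("Subject: SUSPECT http://www.example.com\n");
    $smtp->datasend("\n");
    $smtp->datasend("monitor1 could not reach http://www.example.com\n");
    $smtp->dataend();
    $smtp->quit;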

    Another way would be a web service, e.g. SOAP.
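
    With SOAP::Lite, for example, each monitor could expose a simple check method that its peers call when they suspect an outage. The port, package name, and method below are all invented for the sketch:

    # server side: run on every monitor
    use strict;
    use SOAP::Transport::HTTP;

    package UrlCheck;
    use LWP::Simple;

    sub is_up {
        my ($class, $url) = @_;
        return defined get($url) ? 1 : 0;
    }

    package main;

    SOAP::Transport::HTTP::Daemon
        ->new(LocalPort => 8001)           # arbitrary port
        ->dispatch_to('UrlCheck')
        ->handle;                          # blocks, serving peer requests

    # client side (separate script): ask a peer monitor to confirm
    # use SOAP::Lite;
    # my $peer_says_up = SOAP::Lite
    #     ->uri('http://example.com/UrlCheck')
    #     ->proxy('http://monitor2.example.com:8001/')
    #     ->is_up('http://www.example.com')
    #     ->result;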

Re: Distributed URL monitor
by paulbort (Hermit) on Jan 28, 2005 at 19:02 UTC

    Can you back up a step? Instead of saying "I need to make a distributed URL monitor" can you say "I need a distributed URL monitor"? This opens up the option of downloading a canned package, like Big Brother or Nagios. (Disclaimer: I have no affiliation with either.)

    Aside from not having to write the code, this method also has the advantage of being fairly well tested.


    --
    Spring: Forces, Coiled Again!
      I agree with your point about trying to use something canned - it would certainly make my job easier! Actually, we are running Nagios for other purposes in our environment, but as far as I know neither it nor Big Brother does what I need. Fault tolerance is key, since intermittent network connectivity on one monitoring node does not necessarily mean that the URL is down, or that an alert should be sent. To reduce false positives, there needs to be some sort of messaging system that relays alerts across monitors (and perhaps triggers a re-check) to make sure everyone is seeing the same thing.

      Not to downplay the idea of looking for something canned, but at the moment I haven't found anything already written that provides this sort of functionality.