qadwjoh has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm writing a set of CGI scripts that run a Perl script on a clustered Win2K server setup consisting of 2 nodes and a shared drive.

I've been told that my program must be able to fail over from one node to the other, should anything happen to one of them. Can anyone explain or point me in the direction of how to write a program to deal with this, and how to program for clusters in general?

Thanks,
A

Re: CGI script on cluster server
by ant9000 (Monk) on Jul 10, 2003 at 14:38 UTC
    There are no magic ideas in programming for a cluster the way you need here: the point is just that shared data should be accessed in a safe way (think about file locking), and that long-running processes should save their state as often as possible and be able to restart their work from the saved state when needed. Those are indeed Very Good Ideas (tm) anyhow, so you should probably think about implementing them in every CGI you write...
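    The "safe access" point above can be sketched with Perl's built-in flock. This is a minimal illustration, not anyone's production code; the counter file name is made up:

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);

# Increment a counter in a file shared by both nodes.  flock serializes
# access, so two CGI processes running at once never see (or write)
# a half-updated value.
sub bump_counter {
    my ($file) = @_;
    sysopen my $fh, $file, O_RDWR | O_CREAT or die "open $file: $!";
    flock $fh, LOCK_EX or die "flock: $!";   # blocks until we hold the lock
    my $count = <$fh>;
    $count = defined $count ? $count : 0;    # freshly created file is empty
    chomp $count;
    $count++;
    seek $fh, 0, 0;
    truncate $fh, 0;
    print $fh "$count\n";
    close $fh;                               # closing releases the lock
    return $count;
}
```

    The same pattern (take the lock, read, modify, write, release) applies to any state file you keep on the shared drive.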
    If you can live with one process being terminated on server1 and a fresh one being started in its place on server2, without the two processes needing to exchange any state info, then your only problem is making sure that if something goes wrong and your program terminates abnormally, no havoc is left around on the system. Again, this is something you really should do in any case...
    Just my 2c!
Re: CGI script on cluster server
by jmanning2k (Pilgrim) on Jul 10, 2003 at 14:56 UTC
    If you want failover, then you probably need to look at an external program to manage that. Heartbeat is one such program.
    This handles one server freezing, crashing, etc.

    Unfortunately this isn't much help for you, as I don't think it runs on Windows yet.
    MS has a clustering solution, but it's only on Win2k Advanced Server.

    Unless you're already running Advanced Server and can write a COM+ Perl script, your best solution may be to take the ideas behind heartbeat and adapt them to your script.
    Open up a TCP socket, and have the programs ping each other back and forth. Typically the secondary will ping the primary. If the primary doesn't respond, then the secondary should take over.

    A few issues to watch out for:
    - The primary stops (crashes, freezes, or the program dies).
    - The primary stops, but comes back up after the secondary takes over (the secondary needs to stop, or the primary needs to realize that they have switched roles).
    - Depending on the application, you will have to make sure that the switch from primary to secondary is invisible to the user; this usually involves IP address changes or DNS tricks.
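    The ping-back-and-forth idea can be sketched with IO::Socket::INET. This is only the secondary's side of the check; the host name, port, and takeover action are placeholders for whatever your setup actually uses:

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Returns true if the primary answers on its heartbeat port.
# Host and port here are placeholders, not real values.
sub primary_alive {
    my ($host, $port) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => 2,        # don't hang forever if the node is frozen
    );
    return 0 unless $sock;    # connect failed: primary looks dead
    close $sock;
    return 1;
}

# The secondary's loop would look roughly like this, where take_over()
# is your own failover action (start the service, grab the IP, etc.):
#
#   while (1) {
#       take_over() unless primary_alive('primary.example.com', 7777);
#       sleep 5;
#   }
```

    In a real deployment you'd want a couple of failed pings in a row before taking over, so a single dropped connection doesn't trigger a false failover.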


      Open up a TCP socket, and have the programs ping each other back and forth. Typically the secondary will ping the primary. If the primary doesn't respond, then the secondary should take over.
      Actually, couldn't you write a different script that does the pinging, and then fires up the needed process on one system or the other?

      They'd each be identical. If the process is running, no new action. If it isn't, ping the identical script on the other machine. If the other machine isn't there, start the process on the current machine.

      If both scripts were running at once, one of them would have had to start the process, and once it started, the process is running. Each freezes its current working data in a known location; if the process fails on one machine, that script tells the other to start its process up with the saved working data. If a script gets no ping and no process is running, it looks in the known place for the last set of working data and uses that. If there's no last set, it just starts the process fresh.
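      The decision logic described above can be boiled down to a few lines. This sketch takes the checks as callbacks so it's testable on its own; process_running, peer_alive, and the state file path are all placeholders:

```perl
use strict;
use warnings;

# One pass of the watchdog's decision logic.  Returns what the watchdog
# should do this round; the caller supplies the actual checks.
sub watchdog_step {
    my (%check) = @_;
    return 'noop'    if $check{process_running}->();  # all is well here
    return 'standby' if $check{peer_alive}->();       # the peer will run it
    # Peer is gone too: restart here, from saved state if there is any.
    return -e $check{state_file} ? 'resume' : 'start_fresh';
}
```

      A real watchdog would wrap this in a loop with a sleep, and `resume` would load the frozen working data from the shared drive before restarting the process.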

      Just my $0.02...

      -----------------------
      You are what you think.

Re: CGI script on cluster server
by ozzy (Sexton) on Jul 10, 2003 at 23:45 UTC
    I do not believe you have to do any special programming to accomplish this. If the CGI script is compiled as an executable (as any program) and run as a CLUSTER RESOURCE, it doesn't really matter whether the resource is active on NODE 1 or NODE 2 (assuming both nodes are mirrored and clustering is active/passive). Theoretically this sounds great.
      Thanks for all the help guys.

      Basically the set up is this: we have a cluster of 2 nodes and a shared drive on which my CGI scripts will sit. The idea behind this being that if one node goes down, the other can take over - load balancing is not a concern.

      So if node A is in the middle of running my CGI script and goes down for some reason, how does node B take over? Does it automatically re-run my script from the beginning with the same parameters, leaving me to co-ordinate this? Or can a clustered server co-ordinate the second node such that it automatically takes over at the point of failure?

      thanks,
      A

        What you are asking about I believe is actually beyond the scope of this on-line community's regular discussion.

        However, let me say this about what you are asking about since I have some small experience with what you are talking about.

        Perl CGI is not going to have any mechanism built into it for clustering and recovery of "in-flight" transactions. If you lose a cluster node in the middle of the execution of something as transient as CGI code, sorry, it is going to be lost.

        You can do stuff though to help in recovery of the user experience. What I'm going to suggest is not all inclusive, but will help.

        If your scripts have some notion of a "session" and you use some sort of back-end persistence storage (shared, of course) to store the state of your session, along with enough hints, then at least your user can hit "refresh" and continue their session. However, this is not foolproof and should be implemented carefully with proper session timeouts etc.

        You will notice I have not mentioned cookies. These can be used as well for sessioning and providing the CGI with a recovery point. My personal preference is to not use them but to code a SESSION_ID in my HTML code that is feeding the CGI.
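        A minimal sketch of that approach, using Storable to keep session state on the shared drive so either node can pick it up after a failover. The directory layout and field names here are illustrative only:

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Session files live on the shared drive (path is a placeholder),
# so whichever node serves the next request can find them.
my $SESSION_DIR = 'sessions';

sub save_session {
    my ($id, $state) = @_;
    die "bad session id" unless $id =~ /^\w+$/;  # never trust CGI input
    store($state, "$SESSION_DIR/$id.sess");
}

sub load_session {
    my ($id) = @_;
    die "bad session id" unless $id =~ /^\w+$/;
    my $file = "$SESSION_DIR/$id.sess";
    return -e $file ? retrieve($file) : undef;
}

# Instead of a cookie, the SESSION_ID rides along in the HTML form:
#   print qq{<input type="hidden" name="SESSION_ID" value="$id">\n};
```

        On each request the script loads the session by ID, does its work, and saves the state back before printing the page, so a "refresh" after a node failure lands the user where they left off.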

        Hope this helps.


        Peter L. Berghold -- Brewer of Belgian Ales
        Peter@Berghold.Net -- www.berghold.net
        Unix Professional
Re: CGI script on cluster server
by exussum0 (Vicar) on Jul 10, 2003 at 20:27 UTC
    Well, if your nodes are what contain the CGI program, and the servers themselves manage who is active, inactive, and various other statistics such as load...

    Think of it as writing two programs that share resources on one machine. All clusters do is distribute load, so if your program can't work on one machine with two copies running, then two machines running a copy each won't work.

    As in regular multi-process/thread code, you may have to worry about locking of resources, whether it's a db, a file, or anything else...