Hi,
I have a really buggy server that crashes a couple of times a day; notably, the process itself never dies. I analyzed the error logfile and found that lines matching at least two regular expressions precede a hangup of the server. Now comes the problem: the server's log rotation does not work as intended. Once a day it rotates the error logfile from errorlog to e.g. errorlog.04072002, but always at a different time. Worse, at times it does not create a new errorlog file but continues to write to the renamed logfile! So I have to check whether an errorlog exists, knowing that I have to reopen the filehandle if the logfile rotates correctly. My script runs permanently in the background, checking continuously (like tail -f).

And here it is, my current version of the check script, which stops and then starts the server if one of the regexes is found in the errorlog. I still have problems with it, because after one restart a lot of further restarts follow. I am pretty sure it could be done better, so that I can guarantee that downtimes are minimized and, of course, are not provoked by the check script itself ;-)
Awaiting your (tested) solutions with much gratitude,

#!/usr/bin/perl
use strict;

my $debug      = 1;
my $serverinst = "/opt/myserver";
my $CLIENTLOG  = "$serverinst/logs/errorlog";
my $logfile    = "/opt/mon/check.log";
my $pidfile    = "/opt/mon/check.pid";
my $errpatt1   = "Internal Error";
my $errpatt2   = "Error 0x0c543 con";
my $polltime   = 10;    # poll for new file every x seconds

open (PIDFILE,">$pidfile") or die "Couldn't open $pidfile: $!\n";
print PIDFILE "$$";
close PIDFILE;

open (CL,"$CLIENTLOG") or die "Couldn't open $CLIENTLOG: $!\n";
my $inode = (stat $CLIENTLOG)[1];

### read to end of file
seek (CL, 0, 2);
logit(" ### Started $0 with PID $$ ###\n");

LINE: while(1) {
    logit ("reading") if $debug;
    while (<CL>) {
        logit ("read one line");
        if (/$errpatt1/ or /$errpatt2/) {
            wsrestart();
        }
    }
    if ( -f $CLIENTLOG) {
        if (get_inode($CLIENTLOG) != $inode) {
            logit ("inode changed to [$inode]");
            unless (open (CL, $CLIENTLOG)) {
                logit ("ERROR opening $CLIENTLOG: $!");
                die "ERROR opening $CLIENTLOG: $!\n";
            }
            $inode = get_inode($CLIENTLOG);
            next LINE;
        }
    }
    logit ("sleeping") if $debug;
    sleep ($polltime);
    seek (CL, 0, 1)
}

sub logit {
    my $message = shift;
    open (LOG, ">>$logfile") or die "Couldn't open $logfile: $!\n";
    print LOG localtime(time())."\t$message\n";
    close LOG;
}

sub get_inode {
    return (stat $_[0])[1];
}

sub wsrestart {
    system("$serverinst/stop");
    logit("Server instance $serverinst stopped");
    sleep(3);
    system("$serverinst/start");
    logit("Server instance $serverinst started");
}
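
One possible reason for the follow-up restarts: the inner read loop calls wsrestart() for every matching line, so a burst of error lines (including any written while the server is shutting down) would trigger one restart each. This is only a guess and not verified against the real server. A minimal, untested-in-production sketch of a cooldown guard follows; the names $cooldown, $last_restart and restart_allowed are hypothetical and not part of the script above, and the 60 seconds is an arbitrary value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $cooldown     = 60;   # hypothetical: minimum seconds between two restarts
my $last_restart = 0;    # epoch time of the most recent accepted restart

# Returns 1 (and records the time) only if the cooldown has elapsed
# since the last accepted restart; otherwise returns 0.
sub restart_allowed {
    my ($now) = @_;
    return 0 if $now - $last_restart < $cooldown;
    $last_restart = $now;
    return 1;
}

# Demo with fixed timestamps instead of time(), so the outcome is predictable:
# a burst of matches at 1000, 1005 and 1059 should cause only one restart.
my @decisions = map { restart_allowed($_) ? "restart" : "skip" }
                (1000, 1005, 1059, 1060, 1200);
print join(",", @decisions), "\n";
```

In the check script, the call inside the read loop could then become wsrestart() if restart_allowed(time()); so that further matching lines within the cooldown window are read but ignored.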

The waiting ones may be too late,
the impatient could soon have nothing worth to wait for.

-r

In reply to File tracking by r
