trelane has asked for the wisdom of the Perl Monks concerning the following question:

The script below has been working well for over a year, but three weeks ago something changed on the network side, and the script now fails due to network timeouts. Can anyone suggest a way to make this script more robust?
#!/usr/bin/perl
# SNMP reset tool
#use strict;
$SIG{PIPE} = sub { print "Ignoring SIGPIPE\n"; };
chdir("/home/ossutils/snmpreset");
#Read arguments
$input = @ARGV[0];
#set up report file
$result = `date '+%Y%d%m%H%M%S'`;
chomp($result);
$result = "snmpreset-$result.log";
open(REPORT, ">>$result");
#Open sockets
use IO::Socket;
#get last log file
@lsout = `ls -1tr /u01/appl/bea/logs/log01_*`;
print "DONE FIRST LIST\n";
$cnt = scalar(@lsout);
$inlog = (@lsout[($cnt - 1)]);
open(INPUT, "$inlog");
print "start init read\n";
while (<INPUT>) {;}
INPUT -> clearerr();
print "done with initial read\n";
print "start endless look\n";
for (;;) {
    while (<INPUT>) {
        if ($_ =~ m/Reboot Device with MAC address/) {
            $mac = $_;
            $mac =~ s/^.*address: //g;
            $mac =~ s/ Thread.*//g;
            $mac1 = $mac;
            $sock = new IO::Socket::INET(
                PeerAddr => 'IP1',
                PeerPort => '5000',
                Proto    => 'tcp',
            );
            $sock1 = new IO::Socket::INET(
                PeerAddr => 'IP2',
                PeerPort => '5555',
                Proto    => 'tcp',
            );
            $sock or next; #or print "no socket :$!";
            print $sock "$mac";
            print $sock1 "$mac1";
            close $sock;
            close $sock1;
            $tm = `date '+%Y%d%m%H%M%S'`;
            chomp($tm);
            chomp($mac);
            chomp($mac1);
            print REPORT "$tm - $mac reset request sent\n";
            print REPORT "$tm - $mac1 reset request sent\n";
        }
    }
    @lsout = `ls -1tr /u01/appl/bea/logs/log01_*`;
    $cnt = scalar(@lsout);
    $inlogb = (@lsout[($cnt - 1)]);
    if ($inlog ne $inlogb) {
        close(INPUT);
        open(INPUT, "$inlogb");
        $inlog = $inlogb;
    }
    INPUT -> clearerr();
}
print "Terminating incorrectly\n";
close(INPUT);
close(REPORT);
exit;
Thanks :)

Re: session expiry
by Corion (Patriarch) on May 25, 2010 at 11:10 UTC

    You haven't told us where/how your script started to fail, so that makes it rather hard for us to suggest how to make the point of failure more resilient.

    As an aside, you might want to replace `ls ...` with File::Glob::bsd_glob or glob (and sort maybe), and `date ...` with POSIX::strftime '%Y%m%d%H%M%S' localtime, which will keep things in Perl.
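A minimal sketch of that suggestion, assuming the newest log is the one with the latest modification time (which is what `ls -1tr` followed by taking the last line selects). The helper names `newest_file` and `timestamp` are made up for illustration. Note that the `'%Y%m%d%H%M%S'` format puts the month before the day, whereas the posted script's `date` call uses `%Y%d%m`.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use File::Glob qw(bsd_glob);

# Hypothetical helper: return the most recently modified plain file
# matching a glob pattern, or undef if nothing matches.
sub newest_file {
    my ($pattern) = @_;
    my @files = grep { -f $_ } bsd_glob($pattern);
    return undef unless @files;

    # Sort by modification time, oldest first, and take the last entry.
    my %mtime  = map { $_ => (stat $_)[9] } @files;
    my @sorted = sort { $mtime{$a} <=> $mtime{$b} } @files;
    return $sorted[-1];
}

# Hypothetical helper: format a timestamp without shelling out to `date`.
# Uses the current time when no epoch value is given.
sub timestamp {
    my ($epoch) = @_;
    $epoch = time unless defined $epoch;
    return strftime('%Y%m%d%H%M%S', localtime $epoch);
}

my $log = newest_file('/u01/appl/bea/logs/log01_*');
print defined $log ? "newest log: $log\n" : "no log files found\n";
print "timestamp: ", timestamp(), "\n";
```

Besides avoiding two external processes per loop iteration, this removes the trailing newline that the backticks leave on each file name.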

      This is the output from snoop. As far as I can tell, we are trying to send the MAC to the remote server and it fails to connect to the remote port within the required time, which causes the script to fail. This is an example from a failure:
      IP1 -> sat03-ce2 TCP D=5000 S=56412 Ack=2181156585 Seq=3603286738 Len=0 Win=49640
      sat03-ce2 -> IP1 TCP D=56394 S=5000 Fin Ack=1768367342 Seq=1949488034 Len=0 Win=24820
      IP1 -> sat03-ce2 TCP D=5000 S=56394 Ack=1949488035 Seq=1768367342 Len=0 Win=49640
      sat03-ce2 -> IP1 TCP D=56398 S=5000 Fin Ack=2343943522 Seq=3248338299 Len=0 Win=24820
      IP1 -> sat03-ce2 TCP D=5000 S=56398 Ack=3248338300 Seq=2343943522 Len=0 Win=49640
      sat03-ce2 -> IP1 TCP D=56400 S=5000 Fin Ack=2222230736 Seq=3317204210 Len=0 Win=24820
      This is the example from a successful message:
      IP1 -> sat00-ce2 TCP D=5555 S=56377 Ack=2308315944 Seq=2692239599 Len=0 Win=49640
      IP1 -> sat00-ce2 TCP D=5555 S=56395 Ack=3480489146 Seq=2257027015 Len=0 Win=49640
      IP1 -> sat00-ce2 TCP D=5555 S=56413 Syn Seq=1640347427 Len=0 Win=49640 Options=<mss 1460,nop,wscale 0,nop,nop,sackOK>

        You show a network dump. What relation does that network dump bear to the program? Where does your program fail? Does your program output some error message? What have you done to determine the cause of the error message?

        There are no exit or die statements in the code you posted. How does your program "fail" then?
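One way to gather the information asked for here is to check every socket operation and record the reason when it fails. The following is only a sketch: `send_message` is a hypothetical helper, and the timeout passed to it is an assumed value, not something taken from the posted script. It relies on `IO::Socket::INET->new` returning undef and setting `$@` when the connection cannot be made.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: send one message over a fresh TCP connection.
# Returns undef on success, or a string describing which step failed.
# $timeout is the connect timeout in seconds (an assumed parameter).
sub send_message {
    my ($host, $port, $msg, $timeout) = @_;

    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    ) or return "connect to $host:$port failed: $@";

    print {$sock} $msg or return "write to $host:$port failed: $!";
    close $sock        or return "close of $host:$port failed: $!";
    return undef;
}
```

Calling this once per target and writing any returned string to the report file would show which host and which step is involved. It also checks both connections, whereas the posted loop tests only `$sock` before printing to `$sock1`.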