raflach has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,
I wrote a small code distribution script for N-tier applications that uses Net::Telnet to back up remote servers before pushing the code, and to unpack code.tar after it has been pushed. This worked great for a good long time, but now suddenly, with no changes to my code, the script occasionally dumps core during the backup portion. It doesn't happen on the same server every time, or even on every run, so I suspect some kind of timeout problem is causing the core dump. If anyone can give me an idea of what might be causing this, I would be most appreciative. The specific code in question follows. (Feel free to e-mail responses to robert.a.flach@mail.sprint.com as well as, or instead of, posting them here if my question is not of general interest.)

code snippet
foreach $serv (@servers) {
   $ses = new Net::Telnet(-host => $serv,
           -timeout => 1000,
           -errmode => sub { wait; } );
   $logtxt = "Now telneting to server $serv";
   &logit;
   $logtxt = "Starting backup process for server $serv";
   &logit;
   $ses->login($user,$pass);
   $hold = $ses->cmd("cd /$settings[6]");
   $logtxt = "Directory /$settings[6] Entered";
   &logit;
   @hold = $ses->cmd("tar cvf archives/$settings[0]_${time}_bak.tar $settings[0]");
   $logtxt = "Backup File Created:";
   &logit;
   @hold = $ses->cmd("compress archives/$settings[0]_${time}_bak.tar");
   $logtxt = "Backup File Compressed:";
   &logit;
   $ses->close;
   $logtxt = "Backup for Server $serv Complete";
   &logit;
   $ses->DESTROY;
   $logtxt = "Telnet session ended";
   &logit;
}
End of Code Snippet
P.S. I couldn't get the brackets to display properly around the numbers on the arrays, but they are there.

Replies are listed 'Best First'.
Re: Net::Telnet core dumps
by chromatic (Archbishop) on Apr 19, 2000 at 19:55 UTC
    Hmm, the examples in the perldoc for Net::Telnet suggest opening sessions like so:
    my $ses = new Net::Telnet(); $ses->open(Host => $hostname, Port => $port);
    Many of them don't use close() (but it's a good idea). None of them that I've seen use DESTROY. Perl should call this automatically for you at the end of the block, when $ses goes out of scope.

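    You can debug timeout errors with the input_log() and dump_log() methods. Each takes a filename (or filehandle) and starts logging to it: the first records the input received from the remote session, and the latter records all I/O, unfiltered, in dump format. Something like:

    $ses->dump_log("dump.log");
    might do the trick (the file name is just an example).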
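As a hedged sketch of the logging suggestion: in Net::Telnet, input_log() and dump_log() are switched on with a filename right after the session object is created, and the logs are then read from disk after a failure. The helper name enable_debug_logs and the log file names below are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::Telnet): turn on both debug logs
# for a session object. The file names are illustrative assumptions.
sub enable_debug_logs {
    my ($ses, $serv) = @_;

    my $input_file = "telnet_input_$serv.log";   # input received from the remote side
    my $dump_file  = "telnet_dump_$serv.log";    # all I/O in dump format

    $ses->input_log($input_file);
    $ses->dump_log($dump_file);

    return ($input_file, $dump_file);
}
```

Called as enable_debug_logs($ses, $serv) before login(), this should leave a trail on disk showing the last thing exchanged before a crash.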
      Hmmm... If I wasn't getting core dumps, but some other type of non-crashing failure, these might work for me, but I don't think that they will do any good in my situation. I hope I am wrong. Am I?
      Note:
      Net::Telnet defaults to the standard telnet port if no port is specified, which is why I don't specify one. The port option is there to allow the module to be used for things other than standard telnet, like MUDs, or even completely non-standard network communications.
      The script dumped core constantly when I didn't specify a timeout in the initial code, which is why I added that param, but increasing it above 1000 doesn't seem to have any effect one way or the other.
      I can't remember why I added the errmode param, and it may not be needed.

      My error is occurring prior to the call to DESTROY, so I'm fairly sure that is not the issue, but it may be superfluous code, so I'll take it out, test it, and see.
      Thanks for your help.
        I'd get rid of the errmode, anyway, or at least change it to a simple return call. That way, if there is an error, you can call errmsg() and see what it is.
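A minimal sketch of that suggestion, assuming the session is created with Errmode => 'return' so that a failed call returns false instead of running a handler. The helper name run_step is hypothetical; cmd() and errmsg() are Net::Telnet methods, and the String/Output named arguments to cmd() are used here so that a command with no output (like cd) can still be told apart from a failure.

```perl
use strict;
use warnings;

# Hypothetical helper: run one remote command and report what went wrong.
# $ses is assumed to be a Net::Telnet object created along the lines of
#   Net::Telnet->new(Host => $serv, Timeout => 1000, Errmode => 'return');
sub run_step {
    my ($ses, $command) = @_;

    my @output;
    # In scalar context cmd() returns true on success; the command's
    # output lines are stored in @output through the Output reference.
    my $ok = $ses->cmd(String => $command, Output => \@output);

    return { ok => 1, output => \@output } if $ok;
    return { ok => 0, error => scalar $ses->errmsg };
}
```

Each step in the loop could then become something like my $r = run_step($ses, "cd /$settings[6]"); with $r->{error} written to the log whenever $r->{ok} is false.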

        I've seen signal handlers die randomly sometimes. (In the perlipc manpage, it says that some system libraries aren't reentrant, and trying to do too much in a sighandler can cause nasty things like coredumps.) That doesn't look to be the case in Telnet.pm, as far as I can see.

        wait just looks wrong. Why would you need to wait for a CHLD signal?

Re: Net::Telnet core dumps
by btrott (Parson) on Apr 19, 2000 at 21:41 UTC
    That's interesting that it actually core dumps. Can you load the core into a debugger and do a stack backtrace? Perhaps that would give you a clue, or at least something you could post somewhere?

    The things that usually cause core dumps in Perl are, I think:

    • out of memory issues
    • flaky builds of Perl
    • signal handling
    Signal handlers are buggy in Perl and they cause core dumps, quite often. And Net::Telnet is probably using SIGs to handle the timeouts, so... perhaps there's something to build on, there?
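If signal handling is the suspect, the usual precaution (the one the perlipc manpage describes) is to do as little as possible inside the handler, typically just setting a flag, and to do the real work in the main flow. The sketch below illustrates that pattern with SIGALRM; the variable and sub names are made up, and it says nothing about what Net::Telnet does internally.

```perl
use strict;
use warnings;

my $timed_out = 0;

# Keep the handler trivial: record that the signal arrived, nothing more.
$SIG{ALRM} = sub { $timed_out = 1 };

# Hypothetical main-flow check: any logging or cleanup belongs in the
# caller of this sub, outside the signal handler.
sub check_timeout {
    return 0 unless $timed_out;
    $timed_out = 0;    # reset so the next alarm can be detected
    return 1;
}
```

The main loop would call check_timeout() between steps and decide there whether to log, retry, or give up.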

    One other thing, related sort of to the second item in the above list: did you recently update/reinstall/rebuild your Perl?

      Didn't rebuild Perl.
      The script is running on a Sun server with 4 processors and over a gig of memory, with few users and not many daemons, so it's not a memory issue.
      I'll see if I can figure out how to examine the core in a debugger.
      Does anyone perhaps have a more elegant solution for accomplishing what I'm trying to do than Net::Telnet?
        Well, you could always have the servers run the script regularly themselves (as a cron job) and then have them email the results, or even telnet back. You could also have the main server send mail to a specific account on each remote server, and have the remote servers (assuming they can accept mail) run the backup program when mail from a certain name and IP arrives.
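As a rough sketch of the cron idea: if each server runs its own backup, the Telnet session disappears and the job becomes a couple of local commands. The sub names, the directory layout, and the timestamp format below are assumptions modeled on the tar and compress commands in the original post.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: build the two local commands (as argument lists)
# that mirror the remote tar/compress steps from the original script.
sub backup_commands {
    my ($app, $stamp) = @_;
    my $tarball = "archives/${app}_${stamp}_bak.tar";
    return (
        [ 'tar', 'cvf', $tarball, $app ],
        [ 'compress', $tarball ],
    );
}

# Hypothetical driver, meant to be started from cron on each server.
sub run_backup {
    my ($base_dir, $app) = @_;
    chdir $base_dir or die "Cannot chdir to $base_dir: $!";
    my $stamp = strftime('%Y%m%d%H%M%S', localtime);
    for my $cmd (backup_commands($app, $stamp)) {
        # List-form system() bypasses the shell; non-zero status means failure.
        system(@$cmd) == 0 or die "@$cmd failed with status $?";
    }
    return 1;
}
```

A small script on each server would call run_backup with that server's base directory and application name. Because it dies on failure, the error text ends up in the job's output, which cron normally mails to the job's owner.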

        P.S. Make a left bracket by typing [ and a right one by typing ]