2501 has asked for the wisdom of the Perl Monks concerning the following question:

I am on an NT machine where many people are coding many Perl scripts. We have the suspicion that there is a "runaway" script which gets run on occasion, loops infinitely, and eventually crashes the machine. This bad script has proven to be most elusive, because it is most likely being run as a CGI.
Is there a way to tweak the call to perl so that it logs which scripts get called and when they end successfully, so that the next time we hit the loop we could consult the log for the culprit?
Thanks for your time,
2501

Replies are listed 'Best First'.
LogScripts.pm
by Fastolfe (Vicar) on Dec 03, 2000 at 13:13 UTC
    You may be able to re-associate Perl files in your web server or in NT with a slightly modified Perl command line: "perl.exe -MLogScripts". Then use a module like this to log the start and stop of each script:
    package LogScripts;

    # If you want high-res timekeeping:
    # use Time::HiRes qw{ time };

    my $started;

    BEGIN {
        open(F, ">>/logdir/scripts.log");
        print F localtime(time) . " $ENV{REMOTE_ADDR} - Starting $0\n";
        $started = time();
    }

    END {
        my $elapsed = time() - $started;
        print F localtime(time) . " $ENV{REMOTE_ADDR} - Stopping $0 ($elapsed seconds)\n";
        close(F);
    }

    1;   # a module must return a true value
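    Once a handful of requests have gone through, something along these lines (just a sketch, keyed to the log format written above) should point out scripts that logged a "Starting" line but never a matching "Stopping" line; those are your runaway candidates:

        use strict;

        # Tally Starting/Stopping lines per script from the LogScripts log.
        my %running;
        open(LOG, "/logdir/scripts.log") or die "Can't read log: $!";
        while (<LOG>) {
            if    (/ - Starting (.+)$/)   { $running{$1}++ }
            elsif (/ - Stopping (.+) \(/) { $running{$1}-- }
        }
        close(LOG);

        # Anything with a positive count started more often than it stopped.
        foreach my $script (sort { $running{$b} <=> $running{$a} } keys %running) {
            print "$script: $running{$script} run(s) never finished\n"
                if $running{$script} > 0;
        }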
      Thank you all!
      ++'s all around for the great answers!
      BBQ: I agree with you, but switching is not an option. Sometimes you have to deal with the tools you are given.

      $code_or_die: Very informative! I will definitely check out the software.
      Fastolfe: Perfect! I think that was exactly what I was looking for. I couldn't use $0 alone, because the only real difference between 90% of the scripts is their path (and that's about it), so I tweaked it a bit to the following:
      package LogScripts;

      # use Time::HiRes qw{ time };
      use File::Spec::Functions qw(rel2abs);

      my $started;
      my $path;

      BEGIN {
          open(F, ">>c:\\scripts.log");
          $path = rel2abs($0);
          $started = time();
          print F localtime(time) . " $ENV{REMOTE_ADDR} - Starting $path\n";
      }

      END {
          my $elapsed = time() - $started;
          print F localtime(time) . " $ENV{REMOTE_ADDR} - Stopping $path ($elapsed seconds)\n";
          close(F);
      }

      1;
      Once again,
      thank you all:) I learned a good deal from everyone.
(bbq) Re: tracking files
by BBQ (Curate) on Dec 03, 2000 at 12:34 UTC
    I suppose that the question is actually whether there is a way to log when a CGI (using ActivePerl) goes bad under NT. Correct? I guess this is another "it depends" question.

    To the best of my (limited) knowledge, this is much more of a webserver issue than it is an OS issue. Since IIS is somewhat tied up to NT (and vice-versa), consider the following:
    • IIS: The webserver spawns the process, which gobbles memory and CPU, the machine grinds to a halt, and before the script can time out or write a bad line to the error logs, the server is dead. Sound familiar? I've never found a way around it, except for:
    • Apache: According to the documentation it runs as "experimental" on NT boxes, but it beats IIS hands down when it comes to respecting the processes it spawns. Instead of using up all of the server's CPU and memory, it will promptly time out the process and return a 500 to the browser (after a while). While we're on the webserver comparison, may I also note that Apache on NT implements SSI correctly, and not that sorry excuse for server-side includes that MS named SSINC.DLL.

    Bottom line? There's probably some way to do it with IIS, but I'd drop it in favor of a better webserver if possible. If that isn't possible, there must be some way to shorten the timeout period by tweaking the registry or something of the sort; I preferred not to go there. Trying to implement this from within the CGI sounds even more troublesome. If you're going to open each and every script anyway, you might as well track down the faulty one and remove the infinite loop.

    My US$0.02.

    #!/home/bbq/bin/perl
    # Trust no1!
      I'd like to add some support to this. We're developing a rather hairy application under IIS. Essentially, it's a 15-year-old single-user DOS application that we're web-enabling. We're mucking around in this ancient code, putting XML interfaces on its API everywhere so we can get at its internal structures. Perl is the glue binding all of this together.

      Progress is slow, and failures tend to be spectacular as doing CGI and XML processing in C is No Fun (tm).

      Sometimes the backend throws an "APPLICATION ERROR" (segfaults), sometimes it just...spins off and never quite stops running, sometimes it just stops. All of these cause IIS to have hissy fits and the development machines have to be rebooted completely to get rid of the rogue processes.

      We've discovered that it's FAR easier to fire up Apache in a command window as a development environment and then port to the IIS system for testing. We wind up with better code (portability!), and since Apache is a user process, its children can all be killed with the process monitor. When things go awry, we simply click in Apache's window, hit ^C, wait, and restart it. If things bugger up completely, we kill Apache and then hunt down the errant process and kill it with the process monitor.

      There is a way to do it with IIS. You need to configure the application to run under the IWAM account rather than the IUSR account, by telling IIS to run it as a separate (isolated) process. If it crashes, it will not take down the rest of the server.

      You can use the MetaBase editor (downloadable from Microsoft's web site... do a search for "MetaEdit" in the Knowledge Base) to configure how many times the application will be restarted after a crash via the AppOopRecoverLimit key (the default is 5... set it to something much higher if you'd like).
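      If you'd rather script that than click around MetaEdit, something like this might work (untested, and it assumes the application lives under the default web site at W3SVC/1) using Win32::OLE and the IIS ADSI provider:

          use strict;
          use Win32::OLE;

          # Bind to the web application's node in the metabase.
          my $app = Win32::OLE->GetObject('IIS://localhost/W3SVC/1/Root')
              or die "Can't bind to the metabase: " . Win32::OLE->LastError;

          print "Current AppOopRecoverLimit: $app->{AppOopRecoverLimit}\n";

          $app->{AppOopRecoverLimit} = 20;   # allow more restarts before IIS gives up
          $app->SetInfo;                     # write the change back to the metabase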

      That said, take a closer look at your scripts. I find that scripts which use GD, or which use DBI with a large LongReadLen value, are more crash-prone. Scripts which use DBI and do not properly finish their statement handles are also crash-prone, especially if the SQL query sorts large amounts of data.
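      For what it's worth, the DBI habits in question look something like this (the DSN, credentials, and table names here are made up):

          use strict;
          use DBI;

          my $dbh = DBI->connect('dbi:ODBC:mydsn', 'user', 'password',
                                 { RaiseError => 1 });

          $dbh->{LongReadLen} = 32 * 1024;   # cap how much of a LONG/memo column gets fetched
          $dbh->{LongTruncOk} = 1;           # truncate rather than die on oversized values

          my $sth = $dbh->prepare('SELECT id, notes FROM tickets ORDER BY id');
          $sth->execute;
          while (my ($id, $notes) = $sth->fetchrow_array) {
              last if $id > 100;             # bailing out of the fetch loop early...
          }
          $sth->finish;                      # ...so tell DBI we are done with the handle
          $dbh->disconnect;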

Re: tracking files
by $code or die (Deacon) on Dec 03, 2000 at 23:24 UTC
    ++Fastolfe for the previous post.

    If you have multiple websites running on the NT server (i.e. you are a webhosting company or whatever), you could also try the following:

    • Make sure that all of the websites' applications are running in their own memory space (as an isolated process).
    • This lets you see the resources for each website in Task Manager: there will be lots of MTX processes (one MTX per isolated website, I think), and you will be able to see which one is consuming all the resources.
    • You won't know which MTX belongs to which website, so download the freeware HandleEx from Sysinternals.
    • HandleEx will give you information pointing to which IIS website is causing the problems.

    You will then have narrowed things down a lot. Having each website run in its own memory space is a good idea anyway, because if one website crashes, the others will not be affected.
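    If clicking through Task Manager gets tedious, a rough alternative (assuming WMI is available, which it may not be out of the box on NT 4) is to list processes by working-set size from Perl via Win32::OLE, so the hungriest MTX or perl process stands out:

        use strict;
        use Win32::OLE qw(in);

        # "winmgmts:" binds to the default WMI namespace (root\cimv2) on this machine.
        my $wmi = Win32::OLE->GetObject('winmgmts:')
            or die "No WMI here: " . Win32::OLE->LastError;

        # Collect name, pid, and working-set size for every running process.
        my @procs;
        my $set = $wmi->ExecQuery('SELECT Name, ProcessId, WorkingSetSize FROM Win32_Process');
        push @procs, [ $_->{Name}, $_->{ProcessId}, $_->{WorkingSetSize} ] for in($set);

        # Biggest memory consumers first.
        for my $p (sort { $b->[2] <=> $a->[2] } @procs) {
            printf "%-20s pid %-6d %12d bytes\n", @$p;
        }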
Re: tracking files
by Specimen (Acolyte) on Dec 05, 2000 at 04:50 UTC
    You are right, it is probably a while loop whose counter never gets incremented - in all my CGI-on-NT work with Perl this was the cause of machine seizures (apart from one occasion where someone had admirably managed to set up a situation where an object recursively extended itself, or something like that).
    Just grep for 'while' in all your code and check the blocks - there can't be too many of them, I think, and it's pretty fast to check them... good luck.
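    If grepping by hand is a pain, a little File::Find script can do the walking for you (the c:/inetpub/scripts path is just an example):

        use strict;
        use File::Find;

        # Print every line containing a while/until loop, with file name and line number.
        find(sub {
            return unless /\.(pl|cgi|pm)$/i;
            open(my $fh, $_) or return;
            while (my $line = <$fh>) {
                print "$File::Find::name:$.: $line" if $line =~ /\b(while|until)\b/;
            }
            close($fh);
        }, 'c:/inetpub/scripts');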

    Specimen
      We had a fun one once. A module defined an AUTOLOAD subroutine; the first thing it did was call a method named debug(), to report that AUTOLOAD had been called. Unfortunately, a code change left the debug() method undefined. So AUTOLOAD was called to create it. Of course, the first thing AUTOLOAD did was call debug(), which wasn't defined yet, so AUTOLOAD was called, which called debug(), which called AUTOLOAD, which... until finally it bombed out from too many levels of subroutine calls. :D

      Moral: If your AUTOLOAD sub is going to call other subroutines, make sure they've been defined!
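      For illustration, a defensive AUTOLOAD along these lines (the package and method names are made up) checks with can() before calling its helpers, so a missing debug() can't send it into orbit again:

          package Widget;
          use strict;
          use vars qw($AUTOLOAD);

          sub AUTOLOAD {
              my $self = shift;
              (my $method = $AUTOLOAD) =~ s/.*:://;
              return if $method eq 'DESTROY';   # don't try to autoload destructors

              # Only call debug() if it really exists; otherwise AUTOLOAD would be
              # invoked again to handle the missing debug(), and again, and again...
              $self->debug("AUTOLOAD called for $method") if $self->can('debug');

              die "No such method: $method";
          }

          1;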