DigitalBoB12 has asked for the wisdom of the Perl Monks concerning the following question:

I am currently working on a project that involves crawling certain websites.

However, sometimes my Perl program gets "stuck" on a website for some reason (I can't figure out why) and freezes for hours. To get around this, I inserted some code to time out the subroutine that crawls the webpage. The problem is that, let's say I set the alarm to 60 seconds, most of the time the page will time out correctly, but occasionally the program will not time out and just sits there for hours on end (maybe forever, since I usually kill the program).

On the really bad websites the Perl program will just eat through my memory, taking 2.3GB of RAM and 13GB of swap. Also the CPU usage will be high, and my computer will be sluggish. Luckily if it times out all the resources get released quickly.

Is this my code or a Perl issue? What should I correct and why was it causing this problem?

Thanks

Here is my code:

eval {
local $SIG{ALRM} = sub { die("alarm\n") };
alarm 60;
&parsePageFunction();
alarm 0;
};#eval
if($@) {
if($@ eq "alarm\n") {
print("Webpage Timed Out.\n\n");
}#if
else {
die($@."\n");
}#else
}#if

Replies are listed 'Best First'.
Re: Perl alarm working intermittently
by almut (Canon) on Jul 13, 2010 at 18:04 UTC
    Is this my code or a Perl issue?

    It's hard to tell without knowing what exactly parsePageFunction() is doing.  For example, does it by any chance itself (or via some module it's using) modify $SIG{ALRM} under some circumstances?
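One way this can happen, sketched under assumptions: a process has only a single alarm timer, so any code that calls alarm itself (including alarm 0) replaces or cancels the caller's 60-second timer. The sub names below are hypothetical stand-ins, not the poster's actual code; the sketch relies on alarm returning the seconds left on the previous timer to show whether the outer timeout survived.

```perl
use strict;
use warnings;

# Hypothetical helper standing in for code inside parsePageFunction() or a
# module it calls; it installs its own short timeout and then clears it.
sub inner_with_own_timeout {
    local $SIG{ALRM} = sub { die "inner alarm\n" };
    alarm 5;
    # ... some work would happen here ...
    alarm 0;    # clears the process's only timer, including the caller's
    return;
}

# Returns the number of seconds that were still pending on the outer timer
# after calling $code. 0 means the outer alarm was cancelled along the way.
sub outer_seconds_left_after {
    my ($code) = @_;
    local $SIG{ALRM} = sub { die "alarm\n" };
    alarm 60;
    $code->();
    return alarm 0;    # alarm() returns the time left on the previous timer
}

print "after well-behaved code: ", outer_seconds_left_after(sub { }), "\n";
print "after clobbering code:   ", outer_seconds_left_after(\&inner_with_own_timeout), "\n";
```

The first line should report a value close to 60 and the second should report 0. If the real crawler shows the second pattern, anything that hangs after the inner call is no longer covered by the outer timeout.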

    You might also try setting PERL_SIGNALS=unsafe.
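Some background on that suggestion: since Perl 5.8, signals are "safe" (deferred), meaning the handler only runs between opcodes, so an alarm that arrives during one long-running operation is not acted on until that operation finishes. PERL_SIGNALS=unsafe has to be set in the environment before perl starts. A possible in-program alternative, shown here only as a sketch, is POSIX::sigaction, which installs an immediate (non-deferred) handler unless safe(1) is requested. The wrapper name run_with_timeout is hypothetical, and immediate handlers carry the memory-corruption risk described in perlipc.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical wrapper: run $code with a timeout of $seconds.
# Returns 1 if $code finished, 0 if it timed out; rethrows other errors.
sub run_with_timeout {
    my ($seconds, $code) = @_;

    # Install an immediate SIGALRM handler and remember the previous one.
    my $old = POSIX::SigAction->new;
    my $new = POSIX::SigAction->new(sub { die "alarm\n" });
    POSIX::sigaction(SIGALRM, $new, $old)
        or die "Cannot install SIGALRM handler: $!\n";

    my $ok = eval {
        alarm $seconds;
        $code->();
        alarm 0;
        1;
    };
    my $err = $@;

    alarm 0;                            # make sure no timer survives
    POSIX::sigaction(SIGALRM, $old);    # restore the previous handler

    return 1 if $ok;
    return 0 if $err eq "alarm\n";
    die $err;
}
```

With this in place, the original call would become something like run_with_timeout(60, \&parsePageFunction) or print "Webpage Timed Out.\n\n";. It does not help if the crawled code replaces the SIGALRM handler or resets the timer itself.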

Re: Perl alarm working intermittently
by afoken (Chancellor) on Jul 14, 2010 at 23:28 UTC

    Not really related to your problem, but it hurts my eyes:

    • &function(); may break your code. If you picked that habit up in the Perl 4 days, it is time to unlearn it. The & form bypasses prototype checks in Perl 5, and you usually don't want that. Remove the &. The & is mostly gone in Perl 5; it is only needed to take a reference to a routine (my $ref=\&somesub;) or to dereference such a reference (&$ref(...);). The latter can also be written without the & as $ref->(...);.
    • Indent your code properly. Perl works fine without indenting, but you will soon have problems understanding what happens.
    • Properly indenting your code also removes the need to comment closing brackets. It's obvious simply from the amount of indenting.
    • (Not in your code:) If you have so much code inside a pair of brackets that you can not see both the opening and the closing bracket at the same time, consider moving that code into a new subroutine.
    • (Personal style:) Don't write non-functions in a way that makes them look like functions. if is not a function, so leave a space between if and the opening parenthesis.
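The first point about & and prototypes can be illustrated with a small sketch. The sub double and the sample list are made up for the illustration and have nothing to do with the crawler code.

```perl
use strict;
use warnings;

# Hypothetical sub with a prototype: it takes exactly one scalar argument.
sub double($) { return 2 * $_[0] }

my @list = (10, 20, 30);

# Normal call: the ($) prototype puts @list in scalar context,
# so double() receives the element count (3).
my $with_proto = double(@list);

# & call: the prototype is ignored, @list is flattened,
# and double() receives the first element (10) instead.
my $without_proto = &double(@list);

# The remaining uses of &: taking a reference to a named sub ...
my $ref = \&double;
# ... which is then called through the arrow syntax.
my $via_ref = $ref->(21);

print "$with_proto $without_proto $via_ref\n";    # prints "6 20 42"
```

The same argument list gives two different results depending only on the &, which is why dropping it is the safer default.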

    Compare your code

    eval {
    local $SIG{ALRM} = sub { die("alarm\n") };
    alarm 60;
    &parsePageFunction();
    alarm 0;
    };#eval
    if($@) {
    if($@ eq "alarm\n") {
    print("Webpage Timed Out.\n\n");
    }#if
    else {
    die($@."\n");
    }#else
    }#if

    with properly indented code:

    eval {
        local $SIG{ALRM} = sub { die("alarm\n"); };
        alarm 60;
        parsePageFunction();
        alarm 0;
    };
    if ($@) {
        if ($@ eq "alarm\n") {
            print("Webpage Timed Out.\n\n");
        } else {
            die($@."\n");
        }
    }

    It doesn't really matter which indent style you use, but use one, and use it consistently. There is no perfect indent style, so find the one you like best or use the one that your team uses. (My personal preference is based on 1TBS, with only tabs used for indenting, which allows any indent width by changing the tab width.)

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)