MatheusAA has asked for the wisdom of the Perl Monks concerning the following question:

Hey guys, this is pissing me off. I've written a Perl script which is a crawler (something that searches for information on web sites). After running for some time, the script just freezes and does nothing else. I don't even get any error messages.

I think it's a memory issue, because the crawler stops just after a line like this:

$fileContent = ` cat $DIR_INPUT/$FILE `;

I'd really, really appreciate it if anyone could help.

Thanks in advance.

UPDATE:

MidLifeXis, it stops right after that line. If I try to print the var $fileContent after the cat command, I get an incomplete HTML page source.

Here's some more code:

$DIR_INPUT="./CAPTURE";
while($i <= $npages){
    $FILE = $i . $IN_FILE;
    &wget($URL . $PAG . $i, $FILE, '', ''); # sub that calls wget
    &capture($FILE);
}
...
sub capture{
    my $FILE = $_[0];
    $fileContent = ` cat $DIR_INPUT/$FILE `;
    print $fileContent; # Here's where it stops
    ...
}

As I said before, after some iterations of the while loop, the program suddenly stops, and the print I get on screen is an incomplete page source.

Replies are listed 'Best First'.
Re: Perl Script stops in middle of execution, no error messages
by Perlbotics (Archbishop) on Nov 11, 2008 at 17:53 UTC

    Hi, something you could try:

    • Use of undefined/empty/mistyped variable? → add use strict; use warnings; at the beginning of your program and run again
    • Reading from STDIN? → enter Ctrl-D (Ctrl-Z on Windows) and see if the script continues
    • What is read? → add a print "Reading: ($DIR_INPUT/$FILE)\n"; before the cat call
    • Whitespace or special shell-characters in filename? → use `cat "$DIR_INPUT/$FILE" `
    • Reading from FIFO? → check if the file you're reading from is a regular one (e.g. warn("not a plain file - $FILE!") unless -f "$DIR_INPUT/$FILE";)
    Seeing something unusual?
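
    The checks above could be combined into one helper, roughly like the sketch below. The name read_page_file and its two arguments are made up for illustration, and it assumes a Unix-like system with cat on the PATH, as in the original script.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not from the original script): run the checks
    # from the list above before handing the file to cat.
    sub read_page_file {
        my ($dir, $file) = @_;
        my $path = "$dir/$file";

        # Show exactly what is about to be read.
        print STDERR "Reading: ($path)\n";

        # A FIFO, a device or a missing file is a likely reason for cat
        # to hang or to return nothing.
        unless (-f $path) {
            warn "not a plain file - $path!\n";
            return;
        }

        # The list form of a pipe open runs cat without a shell, so
        # whitespace or shell characters in the name reach cat untouched.
        open(my $cat, '-|', 'cat', $path) or die "cannot run cat: $!";
        local $/;                       # slurp mode: read everything at once
        my $content = <$cat>;
        close($cat) or warn "cat exited with status $?\n";
        return $content;
    }
    ```

    In the capture sub this would be called as $fileContent = read_page_file($DIR_INPUT, $FILE); an undefined result then means the path was not a regular file.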

    Update: Do it all with Perl - forget the cat and slurp the file; see How can I read in an entire file all at once?
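
    A minimal sketch of the slurp approach, assuming the file is small enough to hold in memory. The helper name slurp_file is made up for illustration.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: read a whole file into one string, no cat needed.
    sub slurp_file {
        my ($path) = @_;
        open(my $fh, '<', $path) or die "cannot open $path: $!";
        local $/;                 # undefine the record separator for this scope
        my $content = <$fh>;      # a single read now returns the entire file
        close($fh);
        return $content;
    }
    ```

    The cat line would then become $fileContent = slurp_file("$DIR_INPUT/$FILE"); and a missing or unreadable file stops the script with a message from $! instead of quietly producing an empty string.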
    Update2: If your environment/requirements allow it: Forget the slurp and read the page directly into $fileContent using e.g. LWP::Simple...
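
    A minimal sketch of the LWP::Simple route, assuming the module is installed. The wrapper name fetch_page is made up for illustration; get() is the function LWP::Simple exports, and it returns undef when the request fails.

    ```perl
    use strict;
    use warnings;
    use LWP::Simple qw(get);

    # Hypothetical wrapper: fetch a page straight into a string.
    sub fetch_page {
        my ($url) = @_;
        my $content = get($url);        # undef if the request failed
        warn "could not fetch $url\n" unless defined $content;
        return $content;
    }
    ```

    With this, the loop could call something like $fileContent = fetch_page($URL . $PAG . $i); so neither the external wget nor the cat call is needed, and a failed download shows up as an undefined value rather than a truncated file.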

      Thank you everyone. I still don't know if it was something related to Perl itself, or to the network. Anyway, special thanks to Perlbotics for the tip about LWP::Simple. It is helping me a lot and seems to have solved the problem. cya
Re: Perl Script stops in middle of execution, no error messages
by MidLifeXis (Monsignor) on Nov 11, 2008 at 17:12 UTC

    Is it stopping just after, or on that line? For example, if $DIR_INPUT eq "/dev" and $FILE eq "zero", this could happen, since cat /dev/zero never reaches end-of-file. Do you have a print statement before and after the statement you think is having problems, to verify that this line is actually the culprit? It is hard to tell what is happening without the context.

    Other than that, please include some more code so that there is some context.

    --MidLifeXis

Re: Perl Script stops in middle of execution, no error messages
by MidLifeXis (Monsignor) on Nov 11, 2008 at 17:52 UTC

    Try this:

    sub capture{
        my $FILE = $_[0];
        warn "FILE=$FILE";
        $fileContent = ` cat $DIR_INPUT/$FILE `;
        warn "AFTER=$FILE";
        print $fileContent; # it stops here
    }

    Also, are you seeing the incomplete page source in your browser, or are you running the script from the command line? If from the browser, check your error log. There is probably information in that file that lists what is going wrong. If a die or something failed and killed the program, this might give a clue where the problem lies.

    You could also add select STDOUT; $|=1; to your code so that everything is flushed to STDOUT immediately; otherwise you may be suffering from buffering.
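
    A small sketch of that setting, written with the IO::Handle method form, which has the same effect on STDOUT as select STDOUT; $|=1;. The printed strings are placeholders.

    ```perl
    use strict;
    use warnings;
    use IO::Handle;   # provides the autoflush method on filehandles

    # Turn off output buffering on STDOUT so each print is written at once
    # and shows up in order relative to the warn messages on STDERR.
    STDOUT->autoflush(1);

    print "before the cat call\n";   # written immediately, not held in a buffer
    warn  "FILE=example\n";          # STDERR is unbuffered by default
    ```

    Without this, output that was printed but still sitting in the buffer when the script hangs can make it look as if the page source was cut short.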

    --MidLifeXis

Re: Perl Script stops in middle of execution, no error messages
by swampyankee (Parson) on Nov 11, 2008 at 17:34 UTC

    Use HTML's <p></p> tags to put in linefeeds; it's explained in Markup in the Monastery. Also, please wrap code samples in <code> and </code> tags; this will make the code easier to read.

