lazybowel has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to write a script that will scrape a page, display the captcha image for me to enter, and then continue with the input. However, I've run into a problem. Here is the code I have so far...
my $page = $mech->content;
if ($page =~ /Please enter/) {
    # print $mech->content;
    # Locate the captcha image on the fetched page
    my $captchaimage = $mech->find_image( url_regex => qr/site\.com\/captcha/i );
    $captchaimage = $captchaimage->url;

    # Show the image and a form asking for the letters
    print "Content-type: text/html\n\n";
    print "Enter the letters to continue!\n"
        . "<form>\n"
        . "<img src='" . $captchaimage . "'>\n"
        . '<br>'
        . '<input type=text name=\'CAPTCHA\'>'
        . '<br>'
        . '<input type=submit value=\'Continue...\'>'
        . "\n</form>";

    # Only true on a later request, after the form above has been submitted
    if (param()) {
        my $captcha = param('CAPTCHA');
        chomp($captcha);
        $mech->submit_form(
            form_number => 1,
            fields      => { 'CAPTCHA' => $captcha },
        );
        $page = $mech->content;
        if ($page =~ /Thank you/) {
            print "Success!\n";
        }
        else {
            print "Error!\n";
        }
    }
}
else {
    print "Error!\n";
}
What I'm trying to figure out is how I can get the script to wait for the input and then continue with the rest of the forms.

Replies are listed 'Best First'.
Re: CGI script and captcha input
by jhourcle (Prior) on Jun 02, 2007 at 03:21 UTC

    Although you're proving that you're a human by doing this, many CAPTCHAs exist because the site owners are intentionally trying to make it more difficult to run large batches of jobs against their system. I can only assume you wouldn't be going through this effort for submitting a single job, and so I don't feel it would be a good idea to help you in your endeavor.

    I know that CAPTCHAs can be defeated in a number of ways, but their existence can be taken as a sign that the site owners would prefer you not to screw with their web form.

Re: CGI script and captcha input
by moritz (Cardinal) on Jun 02, 2007 at 10:46 UTC
    You can't let a CGI script wait for input.

    CGI works this way: you deliver pages to the user, and when they are displayed, the user chooses to fill in forms and send them - or he doesn't.

    To implement what you're trying to do, you need a way to introduce state. That can be achieved with cookies that store session information.

    Take a look at CGI::Session::Tutorial, it should help you.
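
    As a rough illustration of that idea, here is a minimal sketch that splits the work into two requests and uses CGI::Session to remember, between them, which captcha was shown. The image URL, the /tmp session directory, and the next_step helper are placeholders made up for this example, and the WWW::Mechanize calls from the original script are only indicated in comments. Note that the remote site's own cookies would also have to be carried over to the second request, which this sketch does not cover.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use CGI::Session;

# Hypothetical helper: decide what this request should do, based on
# what an earlier request stored and what the user just sent.
sub next_step {
    my ($stored_url, $answer) = @_;
    return 'show_captcha' unless defined $stored_url;                # first visit
    return 'show_captcha' unless defined $answer && length $answer;  # nothing typed yet
    return 'submit_answer';
}

sub main {
    my $q = CGI->new;

    # Loads the session named by the session cookie, or starts a new one.
    # The /tmp directory is an assumption for this sketch.
    my $session = CGI::Session->new(undef, $q, { Directory => '/tmp' })
        or die CGI::Session->errstr;

    my $answer = $q->param('CAPTCHA');
    my $step   = next_step(scalar $session->param('captcha_url'), $answer);

    # header() sends the HTTP header together with the session cookie
    print $session->header(-type => 'text/html');

    if ($step eq 'show_captcha') {
        # Placeholder URL: the real script would take this from
        # $mech->find_image(...)->url
        my $url = $session->param('captcha_url') // 'http://site.com/captcha.png';
        $session->param('captcha_url', $url);

        print "Enter the letters to continue!\n",
              qq{<form method="post">\n},
              qq{<img src="$url"><br>\n},
              qq{<input type="text" name="CAPTCHA"><br>\n},
              qq{<input type="submit" value="Continue...">\n},
              qq{</form>\n};
    }
    else {
        # Second request: the real script would pass $answer on to the
        # remote form here (for example with $mech->submit_form).
        print "Received: ", $q->escapeHTML($answer), "\n";
        $session->delete;    # done, forget the stored state
    }

    $session->flush;         # write (or remove) the session on disk
}

main() unless caller;
```

    The script never waits: the first request prints the form and exits, and the second request, identified by the session cookie, picks up where the first one stopped.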