ScottJohn has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perl Monks,

I was trying to scrap data from the following website: http://web1.ncaa.org/stats/StatsSrv/careersearch and am getting the following message on 2 of 3 selection fields: "Input 'field x' not found..."

I am confused why it is apparently working for 1 field, but not the other 2. (I didn't paste the page source data since it was probably too large.) The site does navigate using Javascript. I hear that Java-based can be difficult to scrape using Perl. Should I try some other method?

My code is below:

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new();
my $outfile = "testNCAAdata.txt";
open(OUTFILE, ">$outfile");
my $url = "http://web1.ncaa.org/stats/StatsSrv/careersearch";
$mech->get($url);
$mech->select('searchOrg','328');
$mech->select('academicYear','2009');
$mech->select('searchSport','MBB');
$mech->click();
$mech->click('submit',[0,1]);
$mech->get($url);
my $output_page = $mech->content();
print OUTFILE "$output_page";
close(OUTFILE);

I really appreciate any help you guys can offer. Thanks

Replies are listed 'Best First'.
Re: WWW::Mechanize "Input not found"
by Anonymous Monk on Jan 26, 2012 at 02:21 UTC

    I was trying to scrap data from the following website

    Um, scrap means throw away, recycle, chop into little pieces, melt down

    The site does navigate using Javascript. I hear that Java-based can be difficult to scrape using Perl. Should I try some other method?

    For the solution to every scraping problem, see Re^5: can't get WWW::Mechanize to sign in on JustAnswer or Web Testing with HTTP::Recorder or WWW::Mechanize::Firefox

    Running mech-dump http://web1.ncaa.org/stats/StatsSrv/careersearch will show you the forms in the noscript version of the HTML, and indeed, one of the fields you try to populate doesn't exist in the HTML-only (no JavaScript) page.
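
The same check can be sketched from inside Perl by asking the parsed form whether each field exists before calling select. This is only an illustration, assuming HTML::Form (which WWW::Mechanize already depends on) is installed. The HTML string below is a hypothetical stand-in, not the real NCAA page, so which fields are actually missing there is not shown here.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the static (no-JavaScript) HTML.
# The real page's markup will differ; with WWW::Mechanize you would
# use $mech->content() and $mech->uri() after $mech->get($url) instead.
my $html = join "\n",
    '<form action="/stats/StatsSrv/careersearch" method="post">',
    '  <select name="searchOrg"><option value="328">Example</option></select>',
    '  <input type="submit" name="submit" value="Search">',
    '</form>';

# Parse the document; in list context this returns every form found.
my ($form) = HTML::Form->parse($html, 'http://web1.ncaa.org/');

# Report which of the fields from the question exist in the static HTML.
for my $name (qw(searchOrg academicYear searchSport)) {
    my $input = $form->find_input($name);   # undef when the field is absent
    if ($input) {
        printf "%-14s found (type %s)\n", $name, $input->type;
    }
    else {
        printf "%-14s not in the static HTML\n", $name;
    }
}
```

Any field reported as missing is presumably one the page adds with JavaScript, which is why select cannot find it.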