Hey, sorry, I'm a complete newbie. I'm trying to write a scraper to collect some data for the dataset in my economics project. To do this I need to change my IP address every once in a while (which I'm not really sure how to do). So far, when I run the code it reports "Can't locate Net/IP.pm in @INC (@INC contains: C:/Perl64/site/lib C:/Perl64/lib.) at..."
The following is my code.
##########################################################
#use WWW::Mechanize; #CRAIGS
use LWP::Simple; #use this until "mechanize" works properly
use Net::IP;

print "Can you see this";

#$dir = "J:\Halibalu"; #put your directory here
$dir = "C:\\workspaceP";
$out = "$dir\\output.csv";
$site = "http://zipinfo.com/cgi-local/zipsrch.exe?cnty=cnty&ll=ll&zip="; #put your web site here
$zip_in = "$dir\\zip.csv";

#my $mech = WWW::Mechanize->new(); #CRAIGS
#$mech->agent( 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-us) AppleWebKit/533.17.8 (KHTML, like Gecko) Version/5.0.1 Safari/533.17.8' ); #CRAIGS

open OUT, ">$out";
close OUT;
open IN, "<$zip_in";

my $a = 10;
my $b = 0;
my $c = 0;
my $d = 0;
$ip = new Net::IP ("$a.$b.$c.$d"); #double quotes so the variables interpolate
my $count = 1;

foreach $Z (<IN>) {
    if ($count<30) {
        chomp($Z); #MINE
        print "$Z\n";
        #$mech->get($site); #CRAIGS
        my $page = get("$site${Z}&Go=Go");
        #if($dist = $mech->submit_form(form_number=>1, fields=>{'field name'=>$Z})) #search for the forms and figure out which one you need, then find the names of the fields
        if ($page =~ /Longitude<BR>(.*?)<\/font><\/td><\/tr><\/table>/s) {
            my $info = $1;
            $info =~ s/<td align=center>/, /g;
            $info =~ s/\(West\)//g;
            $info =~ s/<.*?>//g;
            print "${info}\n";
            #if ($dist->decoded_content() =~ /find information here/s) {
            open OUT, ">>$out";
            #print OUT "$Z, $1\n";
            print OUT "$info\n"; #MINE
            close OUT;
            #}
        }
        $count++;
    }
    else {
        #sleep(60*60*24); #sleep timer
        $count=1;
        if ($d<255){
            $d++;
        }
        else{
            $d=1;
            $c++;
        }
        $ip = new Net::IP ("$a.$b.$c.$d");
    }
}
#Take everything in between "Longitude<BR>" and "</font></td></tr></table>" (these are verified unique)
#make each "<td align=center>" into a comma (",")
#Scrap "(West)", "</th>", "</tr>", "<tr>", "</font>"
#########################################################
I'd be really grateful to anyone who can lend their insight here.
~Andrew
In reply to Rotating IP Addresses for Scraping by jandrewc