Gurus,
I need your help -- I need to be able to parse a dynamic webpage. For example, if I go to yahoo.com and search for "soup", I get a bunch of results, and the URL in the IE location bar changes to something relevant to my search. Now I need to write code that takes this URL (with the search criteria, etc.), finds all the URLs on the results page, and saves them to a file. I've been trying to get LWP and CGI working, and I'm getting my feet wet with the following code (which should get a URL's title):
#!/usr/bin/perl
use CGI;
use LWP::Simple;
use HTML::TokeParser;
$cgiobject=new CGI;
$cgiobject->use_named_parameters;
print $cgiobject->header;
print $cgiobject->start_html
(-title=>'Page Parser',
-bgcolor=>'white');
print $cgiobject->startform
(-method=>'get',
-action=>'parsepage.pl');
print "URL to Analyze:".$cgiobject->textfield
(-name=>'url',
-size=>'40');
print "<br>".$cgiobject->submit(-value=>'Analyze');
print $cgiobject->endform;
print "<hr>";
#retrieve web page
$fetchURL=$cgiobject->param("url");
unless ($fetchURL)
{$fetchURL="www.yahoo.com"}
$webPage=get($fetchURL);
print <<ENDHTML;
<center><h2>$fetchURL<br>$webpage<br>
has been sliced and diced,
thus revealing:</h2></center>
ENDHTML
&parse_title;
print $cgiobject->end_html;
sub parse_title{
#parse and output page title
$parser=HTML::TokeParser->new(shift||$webPage);
$parser->get_tag("title");
print "<p><h2>Page title</h2> ".
$parser->get_trimmed_text."</p>";
}
But it gives me this error: "Undefined subroutine CGI::use_named_parameters at parsepage.pl line 7". If I comment out line 7, the script doesn't do jack.
Any help/advice would be greatly appreciated.
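For reference, here is a minimal sketch of the stated goal (pull the title and every link out of a results page, then save the links to a file). It is one possible approach under stated assumptions, not a tested fix for the script above. It leaves CGI out entirely, so the `use_named_parameters` call is not needed. The names `page_title`, `extract_links`, and `save_links` are hypothetical helpers made up for illustration. It assumes the HTML::Parser and URI distributions are installed and that the page has already been fetched into a string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;
use URI;

# Hypothetical helper: text of the first <title> element, or undef if none.
sub page_title {
    my ($html) = @_;
    # Pass a REFERENCE to the string. A plain string is treated as a
    # file name to open, not as the document to parse.
    my $parser = HTML::TokeParser->new(\$html) or return;
    return unless $parser->get_tag('title');
    return $parser->get_trimmed_text('/title');
}

# Hypothetical helper: every <a href="..."> target as an absolute URL,
# in document order, with duplicates dropped.
sub extract_links {
    my ($html, $base_url) = @_;
    my $parser = HTML::TokeParser->new(\$html) or return;
    my (@links, %seen);
    while (my $tag = $parser->get_tag('a')) {
        my $href = $tag->[1]{href};          # $tag->[1] is the attribute hash
        next unless defined $href && length $href;
        # Resolve relative links against the URL the page came from.
        my $abs = URI->new_abs($href, $base_url)->as_string;
        push @links, $abs unless $seen{$abs}++;
    }
    return @links;
}

# Hypothetical helper: write one URL per line to $file.
sub save_links {
    my ($file, @links) = @_;
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} "$_\n" for @links;
    close $out or die "Cannot close $file: $!";
    return scalar @links;
}
```

To use it on a live page, fetch the HTML with `get($url)` from LWP::Simple and pass the result to `extract_links($html, $url)`. The URL needs its scheme (`http://www.yahoo.com`, not `www.yahoo.com`), and `get` returns undef on failure, so check the result before parsing.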
Thanks,
NeedPerlWisdomGuy