in reply to Re^2: Crawling Relative Links from Webpages
in thread Crawling Relative Links from Webpages

There is only one hard-coded address in the code:

my $mech = WWW::Mechanize->new();
$mech->get("http://dspace.mit.edu/handle/1721.1/53720");

If you want to make that variable, you could pass the starting link on the command line. It will then be available as $ARGV[0], the first element of @ARGV:

my $mech = WWW::Mechanize->new();
warn "Fetching $ARGV[0]\n";
$mech->get($ARGV[0]);

Call it as

perl -w listanand.pl http://google.com
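
If the script might also be run without an argument, one option is to fall back to the original address. This is only a minimal sketch: the helper name start_url and the constant DEFAULT_URL are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Assumption: keep the previously hard-coded address as the fallback.
use constant DEFAULT_URL => 'http://dspace.mit.edu/handle/1721.1/53720';

# Hypothetical helper: return the first non-empty argument, or the default.
sub start_url {
    my ($default, @args) = @_;
    return $args[0] if @args && defined $args[0] && length $args[0];
    return $default;
}

my $url = start_url(DEFAULT_URL, @ARGV);
warn "Fetching $url\n";
# $mech->get($url);   # $mech created with WWW::Mechanize->new() as above
```

With this, both "perl -w listanand.pl" and "perl -w listanand.pl http://google.com" would work, and the warn line shows which address was picked.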

Re^4: Crawling Relative Links from Webpages
by listanand (Sexton) on May 08, 2010 at 15:32 UTC
    Ah yes of course. What was I even saying. I get it now.

    Thank you very much, everyone. This has solved my problem!

    Although I still get a warning "Use of uninitialized value in string eq at crawler.pl line <line where I check for pdf mime type>". Makes me wonder...

    Andy

      I still get a warning "Use of uninitialized value in string eq at crawler.pl ..."

      This line:

      no warnings "uninitialized";

      isn't just for show. :) A path that is a directory, like /, will not have a MIME type, and various other paths will fail to be found too, so the value you compare with eq can be undefined.
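
      If you would rather keep the warning enabled, another option is to test for definedness before the string comparison. This is just a sketch: crawler.pl is not shown in the thread, so the helper name is_pdf and the variable $type are hypothetical stand-ins for whatever your MIME lookup returns.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: true only if a MIME type was found and it is PDF.
      # Checking defined() first means eq never sees undef, so no warning fires.
      sub is_pdf {
          my ($type) = @_;
          return (defined $type && $type eq 'application/pdf') ? 1 : 0;
      }

      # undef stands in for a directory like / or a path that was not found.
      for my $type ('application/pdf', 'text/html', undef) {
          my $label = defined $type ? $type : '(no type)';
          print "$label: ", (is_pdf($type) ? "pdf" : "skip"), "\n";
      }
      ```

      Silencing the warning and guarding with defined both get rid of the message. The guard just keeps the check limited to this one comparison.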

        Oh, I see. I wasn't using no warnings "uninitialized";

        Thanks again, mommy!