Well, I have done a little research and found some interesting modules like Perl LWP and Mechanize. They look really easy and simple to use. My actual question is how efficient Perl is in terms of memory and CPU utilization compared to Java for this particular task.

Re^3: Perl spider
by LanX (Saint) on Apr 21, 2011 at 21:51 UTC
    I don't have much experience with spiders, but don't you think that the web will be the bottleneck?

    I expect Perl to parse HTML much faster than the server responses come in.
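
One rough way to check this on your own pages is to time the download and the parse separately. The following is only a sketch, assuming LWP::UserAgent and HTML::LinkExtor are installed. The helper names extract_links and time_fetch_and_parse are made up for illustration, and a single page is not a benchmark.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);
use LWP::UserAgent;
use HTML::LinkExtor;

# Hypothetical helper: collect the href of every <a> tag in an HTML string.
sub extract_links {
    my ($html, $base) = @_;
    my @links;
    my $parser = HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            # Stringify, because links are URI objects when a base is given.
            push @links, "$attr{href}" if $tag eq 'a' && defined $attr{href};
        },
        $base,    # relative links are made absolute against this URL
    );
    $parser->parse($html);
    $parser->eof;
    return @links;
}

# Hypothetical helper: fetch one page and report how long the download
# and the parse each took, plus the number of links found.
sub time_fetch_and_parse {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 10);

    my $t0         = time;
    my $res        = $ua->get($url);
    my $fetch_secs = time - $t0;
    die "GET $url failed: ", $res->status_line, "\n" unless $res->is_success;

    my $t1         = time;
    my @links      = extract_links($res->decoded_content, $url);
    my $parse_secs = time - $t1;

    return ($fetch_secs, $parse_secs, scalar @links);
}

# Only touches the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my ($fetch, $parse, $n) = time_fetch_and_parse($url);
    printf "fetch: %.3fs  parse: %.3fs  links: %d\n", $fetch, $parse, $n;
}
```

Run it with a URL as the only argument. If the fetch time dominates, the choice of language matters much less than how many requests you keep in flight.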

    Cheers Rolf

Re^3: Perl spider
by CountZero (Bishop) on Apr 22, 2011 at 19:26 UTC
    Perl has modules like Scrappy (All Powerful Web Harvester, Spider, Scraper fully automated), WWW::Crawler::Lite, WWW::Spyder or Gungho.

    No need to amuse yourself with the low-level stuff.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James