ssnewbie has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have written only basic Perl code before. Can somebody tell me how to extract information from a webpage and save it in a database? Where should I start? Is there good reading material? Should I install something apart from Perl? Thanks,

Replies are listed 'Best First'.
Re: How can I extract all the information on a webpage to my database?
by Fletch (Bishop) on May 09, 2005 at 20:57 UTC

    Perl and LWP (ISBN 0596001789) covers everything you'll need to know to do web scraping.

    Update: Actually, I don't think it covers WWW::Mechanize, so amend that to "covers almost everything".

Re: How can I extract all the information on a webpage to my database?
by mda2 (Hermit) on May 09, 2005 at 21:04 UTC
Re: How can I extract all the information on a webpage to my database?
by davidrw (Prior) on May 09, 2005 at 21:00 UTC
    LWP is probably what you'll want at the core of your scraping. LWP::Simple and WWW::Mechanize are built on LWP and are both very powerful. Depending on what you want to scrape, there may already be a module for it.
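
    A minimal sketch of the LWP::Simple approach mentioned above (the URL is a placeholder, and this assumes LWP is installed from CPAN; the regex is only a crude illustration, not a substitute for a real HTML parser):

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::Simple qw(get);

    # Fetch the page (URL is a placeholder -- substitute your own).
    my $url  = 'http://www.example.com/';
    my $html = get($url)
        or die "Couldn't fetch $url\n";

    # Crude extraction: pull the page title with a regex.
    # For real parsing, use HTML::TreeBuilder or WWW::Mechanize instead.
    my ($title) = $html =~ m{<title>(.*?)</title>}is;
    print "Title: ", (defined $title ? $title : '(none)'), "\n";
    ```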
Re: How can I extract all the information on a webpage to my database?
by Joost (Canon) on May 09, 2005 at 21:16 UTC
Re: How can I extract all the information on a webpage to my database?
by devnul (Monk) on May 09, 2005 at 23:31 UTC
    Just wanted to add an endorsement for WWW::Mechanize.... It is extremely flexible, IMHO. You may want to pair it with HTML::TreeBuilder or HTML::TableExtract for even more fun.

    - dEvNuL
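
    A hedged sketch of how these pieces could fit together to answer the original question -- WWW::Mechanize to fetch, HTML::TableExtract to parse, and DBI to store. The URL, table headers, and database schema here are all assumptions for illustration; adjust them to the actual page and database:

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use WWW::Mechanize;
    use HTML::TableExtract;
    use DBI;

    # Fetch the page (URL is a placeholder).
    my $mech = WWW::Mechanize->new;
    $mech->get('http://www.example.com/data.html');

    # Pull rows from a table with the expected column headers
    # (the header names 'Name' and 'Value' are assumptions).
    my $te = HTML::TableExtract->new( headers => [ 'Name', 'Value' ] );
    $te->parse( $mech->content );

    # Store each row in an SQLite database via DBI
    # (table name and schema are likewise illustrative).
    my $dbh = DBI->connect( 'dbi:SQLite:dbname=scrape.db', '', '',
                            { RaiseError => 1 } );
    $dbh->do('CREATE TABLE IF NOT EXISTS items (name TEXT, value TEXT)');
    my $sth = $dbh->prepare('INSERT INTO items (name, value) VALUES (?, ?)');

    for my $ts ( $te->tables ) {
        for my $row ( $ts->rows ) {
            $sth->execute( $row->[0], $row->[1] );
        }
    }
    $dbh->disconnect;
    ```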