perl.j has asked for the wisdom of the Perl Monks concerning the following question:

Wise Monks,

I would like to read from a webpage and put the information into an Excel or Open Office (preferably Open Office) spreadsheet. The only problem is, I don't know how to do this.

I do know how to read from a .txt file and put it into a spreadsheet.

So, my question is: how do I read from a webpage using Perl? Is there a module or algorithm that is commonly used to do this? Thank you in advance for the help.

--perl.j

Re: Webpage to Excel
by Corion (Patriarch) on Aug 06, 2011 at 15:14 UTC
      The reason for posting this node was not to get help with semi-written code, but to learn a technique that may help me do other things in Perl. I did not know Open Office/Excel could do this without Perl (thanks for that info, by the way), but I would still like to learn to do this with Perl for future reference. Thank you for the links. You were (and always are :) a great help.
      --perl.j

        The simplest technique is to fetch the page by URL with LWP::Simple, and extract the table with HTML::TableExtract. If the site requires more elaborate navigation in order to produce the table, you might need WWW::Mechanize. But Mechanize may be overkill if all you need is to fetch a document at a specific URL.

        Once you've fetched the document and parsed out the table, I assume you know how to plug it into Excel, as you mentioned you already know how to insert text into an Excel document.

        Rather than provide an example of using LWP::Simple and HTML::TableExtract myself, I'll refer you to the SYNOPSIS sections of their documentation. Both modules have good documentation and easy-to-follow SYNOPSIS sections in their POD. You can do a lot better by reading the POD than by looking at my cobbled-together example.
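
        For orientation only, a rough sketch of the fetch-then-extract approach might look like the following. It assumes the page is reachable with a plain GET and that you want every table on it; the helper name table_rows and the command-line URL argument are made up for illustration and do not come from either module. Treat the POD as the authority.

```perl
use strict;
use warnings;
use LWP::Simple qw(get);
use HTML::TableExtract;

# Hypothetical helper: return every row of every table found in an
# HTML string. Each row is an array ref of cell contents.
sub table_rows {
    my ($html) = @_;

    # With no constraints (headers, depth, count), all tables match.
    my $te = HTML::TableExtract->new;
    $te->parse($html);
    $te->eof;

    my @rows;
    for my $table ($te->tables) {
        push @rows, $_ for $table->rows;
    }
    return @rows;
}

# The URL is taken from the command line; nothing is fetched without one.
my $url = shift @ARGV;
if (defined $url) {
    my $html = get($url);    # LWP::Simple::get returns undef on failure
    die "Could not fetch $url\n" unless defined $html;

    # Print each row tab-separated; empty cells may come back as undef.
    for my $row (table_rows($html)) {
        print join("\t", map { defined $_ ? $_ : '' } @$row), "\n";
    }
}
```

        Each row arrives as a plain array ref, so it can be handed to whatever spreadsheet-writing code you already have. If only one table is wanted, passing headers => [...] to HTML::TableExtract->new narrows the match to tables containing those column headings.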


        Dave

Re: Webpage to Excel
by Anonymous Monk on Aug 06, 2011 at 14:48 UTC