in reply to Sucking Data off a Web Page
~~jeffa~~ mojotoad wrote a really great scraping module, HTML::TableExtract, which easily scrapes an HTML table into an array of arrays; from there you can write the rows out as a CSV file or stuff them into DBI directly. For example, the following code extracts all rows from "the one table" on the page:
    use strict;
    use warnings;
    use HTML::TableExtract;

    my $te = HTML::TableExtract->new();
    $te->parse($html);    # $html holds the page's HTML source
    foreach my $row ($te->rows) {
        print join(',', @$row), "\n";
    }
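As a rough sketch of the DBI route (the SQLite DSN, the table name `scraped`, and the three-column layout here are illustrative assumptions, not anything from the page you're scraping):

    use DBI;

    # Hypothetical target: a SQLite file containing a pre-created
    # three-column table named "scraped".
    my $dbh = DBI->connect('dbi:SQLite:dbname=scraped.db', '', '',
                           { RaiseError => 1 });
    my $sth = $dbh->prepare('INSERT INTO scraped VALUES (?, ?, ?)');
    $sth->execute(@$_) for $te->rows;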
The only problem with your table is that it is organized in rows rather than columns, so you will have to flip (transpose) the array of arrays before writing it out.
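Flipping it is just a short loop; here is a minimal sketch, assuming the table is rectangular (every row has the same number of cells):

    # Transpose: rows become columns and vice versa.
    my @rows = $te->rows;
    my @flipped;
    for my $r (0 .. $#rows) {
        for my $c (0 .. $#{ $rows[$r] }) {
            $flipped[$c][$r] = $rows[$r][$c];
        }
    }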
Update: I realized that it was mojotoad, not jeffa, who wrote HTML::TableExtract.