I'm looking for a good, simple module, quick to get up and running, to scrape some data from a table in a web page.
The URL is basic, with no authentication.
The table looks like:
| City 1 | Cloudy | -5°C |
| City 2 | Cloudy | -10°C |
| City 3 | Light Snow | 1°C |
| City 4 | Fog Depositing Ice | -11°C |
I want to scrape the city names, conditions, and temperatures. I can count on the columns always being in the same order. The HTML source:
<table width="100%" border=1 cellspacing="1" cellpadding="1">
<TR valign="top" BGCOLOR=#FFFFFF>
<td align="top"><a href='/forecast/city.html?1'>City 1</a></td><td nowrap align="top">Cloudy</td><td nowrap align="right">-5°C</td></tr>
<TR valign="top" BGCOLOR=#EEF5EE>
<td align="top"><a href='/forecast/city.html?2'>City 2</a></td><td nowrap align="top">Cloudy</td><td nowrap align="right">-10°C</td></tr>
<TR valign="top" BGCOLOR=#FFFFFF>
<td align="top"><a href='/forecast/city.html?3'>City 3</a></td><td nowrap align="top">Light Snow</td><td nowrap align="right">1°C</td></tr>
<TR valign="top" BGCOLOR=#EEF5EE>
<td align="top"><a href='/forecast/city.html?4'>City 4</a></td><td nowrap align="top">Fog Depositing Ice</td><td nowrap align="right">-11°C</td></tr>
</table>
It's not hard to write a custom parser, but if there's a module out there ideally suited to this kind of thing, that'd be preferable.
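For comparison, here is roughly what the custom-parser route could look like. This is only a minimal sketch under some assumptions: it is written in Python with the standard library's `html.parser` (the post does not name a language), the names `TableRowParser`, `scrape_rows`, and `SAMPLE` are made up for illustration, and the sample markup is a shortened copy of the table with the degree sign written as `&deg;`. It also assumes the page contains only the one table.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &deg; into text.
        super().__init__(convert_charrefs=True)
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TR> and <tr> both arrive as "tr".
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell; links inside the cell are
        # ignored as tags, but their text still lands here.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def scrape_rows(html):
    """Return one tuple of cell strings per table row found in html."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [tuple(row) for row in parser.rows]


# Shortened, hypothetical copy of the page's table markup.
SAMPLE = """
<table width="100%" border=1 cellspacing="1" cellpadding="1">
<TR valign="top" BGCOLOR=#FFFFFF>
<td align="top"><a href='/forecast/city.html?1'>City 1</a></td>
<td nowrap align="top">Cloudy</td><td nowrap align="right">-5&deg;C</td></tr>
<TR valign="top" BGCOLOR=#EEF5EE>
<td align="top"><a href='/forecast/city.html?4'>City 4</a></td>
<td nowrap align="top">Fog Depositing Ice</td><td nowrap align="right">-11&deg;C</td></tr>
</table>
"""

for city, condition, temperature in scrape_rows(SAMPLE):
    print(city, condition, temperature, sep=" | ")
```

Because the columns always come in the same order, each row can be unpacked positionally as city, condition, temperature. A dedicated table-extraction module would likely cope better with nested tables or header rows, which this sketch does not attempt.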
Thanks.
In reply to Best module to scrape tabular data fram web pages? by punch_card_don