Joost is right on this. I did something similar a while back, grabbing newspaper headlines, and LWP::Simple did the trick for me. Of course, at the time I didn't know about HTML::TokeParser, which would have made my job a whole lot easier.
You will want to save a copy of the source for a few days to make sure that the information you're looking for is in the same place every time. What you're going to want to look for is HTML comments. Hopefully the page you're scraping has those around what you want. Then it's just a simple matter of reading until you get to the point you want to parse, parsing it, and you're done.
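A minimal sketch of that approach, assuming the page wraps the interesting part in a pair of HTML comments. The marker strings, the sub names, and the file naming scheme are all made up for illustration; only LWP::Simple's get is a real call, and it is loaded lazily so the extraction sub can be tried offline against a saved copy.

```perl
use strict;
use warnings;

# Return the text between two HTML comment markers, or nothing if either
# marker is missing. The markers are hypothetical: use whatever comments
# the real page puts around the part you want.
sub extract_between_comments {
    my ($html, $start, $end) = @_;
    my $from = index($html, $start);
    return if $from < 0;
    $from += length $start;
    my $to = index($html, $end, $from);
    return if $to < 0;
    return substr($html, $from, $to - $from);
}

# Fetch a page and keep a dated copy on disk, so a few days of copies can
# be compared to check that the markers stay in the same place.
# $url and $dir are supplied by the caller (assumed, not from the post).
sub fetch_and_archive {
    my ($url, $dir) = @_;
    require LWP::Simple;                  # loaded only when fetching
    my $html = LWP::Simple::get($url);    # undef on failure
    return unless defined $html;
    my @t    = localtime;
    my $file = sprintf '%s/page-%04d%02d%02d.html',
        $dir, $t[5] + 1900, $t[4] + 1, $t[3];
    open my $fh, '>:encoding(UTF-8)', $file or die "Cannot write $file: $!";
    print {$fh} $html;
    close $fh;
    return $html;
}
```

If extract_between_comments comes back undefined on one of the saved copies, the markers moved or disappeared that day, which is exactly what saving copies for a few days is meant to catch.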
In addition, if you look here, this node contains a small program I wrote using HTML::TokeParser so you can see what you're going to get as output using that module. That may help you if you go that direction.
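For a rough idea of what working with HTML::TokeParser looks like, here is a short sketch. It is not the program from the node mentioned above; the sub name and the assumption that headlines sit in h2 tags are mine, so adjust the tag to whatever the real page uses.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Collect the text of every <h2> element in an HTML string.
# Assumes headlines are marked up as <h2>; that is a guess, not a fact
# about any particular site.
sub headlines {
    my ($html) = @_;
    # Passing a reference to a string makes the parser read from it
    # directly instead of treating the argument as a file name.
    my $p = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";
    my @found;
    while ($p->get_tag('h2')) {
        # Read text up to the closing tag, with whitespace trimmed.
        push @found, $p->get_trimmed_text('/h2');
    }
    return @found;
}
```

Calling headlines on a saved copy of the page returns a plain list of strings, one per h2 element, which is usually easier to work with than matching the markup by hand.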
Hope that helps!
There is no emoticon for what I'm feeling now.