I just looked at the source of the page you provided. You're quite lucky in this case: the webmaster seems to have dumped all relevant information into the body as well as into the meta tags.
If I were you, I'd look into something like
- WWW::Mechanize::GZip to fetch the files, "click" links and fill out forms.
- Regular expressions to get the information line-by-line from the META tags into a hash.
- Perl functions like open, print and close to write a CSV file. You know,
- for each of the required fields, write the properly quoted content and add a semicolon, and don't forget the newline at the end.
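To make the list a bit more concrete, here is a rough sketch of the parse-and-write part. It is only a quick hack under assumptions: the sample HTML, the field names (title, price, color) and the output file name out.csv are made up, and the regex assumes each META line has name before content, both in double quotes. Fetching is left as a comment so the sketch runs offline.

```perl
use strict;
use warnings;

# Fetching is left out here. With WWW::Mechanize::GZip it would look roughly like:
#   my $mech = WWW::Mechanize::GZip->new;
#   $mech->get($url);
#   my $html = $mech->content;

# Hypothetical sample input; the real page's META names will differ.
my $html = <<'HTML';
<html><head>
<meta name="title" content="Blue Widget">
<meta name="price" content="12;50">
</head><body></body></html>
HTML

# Go through the page line by line and collect name/content pairs into a hash.
sub parse_meta {
    my ($page) = @_;
    my %meta;
    for my $line (split /\n/, $page) {
        if ($line =~ /<meta\s+name="([^"]*)"\s+content="([^"]*)"/i) {
            $meta{$1} = $2;
        }
    }
    return %meta;
}

# Quote one field: double any embedded quotes, then wrap in double quotes.
# Missing fields become an empty quoted string.
sub csv_quote {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# Build one semicolon-separated line for the required fields, newline included.
sub csv_line {
    my ($meta, @fields) = @_;
    return join(';', map { csv_quote($meta->{$_}) } @fields) . "\n";
}

my %meta   = parse_meta($html);
my @fields = qw(title price color);    # assumed field list

open my $out, '>', 'out.csv' or die "Cannot open out.csv: $!";
print {$out} csv_line(\%meta, @fields);
close $out or die "Cannot close out.csv: $!";
```

The quoting matters because a value like the sample price contains a semicolon itself. HTML entities inside content are not decoded here, which is one more reason this stays a quick hack.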
I'm being deliberately vague, but following the list should give you a (simple, ugly) quick-hack solution for your problem. It will break if the webmaster removes or changes the meta lines. But if this is a one-time job, it should work.
Of course, if you need something flexible and reliable that will keep working for some years, you should really take a look at real HTML parsers.
Don't use '#ff0000':
use Acme::AutoColor; my $redcolor = RED();
All colors subject to change without notice.