in reply to Re^2: How To Write A Scraper?
in thread How To Write A Scraper?

Oh, I see... I think.
At the moment, it sounds like you're describing a collection of site-specific parsers, like the Finance::Quote tree, perhaps?
--
jpg

Re^4: How To Write A Scraper?
by Cody Pendant (Prior) on Jul 04, 2005 at 00:46 UTC
    Aha, yes, that looks like the kind of thing.

    They have "Finance::Quote::Yahoo" and "Finance::Quote::Tdwaterhouse" and so on.

    Presumably there's some kind of updating mechanism which only updates the "Tdwaterhouse" part when they change their HTML?

    I will research further, thank you.



    ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
    =~y~b-v~a-z~s; print
Re^4: How To Write A Scraper?
by Cody Pendant (Prior) on Jul 04, 2005 at 01:05 UTC
    That was the kind of thing I was thinking of, yes.

    A top-level scraper which loads, when required, a sub-scraper for a specific area, although finance quotes are of course much more specific in form than "interesting articles from online papers"...
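
To make the "top-level scraper plus sub-scrapers" idea concrete, here is a minimal sketch. It is not how Finance::Quote is implemented; the names My::Scraper, My::Scraper::ExamplePaper and My::Scraper::OtherPaper, the site keys, and the HTML patterns are all hypothetical. The regexes only keep the example short, and a real scraper would more likely use an HTML parser.

```perl
use strict;
use warnings;

# Hypothetical site-specific sub-scrapers. In a real layout each would live
# in its own file (e.g. My/Scraper/ExamplePaper.pm), so that only that file
# needs to change when the site changes its HTML.
package My::Scraper::ExamplePaper {
    sub scrape {
        my ($class, $html) = @_;
        # All knowledge of this site's markup is confined to this module.
        my @titles = $html =~ m{<h2 class="headline">(.*?)</h2>}g;
        return @titles;
    }
}

package My::Scraper::OtherPaper {
    sub scrape {
        my ($class, $html) = @_;
        my @titles = $html =~ m{<a class="story">(.*?)</a>}g;
        return @titles;
    }
}

package My::Scraper {
    # Map a site name to the module that knows how to parse it.
    my %module_for = (
        examplepaper => 'My::Scraper::ExamplePaper',
        otherpaper   => 'My::Scraper::OtherPaper',
    );

    sub scrape {
        my ($class, $site, $html) = @_;
        my $module = $module_for{$site}
            or die "No sub-scraper registered for '$site'\n";

        # Load the sub-scraper only when it is first needed. The can() check
        # lets this single-file demo run, because the packages above are
        # already defined; with one file per module, require does the loading.
        unless ($module->can('scrape')) {
            (my $file = "$module.pm") =~ s{::}{/}g;
            require $file;
        }
        return $module->scrape($html);
    }
}

package main;

my @headlines = My::Scraper->scrape(
    examplepaper => '<h2 class="headline">Alpha</h2><h2 class="headline">Beta</h2>',
);
print "$_\n" for @headlines;
```

The caller only ever talks to the top-level My::Scraper; when one site changes its markup, only the matching sub-module has to be touched.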


