in reply to How To Store Data Structures
> Should I create Scraper::Library::Yahoo and export a single sub which scrapes Yahoo, lather, rinse, repeat?

Yes, that. But you should not export the function; instead, it should be called as a (class) method, i.e.:

```perl
use Scraper::Library::Yahoo;
Scraper::Library::Yahoo->scrape();
```

Of course, all the generic bits involved in scraping could (should?) be in the base class.
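To make the base-class idea concrete, here is a minimal sketch. It assumes a base class named `Scraper::Library`, a hypothetical `pattern` method that each subclass overrides, and a `scrape` that takes the HTML as an argument so the example runs without network access; none of these details come from the original question.

```perl
use strict;
use warnings;

# Hypothetical base class: holds the generic scraping logic.
package Scraper::Library;

sub scrape {
    my ( $class, $html ) = @_;

    # Generic part: apply the subclass's pattern and collect every capture.
    my $pattern = $class->pattern;
    my @matches = $html =~ /$pattern/g;
    return @matches;
}

# Site-specific part: subclasses are expected to override this.
sub pattern {
    my $class = shift;
    die "pattern() not implemented by $class\n";
}

# Site-specific subclass: supplies only what differs for this site.
package Scraper::Library::Yahoo;
our @ISA = ('Scraper::Library');

sub pattern { qr/<h2>(.*?)<\/h2>/ }    # illustrative pattern only

package main;

# Called as a class method; nothing is exported.
my @headlines = Scraper::Library::Yahoo->scrape('<h2>One</h2><h2>Two</h2>');
print "$_\n" for @headlines;    # prints "One" and "Two" on separate lines
```

Adding another site then means writing one more small subclass that overrides `pattern`, while `scrape` stays in one place.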
An even better way is to let these "libraries" be strategies of the Scraper class. All the real work is done in the method(s) of the main class (Scraper), which delegates to one of the library classes for the site-specific parts. That could be something as simple as a method that returns some config data:

```perl
package Scraper;

sub new {
    my ( $pkg, $strategy_class ) = @_;
    return bless { strategy_class => $strategy_class }, $pkg;
}

sub scrape {
    my $self = shift;
    # config_data() returns a key/value list, so collect it into a hash
    my %config_data = $self->{'strategy_class'}->config_data();
    # ... proceed to scrape using this config data
}

package Scraper::Yahoo;
# As a strategy of Scraper, this class only needs to implement those methods
# a Scraper will call.

sub config_data {
    return (
        starting_page  => 'www.yahoo.com/foo/',
        some_regex     => qr/foo(.*?)bar/,
        html_tree_spec => [ '_tag', 'div', 'id', 'headlines' ],
    );
}

# . . .

package main;
# pass the strategy class name to the constructor:
my $yahoo_scraper = Scraper->new('Scraper::Yahoo');
```

Of course, depending on your architecture (I don't know how your Scraper really works), it might make as much sense to have individual bits of configuration returned by discrete methods:

```perl
package Scraper;

sub scrape {
    my $self = shift;
    my $starting_page  = $self->{'strategy_class'}->starting_page();
    my $some_regex     = $self->{'strategy_class'}->some_regex();
    my $html_tree_spec = $self->{'strategy_class'}->html_tree_spec();
    # ... proceed to scrape using this config data
}

package Scraper::Yahoo;

sub starting_page  { 'www.yahoo.com/foo/' }
sub some_regex     { qr/foo(.*?)bar/ }
sub html_tree_spec { [ '_tag', 'div', 'id', 'headlines' ] }
```

That's how I usually do it. I'm a big fan of strategy classes. :-)
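For what it's worth, here is a small runnable sketch of why the strategy approach pays off: the same Scraper code works with any strategy class you hand it. `Scraper::Example` and its regex are made up for illustration, and `scrape` takes the HTML as an argument (an assumption, so the example needs no network access).

```perl
use strict;
use warnings;

package Scraper;

sub new {
    my ( $pkg, $strategy_class ) = @_;
    return bless { strategy_class => $strategy_class }, $pkg;
}

# Generic work lives here; only the site-specific regex comes from the strategy.
sub scrape {
    my ( $self, $html ) = @_;
    my $regex    = $self->{strategy_class}->some_regex;
    my @captures = $html =~ /$regex/g;
    return @captures;
}

package Scraper::Yahoo;
sub some_regex { qr/foo(.*?)bar/ }

package Scraper::Example;    # hypothetical second strategy
sub some_regex { qr/<b>(.*?)<\/b>/ }

package main;

# Same Scraper, different strategies: only the class name changes.
for my $strategy ( 'Scraper::Yahoo', 'Scraper::Example' ) {
    my $scraper = Scraper->new($strategy);
    my @found   = $scraper->scrape('foo1bar <b>2</b> foo3bar');
    print "$strategy: @found\n";
}
# prints:
#   Scraper::Yahoo: 1 3
#   Scraper::Example: 2
```

Because the strategy is just a class name stored in the object, supporting a new site does not touch Scraper at all.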
Replies are listed 'Best First'.
Re^2: How To Store Data Structures
by Cody Pendant (Prior) on Jul 20, 2005 at 05:08 UTC

Re^2: How To Store Data Structures
by Cody Pendant (Prior) on Jul 20, 2005 at 07:24 UTC
by jdporter (Paladin) on Jul 20, 2005 at 13:44 UTC