in reply to Re^6: collect data from web pages and insert into mysql
in thread collect data from web pages and insert into mysql
What that sub (get_next_page) does is check whether there is a link to a next page. If there is, it returns the page number, and that is the page that is processed next. If there isn't a page number it returns undef, and that exits you out of the while ($page){ loop. With hindsight I should have called the sub get_next_page_number because that is what it is doing (it's not loading the page).
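To make the loop logic concrete, here is a rough sketch of how a sub like that can drive the while loop. It is not the code from the earlier post: the %pages hash is a made-up stand-in for the website (so nothing is fetched), and the regex for the "next" link is only an assumption about what the markup might look like.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the site: page number => HTML of that page.
my %pages = (
    1 => '<a href="?page=2">next</a>',
    2 => '<a href="?page=3">next</a>',
    3 => 'last page, no next link',
);

# Returns the number of the next page, or undef if there is no "next" link.
# It only inspects the HTML it is given; it does not load anything.
sub get_next_page_number {
    my ($html) = @_;
    return $html =~ /page=(\d+)">next/ ? $1 : undef;
}

my @visited;
my $page = 1;
while ($page) {
    push @visited, $page;                             # "process" the current page
    $page = get_next_page_number( $pages{$page} );    # undef ends the loop
}

print "visited: @visited\n";
```

Because undef is false, the loop stops on its own once the last page has no next link.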
The sub (get_sids) returns a list of all the sids. I reckon it would be simplest to do that and then decide which ones you want. grep might help with that. A tab-delimited record sounds as though it would do fine.
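Something along these lines is what I have in mind for picking out sids with grep and writing tab-delimited records. Treat it as a sketch only: the @sids values, the "starts with 20" rule and the extra fields are all invented for illustration, and in the real script the list would come from get_sids.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for the list that get_sids returns.
my @sids = ( 1001, 1002, 2001, 2002 );

# Keep only the sids you want; the "starts with 20" test is just an example rule.
my @wanted = grep { /^20/ } @sids;

# Build one tab-delimited record per sid (the other two fields are placeholders).
my @records = map { join( "\t", $_, 'name', 'score' ) } @wanted;

print "$_\n" for @records;
```

A file of records like that can be split on tabs again later or bulk loaded into the database.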
By the way, there are, in this case, three calls to the website. So you have to give it a moment to finish.
Let us know how you get on.
Re^8: collect data from web pages and insert into mysql
by SteinerKD (Acolyte) on Aug 02, 2010 at 14:42 UTC
by wfsp (Abbot) on Aug 02, 2010 at 15:25 UTC
by SteinerKD (Acolyte) on Aug 03, 2010 at 15:13 UTC
by wfsp (Abbot) on Aug 03, 2010 at 15:36 UTC