in reply to Re: collect data from web pages and insert into mysql
in thread collect data from web pages and insert into mysql
Wfsp asked for more info, so I grabbed some screenshots to better explain what I need to do.
First I need to export from a DB a list of persona IDs (pid) to process, each paired with the ID of the last processed sortie (sid).
The script will then read the first pid and sid, load the first sortie list page for that persona, and store the content. It keeps loading and storing pages until it finds an empty page or one containing the last processed sortie.
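A rough sketch of that paging loop, in Python for illustration. The names `collect_sortie_pages`, `fetch_page`, and `page_has_sid` are made up here, and the sketch assumes an "empty page" comes back as blank text; on the real site it may be an HTML page with no sortie rows, so that check would need adjusting.

```python
from typing import Callable, List


def collect_sortie_pages(
    pid: int,
    last_sid: int,
    fetch_page: Callable[[int, int], str],
    page_has_sid: Callable[[str, int], bool],
    max_pages: int = 1000,
) -> List[str]:
    """Store sortie list pages for one persona until a stop condition is hit.

    fetch_page(pid, page_no) and page_has_sid(html, sid) are hypothetical
    callables supplied by the caller; nothing here touches the network.
    """
    pages: List[str] = []
    for page_no in range(1, max_pages + 1):
        html = fetch_page(pid, page_no)
        if not html.strip():
            break  # assumed meaning of "empty page": nothing left to read
        pages.append(html)
        if page_has_sid(html, last_sid):
            break  # this page contains the last processed sortie
    return pages
```

Passing the fetcher in as a parameter keeps the loop testable with canned pages before pointing it at the live site. The `max_pages` cap is only a guard against looping forever if neither stop condition ever fires.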
Image of sortie page.
Now I need to process the stored pages, extract every sid that is new, and build a list of those for processing.
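One possible way to pull the new sids out of the stored pages, sketched in Python. It assumes the sortie links carry the ID as a `sid=<number>` query parameter and that newer sorties always have higher sids than the last processed one. Both are guesses about the site, and `new_sids` is just an illustrative name.

```python
import re
from typing import Iterable, List

# Assumed link format: the sortie ID appears as "sid=<digits>" in the page source.
SID_PATTERN = re.compile(r"\bsid=(\d+)")


def new_sids(pages: Iterable[str], last_sid: int) -> List[int]:
    """Return the sorted, de-duplicated sids that are greater than last_sid."""
    found = set()
    for html in pages:
        for match in SID_PATTERN.finditer(html):
            sid = int(match.group(1))
            if sid > last_sid:  # assumes sids only ever increase
                found.add(sid)
    return sorted(found)
```

A regex is enough for picking IDs out of links. For the detail pages, a proper HTML parser would probably be more robust.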
Next step: grab all the data I need from each sortie detail page in the list (the URL for each page is constructed from the grabbed sids).
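Building the detail URLs from the sid list could look something like this. The base URL and the `sid` parameter name are placeholders, since the real address is only visible in the screenshots.

```python
from typing import Iterable, List
from urllib.parse import urlencode


def detail_urls(base_url: str, sids: Iterable[int]) -> List[str]:
    """Build one detail-page URL per sid (the parameter name 'sid' is assumed)."""
    return [f"{base_url}?{urlencode({'sid': sid})}" for sid in sids]
```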
Image of sortie details page with the data I need. (Link fixed)
Store the data in a way that can be imported into a DB. When the last sid is done, we're also done with the persona, so load the next pid/lproc and start over (saving the highest-numbered sid to update the lastproc stat).
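For the storage step, one option is plain CSV, which MySQL can load with `LOAD DATA INFILE`. This Python sketch writes one row per sortie and returns the highest sid seen, so the last-processed value can be updated afterwards. The function name `export_sorties` and the `kills` column are invented for the example; the real columns depend on what the detail page shows.

```python
import csv
import io
from typing import Dict, Iterable, List, TextIO


def export_sorties(
    rows: Iterable[Dict[str, object]],
    out: TextIO,
    fields: List[str],
    last_sid: int,
) -> int:
    """Write rows as CSV and return the highest sid seen (or last_sid if none)."""
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    highest = last_sid
    for row in rows:
        writer.writerow(row)
        highest = max(highest, int(row["sid"]))
    return highest


if __name__ == "__main__":
    # Demo with made-up data; "kills" is a hypothetical detail field.
    buf = io.StringIO()
    demo = [{"pid": 7, "sid": 11, "kills": 2}, {"pid": 7, "sid": 12, "kills": 0}]
    print(export_sorties(demo, buf, ["pid", "sid", "kills"], last_sid=10))
    print(buf.getvalue(), end="")
```

Returning the highest sid rather than the last one written means the order in which the detail pages were processed doesn't matter.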