in reply to scraping from HTTP page to MySql table

To know when to re-check the site for changes, you'd do better to ask its webmaster at what hours she updates the site. You could even suggest that she publish the changes on a newsfeed site like syndic8 * (SourceForge has one too).

Then, to be notified of that news (the changes), you fetch an XML file in RSS or RDF format that specifies which articles have changed, or simply tells you that you should re-check the site.
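To make the idea concrete, here is a rough sketch of reading such a feed and reporting only the entries not seen before. It is written in Python with just the standard library, and it assumes a plain RSS 2.0 layout (`item` elements with `title`, `link` and `guid`); an RDF (RSS 1.0) feed uses XML namespaces and would need a slightly different lookup. The feed text, the URLs and the `new_items` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; a real one would be downloaded from the site.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example site changes</title>
    <item>
      <title>Timetable updated</title>
      <link>http://example.com/timetable</link>
      <guid>http://example.com/timetable#2004-05-03</guid>
    </item>
    <item>
      <title>Prices page updated</title>
      <link>http://example.com/prices</link>
      <guid>http://example.com/prices#2004-05-04</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return (title, link) for each item whose guid is not yet in seen_guids.

    seen_guids is updated in place, so a second call with the same feed
    reports nothing new.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when the feed carries no guid element.
        guid = item.findtext("guid") or item.findtext("link")
        if guid not in seen_guids:
            fresh.append((item.findtext("title"), item.findtext("link")))
            seen_guids.add(guid)
    return fresh


seen = set()
for title, link in new_items(SAMPLE_FEED, seen):
    print(title, link)
```

Keeping the set of seen guids between runs (in a file or a database table) is what turns this into a change detector: only the pages named in fresh items need to be scraped again.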

There are also Perl modules that extract the RSS information from those files and can even download them at a specified frequency: see RSS at CPAN **.

(*) http://www.syndic8.com/

(**) http://search.cpan.org/search?mode=dist&query=RSS

{\('v')/}
_`(___)' __________________________

Re: ask the site's webmaster
by Agyeya (Hermit) on May 04, 2004 at 04:49 UTC
    Hi,

    The site that I wish to monitor is a dynamic site. It may have details that are subject to random change. For example, consider the seat status on a train or bus, or the appointment list of a doctor. On the site the list will be in the form of an Excel table with the fields Patient ID, Appointment type, Appointment date, and Appointment time.

    Now suppose that a patient wants an appointment. Instead of putting him at the end of the queue, we can check the appointment list for any random cancellations and put the patient in that slot. (This is just an example, as obviously the next patient in the queue should be advanced.) But considering how people have divided their own time into slots, the free time of the patient should match that of the vacancy in the appointment list.
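
The matching step described above could be sketched roughly as follows. This is only an illustration in Python: the function name `first_matching_slot`, the representation of slots as (start, end) pairs, and the sample times are all assumptions, not anything the site actually provides.

```python
from datetime import datetime


def first_matching_slot(cancelled_slots, free_windows):
    """Return the earliest cancelled slot that fits wholly inside one of
    the patient's free windows, or None when nothing fits.

    Both arguments are lists of (start, end) datetime pairs.
    """
    for slot_start, slot_end in sorted(cancelled_slots):
        for free_start, free_end in free_windows:
            if free_start <= slot_start and slot_end <= free_end:
                return (slot_start, slot_end)
    return None


# Hypothetical data: two cancellations scraped from the appointment list
# and one window in which the patient is free.
cancelled = [
    (datetime(2004, 5, 5, 15, 0), datetime(2004, 5, 5, 15, 30)),
    (datetime(2004, 5, 5, 10, 0), datetime(2004, 5, 5, 10, 30)),
]
patient_free = [(datetime(2004, 5, 5, 14, 0), datetime(2004, 5, 5, 17, 0))]

match = first_matching_slot(cancelled, patient_free)
print(match)
```

Here the 10:00 cancellation falls outside the patient's free afternoon, so only the 15:00 slot qualifies. The scraping itself would feed the `cancelled` list each time the page is re-checked.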