PerlMonks  

web scrape using API and the regular web scrape

by trample666 (Initiate)
on Mar 02, 2010 at 13:23 UTC

trample666 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm a newbie...

Please forgive my ignorance.

I would like to know the advantages and disadvantages of web scraping through an API compared with regular web scraping.

Replies are listed 'Best First'.
Re: web scrape using API and the regular web scrape
by Corion (Patriarch) on Mar 02, 2010 at 13:33 UTC

    If the service you want to scrape provides an API, using that API has the benefit of giving you clean data and shielding you from a redesign of the website or a change in language or layout.

    Using an API often also means that you need to register with the website, and there are often limits on how frequently you may use the service.
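    To illustrate the usage limits mentioned above, here is a minimal Python sketch of a client-side throttle that spaces out calls to an API. The class name `Throttle` and the two-second interval are assumptions made for this example; a real service documents its own limits, and this is not taken from any particular API.

```python
import time


class Throttle:
    """Allow at most one call every `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None   # time of the previous call, None before the first one

    def wait(self):
        """Block until the next call is allowed and return the seconds slept."""
        delay = 0.0
        if self._last_call is not None:
            elapsed = self._clock() - self._last_call
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last_call = self._clock()
        return delay


# Assumed limit for illustration: one request every 2 seconds.
throttle = Throttle(2.0)
```

    Calling `throttle.wait()` before each request keeps the client under the assumed limit; the actual interval has to come from the service's terms.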

Re: web scrape using API and the regular web scrape
by derby (Abbot) on Mar 02, 2010 at 13:42 UTC

    ++Corion's answer but I'm not sure that's what the OP was talking about. If you're using the API of a website, then I would not call it a 'scrape.' trample666, were you asking about a site's API or about using a module such as WWW::Mechanize? The way you phrased the question leaves a lot open to interpretation.

    -derby

      I mean the difference between scraping a website using, for example, APIs provided by Google and scraping a website without using APIs (maybe using Mech, POST or GET, like you said).

      Also, does the website that I am scraping need to be API-enabled (I mean, does it also have to provide API services) when I use APIs to scrape?

        Can you translate that into English please?
Re: web scrape using API and the regular web scrape
by amir_e_a (Hermit) on Mar 02, 2010 at 16:44 UTC

    Writing a scraping program is harder in the first place. Besides, if you use the scraping method, it may stop working when the design of the website changes, unless by pure luck it keeps working.

    So if you have the option to use an API, do use it, unless you actually see a problem with it.

    Sometimes the website doesn't provide an API, though, and then you don't have a choice.
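    The fragility described above can be sketched in a few lines of Python. Everything here is invented for illustration: the JSON body, the two HTML snippets, and the names `PriceScraper`, `scrape_price`, and `api_price` are assumptions, not any real site's markup or API. The scraper is tied to one specific tag and class, so it silently finds nothing once the page is redesigned, while the API reader only depends on the field name.

```python
import json
from html.parser import HTMLParser

# Hypothetical data: the same price exposed by an API and by two page designs.
API_RESPONSE = '{"title": "Example item", "price": 9.99}'
OLD_PAGE = '<div><span class="price">9.99</span></div>'
NEW_PAGE = '<div><b data-role="cost">9.99</b></div>'  # after a redesign


class PriceScraper(HTMLParser):
    """Collect the text of the first <span class="price"> element."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.price = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        if self._in_price and self.price is None:
            self.price = float(data)


def scrape_price(html):
    """Return the scraped price, or None if the expected markup is missing."""
    parser = PriceScraper()
    parser.feed(html)
    parser.close()
    return parser.price


def api_price(body):
    """Return the price from a JSON API response."""
    return json.loads(body)["price"]
```

    With these inputs, `scrape_price(OLD_PAGE)` and `api_price(API_RESPONSE)` both return the price, while `scrape_price(NEW_PAGE)` returns `None` because the markup the scraper relied on is gone.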
