Grabbing information from web pages...

kiat has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I'm using LWP::Simple to grab content from web pages, and a script then parses that content to customise the output. A link to the source of the parsed information is displayed on the HTML page.

Is this sort of thing commonly done? Are there any legal issues involved?

My personal view is that if the contents are private or available only by subscription, the site owner would make that clear and would make it impossible for anybody else to access the privileged information. Otherwise, it's free no matter how the information is accessed (with the exception of illegally hacking into the site).

What are your thoughts and views on the matter?

I look forward to hearing from you.

Thanks in anticipation :)

I wasn't aware of the legal implications. I will rule out such an option. Thanks for the advice :)

Re: Grabbing information from web pages... by davido (Cardinal) on Feb 11, 2004 at 05:51 UTC
Just because you can check the Camel book out at the local public library doesn't mean that it's legal to photocopy it cover-to-cover and begin selling copies for $5.50 at computer expos. Violations of copyright law are violations whether they're easy to accomplish or hard. Website policies may, in many cases, be easily violated. But the ease with which it is done doesn't mean that the policy is without legal basis, nor that your violation is really just the harmless act of a legitimate opportunist. If I leave my car door unlocked, you might open it while I'm not looking and help yourself to a few of my CDs. Is that any less of a theft than if you had to pop the lock open with a slim-jim to get those CDs? Play it safe by sticking to the site guidelines, and when in doubt, negotiate a deal with the business up front rather than in the courtroom after the fact.

Dave
Re: Grabbing information from web pages... by waswas-fng (Curate) on Feb 11, 2004 at 05:42 UTC
Obviously, depending on what you do with the information and what the originating site's use policy is, it may be legal (for instance, Google's cache). You will be diving into shark-infested waters when you start charging for the scrapes or building them into some sort of product. And as always, even if you are doing something that may be perfectly legal, if someone sues you for it, do you have the resources (time and money) to fight the battle? Better to invest in a lawyer now and get reasonable advice than to try to gather it from PerlMonks. =)

-Waswas
Re: Grabbing information from web pages... by kvale (Monsignor) on Feb 11, 2004 at 05:56 UTC
Many (most?) pictures and pieces of clip art are copyrighted, which means that unless the author says otherwise, you need permission to use them beyond "fair use". Text may be copyrighted as well.

-Mark