Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I want help with the following problem:

How can I read data from a web page and store it in a file? I have the following Perl script:

#!/usr/bin/perl
use LWP::Simple;
$url = "http://www.myurl.com/";
$file = "/path/to/savefile.txt";
getstore($url, $file);

But what I want is this: if I have a web page such as yahoo.com or rediff.com (or any other page) open in a web browser, I need to read the data of the page that is open in the browser.

E.g.:
1. If I have the yahoo.com page open in the browser, I need to read that page's data and store it in a file.
2. If I have the rediff.com page open in the browser, I need to read that page's data and store it in a file.
3. If I have any web page open, I need to read that page's data and store it in a file.
When I use

$url = "http://www.myurl.com/";
it is just like hard-coding a particular link, which I don't want.

How can I achieve this?

Please let me know how to solve the above problem.

Thank you!

20030720 Edit by Corion: Added formatting

Replies are listed 'Best First'.
(jeffa) Re: Read web page Data
by jeffa (Bishop) on Jul 20, 2003 at 14:21 UTC
    You must be the one who posted Check for popups in webpage (originally titled "Popup killer"). The problem is that what you want is not going to be easy. You want a Perl script to monitor and intervene in Netscape. Well, while your Perl script can be changed, Netscape is a compiled binary. You could recompile Netscape with changes to the source that feed the requested data to your Perl script, but that's a game I would rather not play. You have already stated that this is for Unix, so OLE control seems right out (and that's another game I would rather not play).

    Now then, the one thing that you have not told us is why you want to do this. Most people leave out the why either because they don't think it is relevant (it is always relevant) or because they want to retain confidentiality. If you are planning on writing some commercial "Popup killer", then good luck. But if you are simply trying to prevent those annoying popups from ... popping up, switch to a browser that gives you that control (I prefer Mozilla).

    UPDATE:
    Maybe Mozilla's XUL will be of help to you.

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: Read web page Data
by Fletch (Bishop) on Jul 20, 2003 at 11:09 UTC

    Erm,

    • pretty much every web browser on the planet provides some form of `save page as...' functionality (which would eliminate needing an external program)
    • those same browsers allow you to cut and paste the URL for the current page so just pass that as an argument to your program (perldoc perlvar, look for @ARGV)
    • not to mention things like wget or curl which already do this (and have things like progress indicators)
      Can you give me some sample code for this?
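A minimal sketch of the @ARGV suggestion above, assuming the URL is copied from the browser's location bar and passed on the command line. The helper name `parse_args` and the default file name `page.html` are hypothetical and do not come from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Hypothetical helper: take the URL and an optional output file from an
# argument list. 'page.html' is an assumed default, not part of the thread.
sub parse_args {
    my @args = @_;
    return unless @args;
    my $url  = $args[0];
    my $file = defined $args[1] ? $args[1] : 'page.html';
    return ($url, $file);
}

sub main {
    my ($url, $file) = parse_args(@ARGV);
    unless (defined $url) {
        warn "Usage: $0 URL [FILE]\n";
        return;
    }

    # getstore() returns the HTTP status code of the request.
    my $status = getstore($url, $file);
    if (is_success($status)) {
        print "Saved $url to $file\n";
    }
    else {
        warn "Could not fetch $url (HTTP status $status)\n";
    }
    return;
}

# Run only when executed as a script, not when loaded from another file.
main() unless caller;
```

Saved as, say, `savepage.pl` (a made-up name), it could be run as `perl savepage.pl http://www.yahoo.com/ yahoo.html`. Nothing is hard-coded, but the URL still has to be supplied by hand each time.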
Re: Read web page Data
by thinker (Parson) on Jul 20, 2003 at 08:26 UTC

    Hi AM,

    When you ask for yahoo.com, what you will get is the default page for that site, maybe index.html, or index.cgi, or whatever. It doesn't really matter what it is called, it is just an html page.

    In which case, simply change $file = "/path/to/savefile.txt" to $file = "/path/to/savefile.html"

    Of course, hardcoding the filename in there is of limited use, unless you want to rename the file after each run of the program.
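One possible way around the hard-coded file name is to derive it from the URL, so that each site lands in its own file. This is only a sketch: `normalize_url` and `filename_for` are hypothetical helpers, and naming the file after the host is an assumption, not something the thread prescribes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Hypothetical helper: accept a bare "yahoo.com" by assuming http://.
sub normalize_url {
    my ($url) = @_;
    return $url =~ m{^[a-z][a-z0-9+.-]*://}i ? $url : "http://$url";
}

# Hypothetical helper: name the output file after the host part of the URL.
sub filename_for {
    my ($url) = @_;
    my ($host) = normalize_url($url) =~ m{^[a-z][a-z0-9+.-]*://([^/:?#]+)}i
        or die "Cannot find a host name in '$url'\n";
    return lc($host) . '.html';
}

sub main {
    # Fetch every URL given on the command line into its own file.
    for my $url (map { normalize_url($_) } @ARGV) {
        my $file   = filename_for($url);
        my $status = getstore($url, $file);
        if (is_success($status)) {
            print "Saved $url to $file\n";
        }
        else {
            warn "Could not fetch $url (HTTP status $status)\n";
        }
    }
    return;
}

# Run only when executed as a script, not when loaded from another file.
main() unless caller;
```

Note that two pages from the same host would overwrite each other under this naming scheme; a real script might fold the path into the file name as well.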

    I hope this is what you were asking :-)

    cheers

    thinker

      Hi Monks, I also used the code given here, and I tried the following code as well, but I did not get a response:

      use strict;
      use warnings;
      use LWP::Simple;
      my $url = "http://www.google.com";
      getprint($url);

      Windows 7 / Perl 5.16.3

      Can anyone please help me extract data from any web page? Thanks.
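When a fetch like this produces no output, it can help to print the response status explicitly. The sketch below swaps in LWP::UserAgent for that purpose. `describe_response` is a hypothetical helper, and the call to `env_proxy` rests on the assumption that a proxy might be involved, which the post does not say.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: turn an HTTP::Response into a one-line summary.
sub describe_response {
    my ($res) = @_;
    return $res->is_success
        ? sprintf('OK: %d bytes', length $res->content)
        : 'FAILED: ' . $res->status_line;
}

sub main {
    unless (@ARGV) {
        warn "Usage: $0 URL\n";
        return;
    }
    my $url = $ARGV[0];

    my $ua = LWP::UserAgent->new(timeout => 10);
    # Assumption: pick up proxy settings from the *_proxy environment
    # variables, in case the machine reaches the web through a proxy.
    $ua->env_proxy;

    my $res = $ua->get($url);
    print describe_response($res), "\n";
    print $res->decoded_content if $res->is_success;
    return;
}

# Run only when executed as a script, not when loaded from another file.
main() unless caller;
```

On failure, the status line (for example a connection or name-resolution error) usually points at the actual cause more directly than an empty screen does.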
      This is what I want to ask: I want to write a Perl script that works as follows. I open Netscape and type a URL. This URL can be anything, such as perlmonks.com or yahoo.com. Now I want my Perl script to read the data for this URL. If I type yahoo.com, the Perl script must be able to read the data of that page. But if I change the URL to rediff.com or any other site, it should read that particular web page's data. That means that as I keep changing the URL in Netscape, the Perl script should continue to read the data for the current page and save it somewhere.

      Normally, to retrieve web page data, I use:

      $url = "http://www.myurl.com/";

      But my problem is that when I change the URL in Netscape, I am unable to change the value of $url to match. Can you suggest how I can achieve this?
        Look at a combined solution using AutoIt and Perl. AutoIt can read your freshly typed URL and call your Perl script with that URL. It might be a bit complicated, but it will certainly work out if you put in the effort.