in reply to Re: Loading A Site Into An Array
in thread Loading A Site Into An Array

To explain my situation better, here are some more details. I would like to write a script that works like anonymizer.com: it loads a whole website into an @array, and I then plan to replace every <a href="www.yahoo.com"> tag with <a href="http://www..com/route.cgi?www.yahoo.com"> so that the links go through the script first. The website it loads could be any website on the internet. Any help would be appreciated, and the help already given is greatly appreciated as well.

RE: RE: Re: Loading A Site Into An Array
by btrott (Parson) on Mar 10, 2000 at 03:20 UTC
    You might want to take a look at Randal Schwartz's anonymous proxy server (created for one of his WebTechniques columns).

    It doesn't try to do what you're doing (replacing links to go through a CGI script), but rather uses your browser's built-in ability to use a proxy server.

    Anyway, though, for what you asked about, take a look at HTML::LinkExtor (a subclass of HTML::Parser).

    perldoc HTML::LinkExtor
    perldoc LWP::UserAgent
    You can use LWP to fetch the web page, then extract the links, then replace each link by a modified version of itself that routes the user through your program.
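
    The fetch, extract, replace steps could be sketched roughly as follows. This is only a minimal sketch, not a finished anonymizer: the router address http://example.com/route.cgi and the sub names fetch_page, extract_links and rewrite_links are made up for illustration, and URL-escaping the target (rather than appending it raw as in the question) is my own assumption so that targets containing ? or & survive the trip. It only rewrites <a href="..."> values that appear literally and quoted in the page, and it leaves non-HTTP links such as mailto: alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;
use URI;
use URI::Escape qw(uri_escape);

# Hypothetical address of the routing script; substitute your own.
my $ROUTER = 'http://example.com/route.cgi';

# Fetch a page and return its HTML, or die with the HTTP status line.
sub fetch_page {
    my ($url) = @_;
    my $ua  = LWP::UserAgent->new(timeout => 30);
    my $res = $ua->get($url);
    die "Could not fetch $url: " . $res->status_line . "\n"
        unless $res->is_success;
    return $res->decoded_content;
}

# Return the unique href values of all <a> tags in the HTML.
# No base URL is given to the parser, so links are not absolutized here.
sub extract_links {
    my ($html) = @_;
    my %seen;
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        return unless $tag eq 'a' && defined $attr{href};
        $seen{ $attr{href} } = 1;
    });
    $parser->parse($html);
    $parser->eof;
    return sort keys %seen;
}

# Replace each http/https link with one that points at the router,
# passing the absolute target URL as the (escaped) query string.
sub rewrite_links {
    my ($html, $base) = @_;
    for my $href (extract_links($html)) {
        my $abs    = URI->new_abs($href, $base);
        my $scheme = $abs->scheme;
        next unless defined $scheme && $scheme =~ /^https?$/i;
        my $routed = $ROUTER . '?' . uri_escape($abs->as_string);
        # Only touch quoted href values inside <a ...> tags.
        $html =~ s/(<a\b[^>]*?\bhref\s*=\s*)(["'])\Q$href\E\2/$1$2$routed$2/gi;
    }
    return $html;
}

# Usage: perl route_links.pl http://www.yahoo.com/
if (@ARGV) {
    my $url = shift @ARGV;
    print rewrite_links(fetch_page($url), $url);
}
```

    If you really do want the page in an @array, splitting the result of fetch_page on newlines would give you one; for rewriting links, though, it seems easier to keep the page as a single string, since a tag can span more than one line.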