search.cpan.org is an excellent place to look for things such as this.
I found Apache::ReverseProxy there, which seems like it might do what you need.
It requires mod_perl, though. Since you mention you can't install mod_proxy, I don't know whether installing mod_perl is an option for you.
- Matt Riffle
I think you're confusing "repackaging content" (as you seem to want) with "providing HTTP proxy services" (as mod_proxy does). The latter requires a change
to the behavior of the client, which, knowing that it wants site A, still asks site B to provide it.
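To make the distinction concrete, here is a rough sketch (in Python, for brevity) of what the client puts on the wire in each case. The host names `site-a.example` and `site-b.example` and the `/fetch?url=` gateway path are made up for illustration; nothing here is taken from mod_proxy or from any particular script.

```python
from urllib.parse import quote, urlsplit


def proxy_request(target_url, proxy_host):
    """Request from a client that has been configured to use an HTTP proxy.

    The client connects to proxy_host, but the request line carries the
    full URL of the site it really wants, and Host names that site.
    """
    target_host = urlsplit(target_url).netloc
    request = f"GET {target_url} HTTP/1.1\r\nHost: {target_host}\r\n\r\n"
    return proxy_host, request


def gateway_request(target_url, gateway_host):
    """Request from an unmodified client hitting a content-repackaging script.

    The client thinks it is simply browsing gateway_host; the wanted URL
    travels as an ordinary query parameter (hypothetical /fetch?url= path).
    """
    path = "/fetch?url=" + quote(target_url, safe="")
    request = f"GET {path} HTTP/1.1\r\nHost: {gateway_host}\r\n\r\n"
    return gateway_host, request


if __name__ == "__main__":
    wanted = "http://site-a.example/page.html"
    for build in (proxy_request, gateway_request):
        connect_to, request = build(wanted, "site-b.example")
        print(f"connect to {connect_to}:")
        print(request)
```

In the first case the browser has to be told about site B. In the second it never learns that site A exists, which is why every link in the returned page must point back at site B.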
The "repackaging content" strategy is a difficult problem, because you have to
rewrite all the URLs of the passed-through content, in whatever form they
appear. Otherwise, the browser will end up fetching some stuff directly,
possibly confusing everything. For example, URLs in A-HREF elements obviously
need rewriting, but did you also consider the Location header for redirects, or
cookie domains, or image maps, or even the URLs constructed by JavaScript or
Java?
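As a minimal sketch of the easiest case only, the following Python rewrites the href of A elements so they point back at the repackaging script. The `GATEWAY` URL and the `rewrite_links` helper are hypothetical names for illustration, and everything else (image sources, forms, cookies, script-built URLs) is deliberately passed through untouched, which is exactly the gap described above.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import quote, urljoin

# Hypothetical address of the repackaging script on "site B".
GATEWAY = "http://site-b.example/fetch?url="


class HrefRewriter(HTMLParser):
    """Re-emit HTML, changing only the href attribute of <a> start tags."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=False)
        self.base_url = base_url
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            # Everything that is not an anchor is copied through verbatim.
            self.out.append(self.get_starttag_text())
            return
        parts = []
        for name, value in attrs:
            if name == "href" and value and not value.startswith("#"):
                absolute = urljoin(self.base_url, value)
                # Leave mailto:, javascript: and similar schemes alone.
                if absolute.startswith(("http://", "https://")):
                    value = GATEWAY + quote(absolute, safe="")
            if value is None:
                parts.append(name)
            else:
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append(("<a " + " ".join(parts) + ">") if parts else "<a>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite_links(html_text, base_url):
    """Return html_text with <a href> values routed through GATEWAY."""
    parser = HrefRewriter(base_url)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    page = '<p><a href="../about.html">About</a> <img src="logo.png"></p>'
    print(rewrite_links(page, "http://site-a.example/dir/index.html"))
```

Note that the `img` tag in the sample comes out unchanged, so a browser would resolve `logo.png` against site B and most likely get a 404. Each further kind of URL needs its own handling.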
It's a difficult problem. I hope you gain enough to recoup the investment
in figuring out how to do it. I hope you're also considering the ethical, moral,
and legal issues of branding someone else's content as your own.
For a simple start, handling only the A-HREF and Location rewrites,
see my column on a poor-man's CGI "proxy".
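This is not the code from that column, but the Location half of the job can be sketched roughly as follows in Python. The `GATEWAY` URL and the `rewrite_location` function are assumptions made up for this example; the idea is simply that a redirect issued by site A must be turned into a redirect back through the script, or the browser will leave for site A directly.

```python
from urllib.parse import quote, urljoin

# Hypothetical address of the repackaging script on "site B".
GATEWAY = "http://site-b.example/fetch?url="


def rewrite_location(headers, fetched_url):
    """Return a copy of the (name, value) header pairs from the origin
    response, with any Location header pointed back at GATEWAY.

    fetched_url is the URL the script requested from the origin; it is
    used to resolve relative redirect targets.
    """
    rewritten = []
    for name, value in headers:
        if name.lower() == "location":
            absolute = urljoin(fetched_url, value.strip())
            value = GATEWAY + quote(absolute, safe="")
        rewritten.append((name, value))
    return rewritten


if __name__ == "__main__":
    origin_headers = [("Content-Type", "text/html"), ("Location", "/login")]
    for name, value in rewrite_location(origin_headers, "http://site-a.example/members/"):
        print(f"{name}: {value}")
```

This only matters if the script's own HTTP client is told not to follow redirects itself; otherwise the script never sees the Location header at all.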
-- Randal L. Schwartz, Perl hacker
The Personal Open Directory script does a very similar job: it reads the pages from dmoz.org, rewrites the URLs, rebrands the pages as necessary, and lets sites such as my own offer the content without having to store several hundred megabytes of data. The code, while a little spaghettified, could serve as a good example of what you want to achieve.