Ovid has asked for the wisdom of the Perl Monks concerning the following question:
Since this feature is not available (or if it is, I'm not aware of it), I thought it would make a nice project to write a script that would do this for me. I don't know much about proxy servers or Web automation, so this is a learning experience for me.
The following is the first stab at the code (kind of a "proof of concept"). All this code is supposed to do is display a Web page with a (currently) non-functional download link after each CODE posting. Also, all HREF links are pointed back to this code. The problem lies in the regex and the while loop that it is in. When I run the code, it simply hangs. While running it through a debugger, it seems to identify matches in a random, non-sequential order, thus not permitting the while loop to end.
I know this is probably something ridiculously simple that I have missed, but I am pulling my hair out over this. Any help would be appreciated.#!/usr/bin/perl -w use strict; use CGI; use LWP::Simple; my $query = new CGI; my $basename = 'http://www.perlmonks.org/'; my $script = 'http://www.someserver.com/path/to/script.cgi'; # I track the actual URL as a hidden field in the HTML my $url = defined $query->param('url') ? $query->param('url') : $ +basename; # Default to $basename if no $url exists my $content = get (defined $url ? $url : $basename); # Add a hidden field with actual URL after <BODY> tag $content =~ s!(<BODY[^>]*>)!$1<INPUT TYPE="hidden" NAME="basename" VAL +UE="$url">!; # Have absolute paths go through this script $content =~ s!href="$basename!href="$script?url=$basename!gi; # Have relative paths go through this script $content =~ s!href\s*=\s*"/!href="$script?url=$basename!gi; # In the following regex, note the following: # Code tags are translated as # <PRE><TT><font size="-1">...</font></TT></PRE> # # <font size=...> and </font> are optional. This is turned off if w +e use "Large font code" # Quotes around -1 in the font tag are optional. They don't always +exist. # I discovered that in examining source for "Death to Dot Star!" my $code_regex = '<PRE><TT>(?:<font size="?-1"?>)?([^<]+)(?:</font>)?< +/TT></PRE>'; # These will be used to create the download link my $href1 = '<P><A HREF="' . $script . '?process=download&code='; my $href2 = '&url=' . $url . '">Download this code</A><P>'; my $i = 0; while ($content =~ m!($code_regex)!go) { my $match = $1; $content =~ s!$match!$match$href1$i$href2!; $i++; } print $query->header; print $content;
Cheers,
Ovid
Incidentally, some of the regexes and code above work only because of the layout of Perlmonks. This should not be viewed as any sort of general purpose script.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Perlmonks Code Proxy
by tye (Sage) on Aug 19, 2000 at 00:40 UTC | |
|
Re: Perlmonks Code Proxy
by Zebu (Novice) on Aug 19, 2000 at 04:23 UTC | |
by Ovid (Cardinal) on Aug 19, 2000 at 07:57 UTC | |
by jplindstrom (Monsignor) on Aug 19, 2000 at 18:52 UTC | |
|
Re: Perlmonks Code Proxy
by Boogman (Scribe) on Aug 19, 2000 at 00:40 UTC | |
by Ovid (Cardinal) on Aug 19, 2000 at 00:46 UTC | |
by Boogman (Scribe) on Aug 19, 2000 at 01:24 UTC | |
by tilly (Archbishop) on Aug 19, 2000 at 01:28 UTC | |
by Boogman (Scribe) on Aug 19, 2000 at 01:49 UTC | |
| |
|
RE: Perlmonks Code Proxy
by Anonymous Monk on Aug 19, 2000 at 01:39 UTC | |
|
Re: Perlmonks Code Proxy
by nate (Monk) on Aug 19, 2000 at 19:51 UTC | |
by merlyn (Sage) on Aug 19, 2000 at 19:58 UTC | |
by vroom (His Eminence) on Aug 19, 2000 at 22:06 UTC | |
by merlyn (Sage) on Aug 19, 2000 at 23:00 UTC |