If I want to copy some code from Perlmonks, I find that I cannot cut and paste directly from the Web page without losing formatting. What I usually do is view source, copy the code to a file, and run a script I wrote that converts that HTML into proper Perl code. What I would prefer, however, is to have a "Download this code" link after code snippets that are properly posted with <CODE></CODE> tags.
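
The conversion script itself is not shown here. As a rough illustration only, a minimal sketch of that kind of conversion might look like the following. The helper name `html_to_code` is hypothetical, and the sketch assumes the snippet contains only simple wrapping tags and the handful of entities listed in the comments.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# html_to_code() is a hypothetical helper, not the script mentioned above.
# Assumption: the snippet contains only simple tags and the entities handled here.
sub html_to_code {
    my ($html) = @_;

    # Drop the wrapping markup (<PRE>, <TT>, <font ...>, and so on).
    $html =~ s/<[^>]+>//g;

    # Decode the entities that commonly appear in posted code.
    $html =~ s/&lt;/</g;
    $html =~ s/&gt;/>/g;
    $html =~ s/&quot;/"/g;
    $html =~ s/&#(\d+);/chr($1)/ge;    # numeric entities such as &#91; and &#93;

    # Decode &amp; last so that "&amp;lt;" becomes the text "&lt;", not "<".
    $html =~ s/&amp;/&/g;

    return $html;
}

# Example: one line of page source as it might be copied from "view source".
my $source = '<PRE><TT><font size="-1">print $x-&gt;&#91;0&#93; if $a &lt; $b &amp;&amp; $c;</font></TT></PRE>';
print html_to_code($source), "\n";
```

Where the HTML::Entities module is installed, its `decode_entities` function would cover far more entities than this hand-rolled list.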

Since this feature is not available (or if it is, I'm not aware of it), I thought it would make a nice project to write a script that would do this for me. I don't know much about proxy servers or Web automation, so this is a learning experience for me.

The following is a first stab at the code (a kind of "proof of concept"). All it is supposed to do is display a Web page with a (currently) non-functional download link after each CODE posting, with all HREF links pointed back through this script. The problem lies in the regex and the while loop that contains it. When I run the code, it simply hangs. When I step through it in a debugger, it seems to identify matches in a random, non-sequential order, so the while loop never ends.

#!/usr/bin/perl -w
use strict;
use CGI;
use LWP::Simple;

my $query = new CGI;
my $basename = 'http://www.perlmonks.org/';
my $script = 'http://www.someserver.com/path/to/script.cgi';

# I track the actual URL as a hidden field in the HTML
my $url = defined $query->param('url') ? $query->param('url') : $basename;

# Default to $basename if no $url exists
my $content = get (defined $url ? $url : $basename);

# Add a hidden field with actual URL after <BODY> tag
$content =~ s!(<BODY[^>]*>)!$1<INPUT TYPE="hidden" NAME="basename" VALUE="$url">!;

# Have absolute paths go through this script
$content =~ s!href="$basename!href="$script?url=$basename!gi;

# Have relative paths go through this script
$content =~ s!href\s*=\s*"/!href="$script?url=$basename!gi;

# In the following regex, note the following:
# Code tags are translated as
# <PRE><TT><font size="-1">...</font></TT></PRE>
#
# <font size=...> and </font> are optional. This is turned off if we use "Large font code"
# Quotes around -1 in the font tag are optional. They don't always exist.
# I discovered that in examining source for "Death to Dot Star!"
my $code_regex = '<PRE><TT>(?:<font size="?-1"?>)?([^<]+)(?:</font>)?</TT></PRE>';

# These will be used to create the download link
my $href1 = '<P><A HREF="' . $script . '?process=download&code=';
my $href2 = '&url=' . $url . '">Download this code</A><P>';

my $i = 0;
while ($content =~ m!($code_regex)!go) {
    my $match = $1;
    $content =~ s!$match!$match$href1$i$href2!;
    $i++;
}

print $query->header;
print $content;

I know this is probably something ridiculously simple that I have missed, but I am pulling my hair out over this. Any help would be appreciated.

Cheers,
Ovid

Incidentally, some of the regexes and code above work only because of the layout of Perlmonks. This should not be viewed as any sort of general-purpose script.
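
One possible reading of the hang, offered as a guess rather than a confirmed diagnosis: perlop notes that modifying the target string resets the search position of m//g, so the s/// inside the while loop would send the next match back to the start of $content, and the first code block would be found again on every pass. In addition, $match is interpolated as a pattern without \Q...\E, so any regex metacharacters in the matched code are treated as pattern syntax. A sketch that sidesteps both issues by doing the work in a single s///ge pass is shown below. The helper name `add_download_links` is hypothetical; the pattern and link format are copied from the code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as in the post, precompiled once with qr// (no /o needed).
my $code_regex = qr{<PRE><TT>(?:<font size="?-1"?>)?[^<]+(?:</font>)?</TT></PRE>};

# add_download_links() is a hypothetical helper name.
# It appends a numbered download link after each code block in one pass,
# so the string is never modified while a separate m//g loop is scanning it.
sub add_download_links {
    my ($content, $script, $url) = @_;
    my $i = 0;

    $content =~ s{($code_regex)}{
        my $link = qq{<P><A HREF="$script?process=download&code=$i&url=$url">Download this code</A><P>};
        $i++;
        $1 . $link;    # the matched block, followed by its link
    }ge;

    return $content;
}

# Example with two code blocks, one with the optional font tag and one without.
my $page = '<PRE><TT><font size="-1">print 1;</font></TT></PRE>'
         . '<P>text</P>'
         . '<PRE><TT>print 2;</TT></PRE>';

print add_download_links(
    $page,
    'http://www.someserver.com/path/to/script.cgi',
    'http://www.perlmonks.org/',
), "\n";
```

Because the replacement side is evaluated as code under /e, the matched text is never reused as a pattern, and the counter advances once per block.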


In reply to Perlmonks Code Proxy by Ovid
