If you know that the incoming text is HTML data, then there is probably a good way to use HTML::TokeParser::Simple to locate just the pieces of the data that represent usable URLs and that are part of the visible text of the page. This node shows an example of how it's used for a similar sort of editing task.
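A minimal sketch of the token-based approach, assuming the whole page is already in a string and that HTML::TokeParser::Simple is installed from CPAN. The helper name `linkify_text` and the sample markup are made up for illustration, and the URL pattern is only the simple one discussed in this thread, not a general URL matcher.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical helper: wrap bare http:// URLs in <a> tags, but only
# when they occur in text tokens, never inside a tag's attributes.
sub linkify_text {
    my ($html) = @_;

    # Passing a scalar reference makes the parser read from the string.
    my $parser    = HTML::TokeParser::Simple->new(\$html);
    my $out       = '';
    my $in_anchor = 0;    # true while inside an existing <a>...</a>

    while ( my $token = $parser->get_token ) {
        $in_anchor = 1 if $token->is_start_tag('a');
        $in_anchor = 0 if $token->is_end_tag('a');

        my $chunk = $token->as_is;    # the token exactly as it appeared
        if ( $token->is_text && !$in_anchor ) {
            $chunk =~ s{(http://([.\w/]+))}{<a href="$1">$2</a>}gi;
        }
        $out .= $chunk;
    }
    return $out;
}

my $page = '<p>See http://example.com/docs and http://example.org/x</p>'
         . '<img src="http://example.com/logo.png">';
print linkify_text($page), "\n";
```

Tags pass through untouched because only text tokens are edited, so the URL in the `src` attribute is left alone, and text that is already inside a link is skipped.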

Apart from that, the first parenthesized portion of your regex looks a bit odd, and the basic problem is that it doesn't really guard against matching a URL that happens to be inside (i.e. an attribute value of) some other tag. Something like the following might be an improvement (but HTML::TokeParser, or HTML::TokeParser::Simple, is still the preferred approach):

s{(>[^<]*?)(http://([.\w/]+))}{$1<a href="$2">$3</a>}gi;

Note the use of curly braces to bound the left and right sides of the expression -- so we don't have to backslash-escape all the slashes in the pattern content (you forgot to add the backslash for the </a> part in your code, so it should have caused a syntax error).

In this version, the first part assumes that once you see a close angle bracket, you're not inside any sort of tag, so it looks for zero or more characters that are not an open bracket, followed by a URL.
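To see that behaviour on a small input, here is a sketch that wraps the substitution (with the href value quoted) in a hypothetical `linkify` function. The sample markup is invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution discussed above.
sub linkify {
    my ($html) = @_;
    # $1: a '>' plus any following non-'<' characters (text outside a tag)
    # $2: the full URL, $3: the URL without the leading http://
    $html =~ s{(>[^<]*?)(http://([.\w/]+))}{$1<a href="$2">$3</a>}gi;
    return $html;
}

my $sample = '<p>See http://example.com/docs for details.</p>'
           . '<img src="http://example.com/logo.png">';
print linkify($sample), "\n";
# The URL in the paragraph text gets wrapped in <a>...</a>;
# the one inside the img tag's src attribute is left untouched.
```

One thing to watch: every match has to begin at a `>`, so when a single run of text holds two URLs, only the first one is linked. That is another reason to prefer the parser-based approach for real pages.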

(update: fixed a couple typos in the explanation.)


In reply to Re: Help with regs by graff
in thread Help with regs by Anonymous Monk
