I want to pull URLs out of a string. Does anyone know of a good regex to do it?
For example, given a string like:
$string = 'This is a string with a URL in it: http://www.cpan.com. Another perl site, http://www.perlmonks.org, has lots of code snippets.';
I'd like to run a substitution over it that turns each URL into a link. Thanks.

RE: Regex to find URLs in a string
by merlyn (Sage) on Oct 25, 2000 at 01:24 UTC
RE: Regex to find URLs in a string
by FouRPlaY (Monk) on Oct 25, 2000 at 03:41 UTC
    Here's a sub I wrote (it works) that wraps URLs in the tags needed for HTML output. You might be able to fool around with it.
    sub urlcheck($newline) {
        $newline =~ s/http\:\/\//\<a href \= \"http\:\/\//ig;
        if (substr ($newline, $#newline - 1, 1) eq ".") {
            $x = substr ($newline, /\G/, $#newline - 1);
        } else {
            $x = substr ($newline, /\G/);
        }
        $page = $x;
        $page =~ s/(http\:\/\/)|(\<\/a\>)|\=|\"|(href)|(\<a)|(\w+[ ]+)//g;
        $newline =~ s/$x/$x\"\>$page\<\/a\>/;
        return $newline;
    }
    BTW, it checks to see if the URL is at the end of a sentence and therefore might have an extra period.
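For comparison, here is a minimal sketch of the same idea done as a single substitution. It is not the sub above: the name `linkify` is made up for illustration, and the sketch assumes URLs start with http:// or https:// and contain no whitespace, quotes, or angle brackets. A lookahead keeps sentence punctuation such as a trailing period or comma out of the link. It does not HTML-escape anything, so treat it as a starting point only.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each http:// or https:// URL in an anchor tag.
# The URL body is matched lazily and must be followed by optional
# punctuation and then whitespace or end of string, so a trailing
# "." or "," stays outside the link.
sub linkify {
    my ($text) = @_;
    $text =~ s{(https?://[^\s<>"]+?)(?=[.,;:!?)]*(?:\s|\z))}
              {<a href="$1">$1</a>}g;
    return $text;
}

my $string = 'This is a string with a URL in it: http://www.cpan.com. '
           . 'Another perl site, http://www.perlmonks.org, has lots of code snippets.';
print linkify($string), "\n";
```

Periods inside the URL are kept because the lookahead only succeeds where the punctuation is followed by whitespace or the end of the string.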
        Simple, I didn't know it existed! Also, I didn't know about Net::Finger, and I wrote a script to do that; also HTML::FromText, and I spent many a week programming a script to do that too!

      I don't think that $#newline means what you think it means. That variable holds the index of the last element of the array @newline (or -1 if the array is empty) and has nothing to do with the scalar variable $newline.

      try the following instead:

      if (substr($newline, -1) eq '.')
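A small sketch of the difference, in case it helps. The variable names and sample data here are only for illustration. `$#words` belongs to the array `@words`, while the last character of a scalar string comes from `substr` with a negative offset (or from `length` if you need a position).

```perl
use strict;
use warnings;

my @words   = ('alpha', 'beta', 'gamma');
my $newline = 'Visit http://www.perlmonks.org.';

# $#words is the index of the last element of the array @words.
my $last_index = $#words;

# A negative offset in substr counts back from the end of the string.
my $last_char = substr($newline, -1);

# Drop a single trailing period, if there is one.
my $trimmed = $newline;
$trimmed = substr($newline, 0, length($newline) - 1) if $last_char eq '.';

print "last index: $last_index\n";
print "last char:  $last_char\n";
print "trimmed:    $trimmed\n";
```

Under `use strict`, writing `$#newline` without declaring an array `@newline` is a compile-time error, which is one more reason to keep strict turned on.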
        Thanks for the correction. I've learned most of my Perl by guessing. I figured if $# worked for arrays, it might work for scalars.

        Your suggestion is also, I find, a lot clearer and more precise. Thanks.