You may wish to check out the HTML::FromText module. It will, amongst other things, automatically convert URLs to hyperlinks. I've never worked with .plan files, so I can't say for certain whether this is an appropriate solution, but I suspect that it's a good place to start.
Also, if you wish to do it by hand, switching to a different delimiter on your regexes will help you avoid backslashitis. Further, if your URLs are not broken across lines (i.e., if they don't contain embedded newlines or spaces), you could try the following (untested) regex as a starting point for conversion:
$newline =~ s#(http://[^.]+\.[^.]+\S+)#<a href="$1">$1</a>#gi;
The above regex assumes that, at minimum, you will have two groups of characters separated by a period after the http:// portion. The negated character classes should really be replaced by classes that state the allowable characters (and if you really want to be anal, I recall that the first allowable character in a domain label is different from the other allowable characters, but sometimes I get into regex overkill).
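As a hedged, untested sketch of that suggestion, here is the same substitution with the negated character classes replaced by explicit hostname-style classes (letters, digits, and hyphens, with a hyphen not allowed to start or end a label). The sample text is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Replace the negated classes with explicit label classes:
# each label is [a-z0-9], optionally followed by [a-z0-9-]* ending
# in [a-z0-9], and at least two labels must be joined by dots.
my $line = 'See http://www.perlmonks.org/index.pl for details.';

$line =~ s{(http://[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(?:\.[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)+\S*)}
          {<a href="$1">$1</a>}gix;

print $line, "\n";
```

The trailing \S* still sweeps up any path or query string after the hostname, so this inherits the same "no spaces in URLs" assumption as the original.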
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just go to the link and check out our stats.
If you're using such a thorough regex that checks for dots and allowable characters, you may wish to ditch the http:// completely. People are more likely to list websites in their .plan files without it (for example, "I visit perlmonks.org" rather than "I visit http://www.perlmonks.org").
Personally I’d feel safe putting anchor tags around anything that looks like xxx.xxx, although you could also include a list of allowable Top Level Domains, something like
@TLDs = qw(
    com net org edu gov mil arpa
    us nl de it se ch uk ca hr ae br jp be au ie ar fi sg es mx no pt
    dk il ru nz th pl id cy in kw at za cn fr is ro kr gr co ph bo hu
    cr pe cl tr tw eg ee ge ua om ec hk ve ag cz ni to nu sm lt yu bg
    ba do qa ck mt bf lu su bh
);
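A rough, untested sketch of how such a list could be used: join it into an alternation and require the match to end in one of the listed TLDs, so bare hostnames get linked even without a leading http://. The list here is trimmed for illustration, and the sample sentence is made up:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Trimmed TLD list for illustration; in practice use the full list.
my @TLDs = qw(com net org edu gov mil uk de nl);
my $tld  = join '|', @TLDs;

# Link anything of the form label(.label)+.TLD on word boundaries.
my $text = 'I visit perlmonks.org and slashdot.org daily.';
$text =~ s{\b((?:[a-z0-9-]+\.)+(?:$tld))\b}{<a href="http://$1">$1</a>}gi;

print $text, "\n";
```

Note the replacement has to supply the http:// prefix itself, since the matched text doesn't include one.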
Isn't this a little dangerous? Any time new TLDs are added you will need to go and change the list, plus I cannot see .cx, home of a bunch of free software projects, in this list.

Matching on http:// or at least www(\..+)+\.\w+ seems the safest approach.
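That suggestion (link only things that start with http:// or a www. host, and leave other dotted strings alone) might be sketched like this. Untested beyond the toy input below, which is invented for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = 'Try www.gnu.org or http://freshmeat.net but not file.txt';

# Match either an explicit http:// URL or a bare www.-prefixed host.
# The /e modifier lets the replacement add http:// to the href when
# the matched text lacks it.
$text =~ s{\b(http://\S+|www(?:\.[\w-]+)+)}{
    my $u    = $1;
    my $href = $u =~ m{^http://} ? $u : "http://$u";
    qq{<a href="$href">$u</a>};
}gie;

print $text, "\n";
```

Because neither branch matches a plain dotted name like file.txt, filenames in a .plan survive unlinked, which is the point of this stricter rule.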
You could use [CGI] in shell mode to print out nice HTML, as well as to help clean up your code. Another thing you might want to consider is expanding img tags, if that's allowed in the .plan files. (I assume every user has an editable .plan which this is supposed to parse into HTML; but in that case, why not require them to use HTML, and use the HTML::Parser subclasses to restrict whatever you wish to restrict?)
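HTML::Parser is the robust way to do that restriction. As a dependency-free sketch of the same idea (not using HTML::Parser, and simplified accordingly), one could strip every tag whose name is not on a small allowlist; the allowed set and the sample input below are hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $plan = '<blink>hello</blink> <b>bold</b> <a href="http://x.org">x</a>';

# Hypothetical allowlist: keep only a, b, i, and img tags (and their
# closing forms); every other tag is deleted, its text content kept.
my $allowed = qr{^/?(?:a|b|i|img)\b}i;

$plan =~ s{<(/?[^>]*)>}{ $1 =~ $allowed ? "<$1>" : "" }ge;

print $plan, "\n";
```

Note this naive version passes attributes through unexamined (e.g. a javascript: href would survive), which is exactly why a real HTML::Parser subclass is the better tool for anything security-sensitive.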
AgentM Systems nor Nasca Enterprises nor Bone::Easy nor Macperl is responsible for the comments made by AgentM. Remember, you can build any logical system with NOR.