Automatically hyperlinking text fails with newlines

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, can anyone help with this little problem I have?

I'm trying to automatically hyperlink URLs in some plain text, and I have a simple (but not perfect) regular expression that does this.

However, when I use it on a URL that wraps onto a new line, the newline ends up inside the href and stops the browser going to the right page.

Allow me to demonstrate with this bit of code:

#!/usr/bin/perl
use strict;

my $output = "";

open(FILE, "file.txt") || die "$!";
while(<FILE>)
{
  $output .= $_;
}
close FILE;

$output =~ s#(http://[^\!\"\£\$\^\*\(\)\{\}\[\]\;\:\'\@\,\<\> ]+)#\<a href="$1" target="_blank"\>$1\</a\>#gis;

print $output;

If file.txt has the following:

The quick brown fox at http://fish.org would like to go home and http://see
through.com/pants/foo/bar what there is to eat

then my output is:

The quick brown fox at <a href="http://fish.org" target="_blank">http://fish.org</a> would like to go home and <a href="http://see
through.com" target="_blank">http://see
through.com</a> what there is to eat

So my question is: is there any way I can remove that \n in the middle of the href?

Many thanks!


Replies are listed 'Best First'.
Re: Automatically hyperlinking text fails with newlines by suaveant (Parson) on Sep 20, 2001 at 21:05 UTC
try...

while(<FILE>) { chomp; $output .= $_; }

which will remove newlines from $_

- Ant

Re: Re: Automatically hyperlinking text fails with newlines by Anonymous Monk on Sep 20, 2001 at 21:14 UTC
Hi, thanks for this. However, I only want to remove the \n from inside the hyperlink while still preserving the newlines elsewhere. In other words, the output should be:

The quick brown fox at <a href="http://fish.org" target="_blank">http://fish.org</a> would like to go home and <a href="http://seethrough.com" target="_blank">http://see
through.com</a> what there is to eat

(http://seethrough.com is kept intact within the href attribute, but the display text still has the newline)

Re: Re: Re: Automatically hyperlinking text fails with newlines by suaveant (Parson) on Sep 20, 2001 at 21:33 UTC
well then... you could pass $1 to a function, since you are matching the hyperlink with the \n, similar to...

s/regex/make_link($1)/eg;

sub make_link {
  my $url = $_[0];
  $url =~ s/\s+//g;
  qq`<A HREF="$url">$url</A>`;
}

- Ant

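As a minimal sketch of that callback idea, adjusted so that only the href loses its newline while the displayed text keeps it (which is what the question asks for): the sub name linkify and the simplified character class are assumptions made for illustration, not the poster's original pattern. The match is copied out of $1 before any other regex runs, because the inner substitution would otherwise clobber it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the anchor tag for one matched URL.
# The href gets all whitespace removed; the visible text is left untouched.
sub make_link {
    my ($text) = @_;                  # copy first: @_ aliases $1 here
    (my $href = $text) =~ s/\s+//g;   # strip wrapped newlines from the target only
    return qq{<a href="$href" target="_blank">$text</a>};
}

# Hypothetical wrapper around the substitution.
# Simplified pattern (an assumption): a URL runs until a space, tab,
# quote or angle bracket. A newline is NOT excluded, so wrapped URLs match.
sub linkify {
    my ($input) = @_;
    $input =~ s{(http://[^ \t"'<>]+)}{make_link($1)}ge;
    return $input;
}

my $sample = "Go to http://fish.org and http://see\nthrough.com for more\n";
print linkify($sample);
```

With the sample string above, the second link should come out with href="http://seethrough.com" while the text between the tags still contains the line break.
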
Re: Re: Re: Automatically hyperlinking text fails with newlines by alien_life_form (Pilgrim) on Sep 21, 2001 at 12:04 UTC
Greetings.

This being the case, I would (still) slurp the entire file, not replace \n, and put \n among the acceptable characters in the regexp (and I *think* you may have to use the /s modifier). I would then replace the \n still embedded in the href.

Not perfect, because this:

Ehi, check this out on http://cnn.com\n
dude!

would yield: http://cnn.comdude.

In other words, if you want ignorable newlines within URIs, then URIs must always be separated by real whitespace from the surrounding text.

Cheers,
alf

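The caveat above can be reproduced with a short sketch. The helper name hrefs_for and the reduced character class are assumptions for illustration. One detail worth noting: a negated character class matches a newline unless \n is listed inside it, whether or not /s is given, since /s only changes what `.` matches.

```perl
use strict;
use warnings;

# Hypothetical helper: return the href values produced when wrapped
# newlines are stripped from every matched URL.
sub hrefs_for {
    my ($text) = @_;
    my @hrefs;
    # \n is not in the excluded set, so a URL match may span a line break.
    while ($text =~ m{(http://[^ !"<>]+)}g) {
        (my $href = $1) =~ s/\n//g;   # join the wrapped pieces
        push @hrefs, $href;
    }
    return @hrefs;
}

# No whitespace between the URL and the line break: the next word is absorbed.
print join("\n", hrefs_for("Ehi, check this out on http://cnn.com\ndude!")), "\n";
# prints http://cnn.comdude

# A real space before the line break ends the URL where it should.
print join("\n", hrefs_for("Ehi, check this out on http://cnn.com \ndude!")), "\n";
# prints http://cnn.com
```
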
Re: Automatically hyperlinking text fails with newlines by alien_life_form (Pilgrim) on Sep 20, 2001 at 21:20 UTC
What about:

...
#error checking and stuff left as an exercise.
$/=undef; #slurp files.
$lines=<FILE>;
close(FILE);
$lines =~ s/\n//g;
#now do your thing.

Note that this joins lines, so if you count on newlines as whitespace, you're out of luck with this.

Cheers,
alf
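A hedged variant of the slurp above, assuming you would rather not change $/ for the rest of the program: `local $/` inside a sub undefines the record separator only until the sub returns. The sub name slurp is hypothetical, and the demo reads from an in-memory filehandle so that it runs without file.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/;                  # undef only inside this sub; restored on return
    my $content = <$fh>;
    close($fh);
    return $content;
}

# Demo: a reference to a scalar opens an in-memory file.
my $data = "line one\nline two\n";
my $text = slurp(\$data);

# Removing every newline joins the lines, as the reply warns.
(my $joined = $text) =~ s/\n//g;
print "$joined\n";   # prints: line oneline two
```
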