OverlordQ has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that grabs a webpage, parses it, changes a few URLs, and then displays the result in the browser. All of it works except one part of the parsing:
$text =~ s#src=\"(.*?)\"#src=\"$url$1\"#sig;
$url contains the original URL that was grabbed. Herein lies the problem:
Assume $url is http://www.slashdot.org. If the URL in the src tag was:
http://www.slashdot.org/image.gif
it gets changed into:
http://www.slashdot.orghttp://www.slashdot.org/image.gif
but a relative URL such as:
image.gif
becomes:
http://www.slashdot.org/image.gif
which works. So my question is: is there any way to get the regex to ignore absolute URLs?

Replies are listed 'Best First'.
•Re: Help with Regex: (Absolute/Relative URLs)
by merlyn (Sage) on Aug 05, 2003 at 21:04 UTC
Re: Help with Regex: (Absolute/Relative URLs)
by Abigail-II (Bishop) on Aug 05, 2003 at 21:41 UTC
    $text =~ s#src="(?!\w+://)([^"]*)"#src="$url$1"#ig;

    There's no need to escape the quotes, and it's more efficient to change the .*? to [^"]*. Then you don't need the /s either.

    Abigail
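
    As a rough illustration of the substitution above, here is a minimal sketch that runs it on sample input. The HTML string, the $fixed variable, and the trailing slash on $url are assumptions made for this example and are not taken from the original script.

    ```perl
    use strict;
    use warnings;

    # Assumed base URL; the trailing slash is an assumption so that
    # relative paths join cleanly.
    my $url = 'http://www.slashdot.org/';

    # Hypothetical input: one relative and one absolute src attribute.
    my $text = '<img src="image.gif"> <img SRC="http://www.slashdot.org/logo.gif">';

    # (?!\w+://) is a negative lookahead: the match fails when the value
    # starts with a scheme such as http://, so absolute URLs are skipped.
    (my $fixed = $text) =~ s#src="(?!\w+://)([^"]*)"#src="$url$1"#ig;

    print "$fixed\n";
    ```

    Only the relative image.gif gets the prefix; the absolute URL is left alone. Note that the lookahead only recognises values beginning with scheme://, so a protocol-relative value such as //host/image.gif would still be prefixed.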

      Ah, thank you. It's been ~2-3 years since the script was written, so many of the reasons for doing it that way are probably forgotten.