
I spent so many hours on this single issue, and so many hours in total with Perl today, that no, I am no longer sure I am running the code I posted. I have a string with multiple "a href" links and I want to replace each of them with a different URL. I have a feeling the while loop is causing the problem here.

Re^3: regex not matching how I want it to :(
by Corion (Patriarch) on Oct 18, 2018 at 19:56 UTC

    Maybe the following program helps you debug the regex:

    #!perl -w
    use strict;
    # use Regexp::Debugger;

    for my $line (<DATA>) {
        while ( $line=~/<a href=\"(.*?)\.htm\">/ig ) {
            print "$1\n";
        };
    }
    __DATA__
    <a href="test1.htm"> test1</a><br> <a href="test2.htm"> test2</a><br><a href="test3.htm"> test3</a><br>

    For me this outputs:

    test1
    test2
    test3
Re^3: regex not matching how I want it to :(
by glwa (Acolyte) on Oct 18, 2018 at 20:10 UTC

    Okay, once again, sorry, I am really tired. I stripped all the unnecessary stuff out of my code and this is what I came up with. The result is very strange to me:

    $line='<p><a href="test1.htm"> test1</a><br> <a href="test2.htm"> test2</a><br> <a href="test3.htm"> test3</a><br> <a href="test4.htm"> test4</a><br>';
    while ( $line=~/<a href=\"(.*?)\.htm\">/ig ) {
        $tmp=$1;
        print "LINE: $line\n";
        print "TMP: $tmp KK\n\n";
        $line=~s/<a href=\"$tmp.htm\">/<a href=\"\/xxx.html\">/i;
    }

    On the other hand, Regexp::Debugger is an amazing tool, thank you. It shows what the problem is, and it is what I suspected, but I have no idea why it matches this way or how to fix it. As you can see, my code has an s/// replacement, and after the first replacement the matching gets weird.

    So on the second pass of the while loop, the match is more or less equivalent to this:

    $line='<p><a href="xxx.html"> test1</a><br> <a href="test2.htm"> test2</a><br> <a href="test3.htm"> test3</a><br> <a href="test4.htm"> test4</a><br>';
    $line=~/<a href=\"(.*?)\.htm\">/ig;
    $tmp=$1;
    print "LINE: $line\n";
    print "TMP: $tmp KK\n\n";

    Why is it matching

    xxx.html"> test1</a><br> <a href="test2

    and not "test2"? How do I make it match test2? Thank you.
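
A likely explanation, offered as a sketch rather than a definitive answer: modifying $line with s/// inside the loop resets pos(), so the next //g match starts again at the beginning of the string. There the first tag now reads /xxx.html, and .htm"> no longer matches directly after the name because an "l" follows .htm. The lazy (.*?) is still allowed to grow across quotes and tags, so it keeps expanding until it reaches the next .htm">, which belongs to test2. Restricting the capture to [^"]* keeps it inside one attribute value. The example below first reproduces the behaviour, then does all replacements in a single s///ge pass. rewrite_links() and new_url() are hypothetical names introduced for illustration, and the "/NAME.html" target is an assumption standing in for "a different URL per link".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The line as it looks after the first substitution in the post.
my $line = '<p><a href="/xxx.html"> test1</a><br> <a href="test2.htm"> test2</a><br>';

# Original pattern: .*? may cross quotes and tags until it finds .htm">
if ( $line =~ /<a href="(.*?)\.htm">/i ) {
    print "lazy dot:      $1\n";    # /xxx.html"> test1</a><br> <a href="test2
}

# Negated class: the capture cannot cross the closing quote.
if ( $line =~ /<a href="([^"]*)\.htm">/i ) {
    print "negated class: $1\n";    # test2
}

# Hypothetical helper: maps a captured name to its new URL.
sub new_url {
    my ($name) = @_;
    return "/$name.html";
}

# Replace every matching link in one pass, so there is no loop that
# modifies the string while a //g match is still iterating over it.
sub rewrite_links {
    my ($text) = @_;
    $text =~ s{<a href="([^"]*)\.htm">}{ '<a href="' . new_url($1) . '">' }gie;
    return $text;
}

print rewrite_links($line), "\n";
```

If the while loop has to stay, the same negated class in the match, plus \Q$tmp\E in the substitution so that the captured text is treated literally, should avoid the runaway capture as well.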