in reply to Re^2: 3 capture multi line regex
in thread 3 capture multi line regex

I modified the regex a bit (I found a bug in it), so the two lines you mentioned become
href\s*=\s*[\"\'] [^\"\']+ [\"\']\s*>   #first href (not captured)
\s*([^<>]+?)\s*                         #text inside first <a></a> (captured)
On the first line, I match the string href= followed by a quote (single or double: ["']), then a string free of quote characters (I used a negated character class: [^"'] means NOT ["']), then the closing quote.
On the next line I simply match text with no tags in it. If you think there may be other tags inside your link, it would be better to use
\s*(.+?)\s*   # non-greedy capture of everything up to the next </a>
instead.
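
To make the difference between the two variants concrete, here is a minimal sketch. The sample HTML and the variable names are made up for illustration, and the trailing </a> in each pattern is an assumption about how the rest of the original regex continues. Note that the (.+?) variant needs the /s modifier if the link text can span several lines, because . does not match a newline by default.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the thread.
my $html = qq{<a href="http://example.com/a">\n  First <b>link</b>\n</a>};

# Variant 1: the link text must not contain any tags.
my $no_tags = qr{
    href\s*=\s*["'] [^"']+ ["']\s*>   # href attribute (not captured)
    \s*([^<>]+?)\s*                   # tag-free text (captured)
    </a>
}x;

# Variant 2: the link text may contain nested tags; /s lets . match newlines.
my $with_tags = qr{
    href\s*=\s*["'] [^"']+ ["']\s*>   # href attribute (not captured)
    \s*(.+?)\s*                       # everything up to the next closing tag (captured)
    </a>
}xs;

my ($strict) = $html =~ $no_tags;     # fails here: the text contains <b>...</b>
my ($loose)  = $html =~ $with_tags;   # captures the text including the nested tag

print defined $strict ? "strict: $strict\n" : "strict: no match\n";
print "loose: $loose\n";
```

With this input the first variant does not match at all, while the second captures the text together with the nested tag.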

Re^4: 3 capture multi line regex
by Anonymous Monk on Jun 30, 2006 at 20:34 UTC
    Hi.

    Your regex matched fine the first time, but I need to put all occurrences into an array. I can't get the array to hold anything now.

    push (@results, "$1::$2::$3"), $result_content =~ m/$regex/;
    I tried adding /g to the end but it doesn't contain anything at all. I tried adding /g to the regex itself but it errors out.

    What am I doing wrong?

      Your code does a very strange thing: it FIRST pushes the string "$1::$2::$3" onto the array and only THEN performs the search!

      Use a loop :)

      push (@results, "$1::$2::$3") while $result_content =~ /$regex/g;
      or the cleaner but, IMO, uglier form:
      while ($result_content =~ /$regex/g) { push (@results, "$1::$2::$3"); }
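
A self-contained sketch of the loop above, for anyone who wants to try it. The pattern and the input string are hypothetical stand-ins for the thread's $regex and $result_content, and join is used here only as an equivalent way to build the "::"-separated string.

```perl
use strict;
use warnings;

# Hypothetical three-capture pattern and input, standing in for the thread's data.
my $regex          = qr/(\w+)=(\d+);(\w+)/;
my $result_content = "a=1;x b=2;y c=3;z";

my @results;

# In scalar context, /g makes each iteration resume where the previous match ended.
while ($result_content =~ /$regex/g) {
    push @results, join('::', $1, $2, $3);
}

print "$_\n" for @results;
```

Each pass through the loop sees the captures of one match, so the array ends up with one entry per occurrence.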
        I was told recently, though, that running a while loop over the variable holding the HTML can create endless loops and is a bad thing.
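
As far as I know, the usual way such a loop becomes endless is when the /g modifier is left off: the match then starts from the beginning of the string every time, so the condition stays true forever. With /g in scalar context, Perl remembers the position of the last match (see pos) and the loop ends once no further match is found. A small sketch with a made-up string:

```perl
use strict;
use warnings;

# Hypothetical input; any string with repeated matches would do.
my $text = "aXbXc";

my @positions;

# With /g, pos($text) advances past each match, so the loop terminates.
while ($text =~ /X/g) {
    push @positions, pos($text);
}

# Without /g, the condition would match the first X on every
# iteration and the loop would never finish.
print "match ended at offset $_\n" for @positions;
```

Another way to get into trouble is modifying the string or resetting pos inside the loop body, so it seems safest to leave the variable alone while iterating.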