I have the following precompiled regexp:

$test_regexp = qr/url="(http:\/\/downloads\.bbc\.co\.uk\/podcasts\/worldservice\/globalnews\/(globalnews_${year}${mon}${mday}-\d{4}[a-z]\.mp3))"/;

As you can see, the match has two capture groups: one captures the complete URL, the other just the filename. The webpage I run this regexp against may contain multiple valid matches on the same line, and I would like to capture both the complete URL and the filename for each one. If it matters (and I don't believe it does), the precompiled regexp is passed to a function that dumps the webpage to a file, opens the file with the TEMP_XML_FILE handle, and searches each line for $test_regexp matches. Right now I have this:

while (<TEMP_XML_FILE>) {
    if ((@complete_url, @filename) = ($_ =~ /$test_regexp/g)) {
        printf("found %d matches\n", scalar(@filename));
        <>;
        for ($i = 0; $i < @filename; $i++) {
            printf("filename = %s, complete_url = %s\n", $filename[$i], $complete_url[$i]);
            <>;
        }
    }
}

The problem is that the printf statement reports 0 matches. After reading through the entire file, I want the @complete_url array to contain the complete list of URLs and the @filename array to contain the complete list of filenames. How can I accomplish this? I realize I might be able to capture just the complete URL and derive the filename from it in a separate step, but for the sake of this discussion, how can I capture both the filenames and the URLs into their respective arrays when there could be multiple matches per line?
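For what it's worth, here is a minimal sketch of one possible approach. A list assignment such as (@complete_url, @filename) = ... hands every returned value to the first array, so @filename is always left empty, which would explain the 0. Matching with /g in scalar context inside a while loop and pushing $1 and $2 on each pass keeps the two lists separate. The date values and the sample input lines below are made up for illustration, and an in-memory list stands in for the TEMP_XML_FILE handle.

```perl
use strict;
use warnings;

# Hypothetical date parts, chosen only so the sample lines below match.
my ($year, $mon, $mday) = ('2012', '05', '17');

# Same pattern as in the post, written with qr{} to avoid escaping slashes.
my $test_regexp = qr{url="(http://downloads\.bbc\.co\.uk/podcasts/worldservice/globalnews/(globalnews_${year}${mon}${mday}-\d{4}[a-z]\.mp3))"};

# Made-up input standing in for the lines read from TEMP_XML_FILE.
my @lines = (
    'url="http://downloads.bbc.co.uk/podcasts/worldservice/globalnews/globalnews_20120517-0100a.mp3" url="http://downloads.bbc.co.uk/podcasts/worldservice/globalnews/globalnews_20120517-1300b.mp3"',
    'a line with no match',
);

my (@complete_url, @filename);
for my $line (@lines) {
    # In scalar context, a /g match finds one match per iteration and
    # resumes where the previous one ended, so $1 and $2 always belong
    # to the same match.
    while ($line =~ /$test_regexp/g) {
        push @complete_url, $1;
        push @filename,     $2;
    }
}

printf("found %d matches\n", scalar(@filename));
for my $i (0 .. $#filename) {
    printf("filename = %s, complete_url = %s\n", $filename[$i], $complete_url[$i]);
}
```

With the two sample lines above, both arrays should end up with two entries each. Since the arrays are declared outside the loop, they keep accumulating across lines, so after the whole file is read they hold every URL and filename found.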


In reply to multiple matches per line *AND* multiple capture groups per match by Special_K
