I'm trying to work out a regexp which, when given either:
$in = '<td><img src="foo.jpg"><a href="index3.html">New index</a></td>';
or
$in = '<td><a href="index3.html">New index</a></td>';
will give me the link data regardless, and the image data should there be any. I've tried various combinations after the initial
my ($new,$hit) = ($in =~ m#(foo.jpg)?.*(<a href=.*</a>)#m);
It looks simple enough, but it has stumped a couple of my friends, too. I'm trying to do it in a single regexp. Although the actual problem could check for the bits separately, it's got me stumped enough to want an answer out of curiosity (and doing it in two bits makes the rest of the code more complicated). FWIW, the link data varies; the image data is static.

the hatter

Title edit by tye
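
One possible single-regexp approach, offered as a sketch rather than a definitive answer. The initial attempt never captures the image because the optional (foo.jpg)? is tried at the very start of the string, where it can only match nothing, and the greedy .* that follows then swallows the image name. Moving a non-greedy skip inside the optional group lets the engine look ahead for the image first and fall back to "no image" if it is not there. The helper name extract_img_and_link is hypothetical, and the sketch assumes the image, when present, appears before the link.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (image, link); image is undef when absent.
sub extract_img_and_link {
    my ($in) = @_;
    my ($img, $link) = $in =~ m{
        (?: .*? (foo\.jpg) )?      # optionally skip ahead to the static image name
        .*? (<a\s+href=.*?</a>)    # then capture the first complete link
    }xs;
    return ($img, $link);
}

for my $in (
    '<td><img src="foo.jpg"><a href="index3.html">New index</a></td>',
    '<td><a href="index3.html">New index</a></td>',
) {
    my ($img, $link) = extract_img_and_link($in);
    printf "image: %s\nlink:  %s\n\n", $img // '(none)', $link // '(none)';
}
```

For anything beyond fixed snippets like these two, a real HTML parser is likely to be more robust than a regexp.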


In reply to Regexp to extract HTML link data by hatter
