As wonderful as regexes are, sometimes they're more trouble than they're worth. The snippet below will do what you want, regardless of whether your links start with http or not. It won't do the right thing if the href targets aren't quoted, or if you have an <a> tag without an href followed by some other tag with an href before the next link, but this is the 5-minute version.

$pos = 0;
while ($m = shift @markers) {
    # locate the beginning of the link
    last if (($pos = index $htmlfile, '<a', $pos) < 0);
    # ...then the start of the link's href
    last if (($pos = index $htmlfile, 'href="', $pos) < 0);
    # skip past the first "
    $pos += 6;
    # ...then the end of the quoted href target
    last if (($pos = index $htmlfile, '"', $pos) < 0);
    # insert the marker just before the closing quote
    substr($htmlfile, $pos, 0) = $m;
}

At the end, $pos will be -1 if you ran out of links before you ran out of markers (the marker shifted off for that last pass is discarded, so @markers may or may not be empty). If $pos is not -1, every marker was used, so do one more index from $pos looking for <a and/or href=. If it hits (i.e., does not return -1), you ran out of markers before all of the links were done. If it does return -1, your links and @markers matched up perfectly.
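
For what it's worth, the ending check described above could be wrapped up roughly like this. It's only a sketch: the sub name add_markers and the three status strings are made up for illustration, and it uses defined() in the loop test (a small change from the snippet above) so that a marker of "0" doesn't end the loop early. It also assumes no marker contains the text <a.

```perl
use strict;
use warnings;

# Hypothetical helper (name and status strings are illustrative only).
# Inserts each marker just before the closing quote of successive href
# targets, then reports how the links and markers lined up.
sub add_markers {
    my ($html, @markers) = @_;
    my $pos = 0;

    # defined() is an assumption added here so a "0" marker is not skipped
    while (defined(my $m = shift @markers)) {
        last if ($pos = index $html, '<a', $pos) < 0;      # next link
        last if ($pos = index $html, 'href="', $pos) < 0;  # its href
        $pos += 6;                                         # skip href="
        last if ($pos = index $html, '"', $pos) < 0;       # closing quote
        substr($html, $pos, 0) = $m;                       # insert marker
    }

    my $status;
    if ($pos < 0) {
        # a search failed: no link was left for the current marker
        $status = 'links ran out first';
    }
    elsif (index($html, '<a', $pos) >= 0) {
        # all markers used, but another link follows
        $status = 'markers ran out first';
    }
    else {
        $status = 'matched';
    }
    return ($html, $status);
}
```

You'd call it as my ($new_html, $status) = add_markers($htmlfile, @markers); and look at $status afterwards. The same caveats as the original snippet apply: unquoted hrefs and <a> tags without an href will still trip it up.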

Update: Whoops, my ending logic was broken (it's fixed now). The final index check has to be done if $pos is not -1, not just if there's anything left in @markers as I originally stated.


In reply to Re: appending a unique marker to each url in a file by Cubes
in thread appending a unique marker to each url in a file by cat2014
