I was going to propose Regexp::Common as one more alternative, but a quick test showed that it gives somewhat undesired results with some URLs. Note the trailing characters (parentheses, commas, semicolons, etc.) on some of the returned URLs:

% wget -qO - http://www.ebay.com | perl -MRegexp::Common=URI -wnle 'print $1 while /($RE{URI}{HTTP})/g'|head
http://include.ebaystatic.com/js/v/us/homepage.js
http://include.ebaystatic.com/aw/pics/us/css/homepage.css
http://pics.ebaystatic.com/aw/pics/userSitePrefs/bottomDropShadow_20x20.gif)
http://pics.ebaystatic.com/aw/pics/userSitePrefs/sideDropShadow_20x20.gif)
http://pics.ebaystatic.com/aw/pics/userSitePrefs/dropshadow2_20x10.gif)
http://include.ebaystatic.com/aw/pics/css/ebay.css
http://include.ebaystatic.com/';
http://include.ebaystatic.com/js/v/us/ebaybase.js
http://include.ebaystatic.com/js/v/us/ebaysup.js
http://search.ebay.com/',
...while URI::Find::Rule does a better job of DWIM:
% wget -qO - http://www.ebay.com | perl -MURI::Find::Rule -wlne 'print $_->[1] for URI::Find::Rule->scheme("http")->in($_)'|head
http://include.ebaystatic.com/js/v/us/homepage.js
http://include.ebaystatic.com/aw/pics/us/css/homepage.css
http://pics.ebaystatic.com/aw/pics/userSitePrefs/bottomDropShadow_20x20.gif
http://pics.ebaystatic.com/aw/pics/userSitePrefs/sideDropShadow_20x20.gif
http://pics.ebaystatic.com/aw/pics/userSitePrefs/dropshadow2_20x10.gif
http://include.ebaystatic.com/aw/pics/css/ebay.css
http://include.ebaystatic.com/
http://include.ebaystatic.com/js/v/us/ebaybase.js
http://include.ebaystatic.com/js/v/us/ebaysup.js
http://search.ebay.com/

Update: Fixed the incorrect wording. As merlyn pointed out, the unwanted trailing characters are valid URL characters. Still, I think they could be a problem for the kind of application the OP described. Therefore, in this case, R::C is not the most straightforward solution.

the lowliest monk


In reply to Re^2: Having hyperlinks in comments by tlm
in thread Having hyperlinks in comments by Anonymous Monk
