in reply to Getting only the link in a line

This regex handles the data you've provided and the case where the URL has a terminal ".":

/(https?:\S+?)[.]?\s/
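
As a rough sketch of how the pattern might be used: the sample lines and the first_url helper below are made up for illustration and are not the data from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample lines, invented for this example.
my @lines = (
    "See http://example.com/page for details",
    "Docs are at https://example.org/docs. Read them first",
    "no link on this line",
);

# Return the first URL found on a line, or undef if there is none.
# (first_url is an illustrative helper name, not part of the original post.)
sub first_url {
    my ($line) = @_;
    return $line =~ /(https?:\S+?)[.]?\s/ ? $1 : undef;
}

for my $line (@lines) {
    my $url = first_url($line);
    print defined $url ? "$url\n" : "(no URL)\n";
}
```

Note that the pattern needs whitespace after the URL. A line read from a file still carries its newline, so a URL at the very end of such a line matches, but it would not match at the end of a chomped string.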

-- Ken

Re^2: Getting only the link in a line
by iphone (Beadle) on Nov 04, 2010 at 01:42 UTC

    Thanks, it worked. Can you please explain how it took care of the "." (dot)?

      I'll give a quick breakdown here. Refer to perlre for details (I've indicated the appropriate sections).

      • You were originally missing your first line because it was https and you'd only specified http. The s? means zero or one 's' (see Quantifiers).
      • \S+? says match one or more non-whitespace characters non-greedily (as few as possible), so a terminal period, if it exists, is left for the part of the regex that follows instead of being captured (further down under Quantifiers).
      • [.] stops '.' being a special (match anything) character by placing it in a character class (see Metacharacters).
      • [.]? just says zero or one non-special '.' (that's Quantifiers again).
      • \s at the end anchors the URL (and optional '.') to the whitespace that follows it (see Character Classes and other Special Escapes).
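
      The breakdown above can be sketched with the /x modifier, which allows whitespace and comments inside the pattern. The test string is made up, and the greedy variant is included only for comparison; treat this as an illustration rather than a definitive version.

```perl
use strict;
use warnings;

# The same pattern written with /x so each piece can carry a comment.
my $url_re = qr{
    (            # start of the capture group
      https?     # 'http' followed by zero or one 's'
      :          # a literal colon
      \S+?       # one or more non-whitespace characters, as few as possible
    )            # end of the capture group
    [.]?         # zero or one literal '.', outside the capture
    \s           # the whitespace that follows the URL
}x;

# Made-up test string with a URL that ends a sentence.
my $text = "Visit https://example.org/docs. Then carry on";

my ($lazy)   = $text =~ $url_re;
my ($greedy) = $text =~ /(https?:\S+)[.]?\s/;   # same pattern, greedy \S+

print "non-greedy: $lazy\n";     # non-greedy: https://example.org/docs
print "greedy:     $greedy\n";   # greedy:     https://example.org/docs.
```

      With the greedy \S+ the period is swallowed into the capture, because \S matches '.' as well; with \S+? the match stops as early as it can, which leaves the period to [.]?.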

      -- Ken

        1. That's a very detailed explanation, but one question: why should we make "." a non-special character? Even without it I am able to remove the ".".
        2. I understand what \S+? does, but I fail to understand how it removes the terminal period.