in reply to Re: Getting only the link in a line
in thread Getting only the link in a line

Thanks, it worked. Can you please explain how it took care of the "." (dot)? Thanks.

Re^3: Getting only the link in a line
by kcott (Archbishop) on Nov 04, 2010 at 02:14 UTC

    I'll give a quick breakdown here. Refer to perlre for details (I've indicated the appropriate sections).

    • You were originally missing your first line because it was https and you'd only specified http. The s? means zero or one 's' (see Quantifiers).
    • \S+? matches one or more non-whitespace characters non-greedily, so it stops as soon as the rest of the pattern can match; that keeps the terminal period, if there is one, out of the capture (further down under Quantifiers).
    • [.] stops '.' being a special (match anything) character by placing it in a character class (see Metacharacters).
    • [.]? just says zero or one non-special '.' (that's Quantifiers again).
    • \s at the end anchors the URL (and optional '.') to the whitespace that follows it (see Character Classes and other Special Escapes).
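The pieces in the list above can be tried out with a short script. This is only a minimal sketch: the pattern is assembled from the bullets rather than quoted from the earlier post, and the sample lines and example.com URLs are made up for illustration.

```perl
use strict;
use warnings;

# Pattern assembled from the bullets above (an assumption; the original
# post is not quoted in this reply).
my $url_re = qr/(https?:\S+?)[.]?\s/;

# Hypothetical sample lines: one http URL mid-sentence, one https URL
# followed by a sentence-ending period.
my @lines = (
    "See http://example.com/a/b for details.\n",
    "Logs are at https://example.com/x/y.\n",
);

my @urls;
for my $line (@lines) {
    my ($url) = $line =~ $url_re;   # list context returns the capture
    push @urls, $url;
    print "$url\n";
}
```

Both lines should print just the URL. In the second, the final period is consumed by [.]? outside the parentheses, and the dots inside the host name stay because they are not followed by whitespace.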

    -- Ken

      1. That's a very detailed explanation, but one question: why should we make "." a non-special character? Even without it I am able to remove the ".".
      2. I understand what \S+? does, but I fail to understand how it removes the terminal period.

        When you want to match a '.' exactly, you need to remove the special meaning. [.] is one way, \. is another. Here's what happens if you don't remove the special meaning and the URL doesn't have a trailing '.':

        $ perl -wE 'while (<>) { my ($x) = $_ =~ /(https?:\S+?).?\s/; say $x; }'
        Bug found in the build. Please check check https://web.com/fluent/x/JIOUAQ for more details.
        https://web.com/fluent/x/JIOUA

        In this instance the unescaped . matches any character; there's no '.' before the whitespace, so it matches the 'Q', which you can see is missing from the output.

        The \S+? does not remove the '.'; it captures up to the '.', and the optional [.]? that follows, sitting outside the capturing parentheses, consumes it.
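To compare the variants side by side, here is a small sketch. The first input is a shortened form of the line used in the example above; the second, with a period directly after the URL, is a hypothetical variant added for illustration.

```perl
use strict;
use warnings;

# Shortened form of the input used above: no period after the URL.
my $no_dot = "Please check https://web.com/fluent/x/JIOUAQ for more details.\n";
# Hypothetical variant where the URL ends the sentence.
my $with_dot = "Please check https://web.com/fluent/x/JIOUAQ.\n";

my ($any)     = $no_dot   =~ /(https?:\S+?).?\s/;    # bare . can eat the Q
my ($class)   = $no_dot   =~ /(https?:\S+?)[.]?\s/;  # [.] matches only a period
my ($escaped) = $no_dot   =~ /(https?:\S+?)\.?\s/;   # \. behaves like [.]
my ($trimmed) = $with_dot =~ /(https?:\S+?)[.]?\s/;  # period falls outside the capture

print "$any\n";      # https://web.com/fluent/x/JIOUA
print "$class\n";    # https://web.com/fluent/x/JIOUAQ
print "$escaped\n";  # https://web.com/fluent/x/JIOUAQ
print "$trimmed\n";  # https://web.com/fluent/x/JIOUAQ
```

Only the bare . version loses the final 'Q'. With the period matched literally, the capture is the same whether or not the URL is followed by a '.'.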

        I've been saying any character throughout - it actually doesn't match a newline unless you use the 's' modifier - see Modifiers in perlre.
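The newline caveat is easy to check in isolation. A minimal sketch, using a made-up three-character string:

```perl
use strict;
use warnings;

my $text = "a\nb";   # a newline sits between 'a' and 'b'

# Without /s, '.' refuses to match the newline; with /s it matches it.
my $plain  = ($text =~ /a.b/)  ? "match" : "no match";
my $dotall = ($text =~ /a.b/s) ? "match" : "no match";

print "without /s: $plain\n";   # no match
print "with /s:    $dotall\n";  # match
```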

        -- Ken