Getting only the link in a line

iphone has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a log file with lines like the ones below. Can anyone help me extract only the link from each line? I have tried the following, but it misses the first line.

  ($resolution_link) = $match =~ /(http:.*(\d+))/;

Bug found in the build. Please check check https://web.com/fluent/x/JIOUAQ for more details.

Bug found in your build please check http://web.com/fixedbuglink/CR2745 for the fix

............


Replies are listed 'Best First'.
Re: Getting only the link in a line by Your Mother (Archbishop) on Nov 04, 2010 at 02:08 UTC
You might also look at URI::Find and friends.
Re: Getting only the link in a line by Utilitarian (Vicar) on Nov 03, 2010 at 21:56 UTC
With the `.*` you're potentially matching everything, including spaces, if there are numbers in the line after the URL. The first line doesn't have decimal digits (`\d`) at the end of the URL, and you need to allow for the possibility of SSL (https). There are better (more precise) solutions, but try `($resolution_link) = $match =~ {(https?://\S+)};` `print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."`
Re^2: Getting only the link in a line by iphone (Beadle) on Nov 03, 2010 at 22:39 UTC
There seems to be a syntax error in the code you provided. I used the code below and it worked, but now the problem is that there is a "." (dot) at the end of some links. I need to remove that. How do I do that? `($resolution_link) = $match =~ /(https?:\/\/(\S+))/;`
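The syntax error is presumably the missing `m` in front of the braces: Perl only lets you drop the `m` when the delimiters are slashes. A minimal sketch of the brace-delimited form, using the two lines from the question; `extract_link` is a made-up helper name, not something from the thread:

```perl
use strict;
use warnings;

# Sample lines copied from the question.
my @lines = (
    'Bug found in the build. Please check check https://web.com/fluent/x/JIOUAQ for more details.',
    'Bug found in your build please check http://web.com/fixedbuglink/CR2745 for the fix',
);

# extract_link is a hypothetical helper name used only for this sketch.
sub extract_link {
    my ($line) = @_;
    # m{...} lets the slashes in "://" appear unescaped.
    my ($link) = $line =~ m{(https?://\S+)};
    return $link;    # undef when the line contains no link
}

print extract_link($_), "\n" for @lines;
```

This matches the same text as the `/(https?:\/\/(\S+))/` version; the braces only save the backslashes. It still keeps a "." that directly follows a link.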
Re^3: Getting only the link in a line by Utilitarian (Vicar) on Nov 03, 2010 at 22:42 UTC
That's why I said there were better solutions available ;). This was a quick hack to solve a specific case. One solution would be to remove punctuation at the end of the link in a second pass. `print "Good ",qw(night morning afternoon evening)[(localtime)[2]/6]," fellow monks."`
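A rough sketch of what such a second pass might look like. The helper name `strip_trailing_punct`, the set of punctuation characters, and the sample log line are all assumptions made for illustration, not part of the thread:

```perl
use strict;
use warnings;

# strip_trailing_punct is a hypothetical helper name used only for this sketch.
sub strip_trailing_punct {
    my ($link) = @_;
    # Second pass: drop sentence punctuation stuck to the end of the link.
    # The character set is an assumption; adjust it to suit your logs.
    $link =~ s/[.,;:!?]+\z//;
    return $link;
}

# Made-up log line (not from the thread) with a full stop right after the URL.
my $line = 'Please check http://web.com/fixedbuglink/CR2745.';

# First pass: grab the run of non-whitespace that starts with http:// or https://.
my ($raw) = $line =~ m{(https?://\S+)};

print strip_trailing_punct($raw), "\n";
```

Dots inside the link (as in `web.com`) are untouched because the substitution is anchored to the end of the string with `\z`.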
Re: Getting only the link in a line by kcott (Archbishop) on Nov 04, 2010 at 01:11 UTC
This regex handles the data you've provided and the case where the URL has a terminal "`.`": `/(https?:\S+?)[.]?\s/` -- Ken
Re^2: Getting only the link in a line by iphone (Beadle) on Nov 04, 2010 at 01:42 UTC
Thanks, it worked. Can you please explain how it took care of the "." (dot)? Thanks
Re^3: Getting only the link in a line by kcott (Archbishop) on Nov 04, 2010 at 02:14 UTC
I'll give a quick breakdown here. Refer to perlre for details (I've indicated the appropriate sections). You were originally missing your first line because it was `https` and you'd only specified `http`. The `s?` means zero or one 's' (see Quantifiers). `\S+?` says match non-whitespace non-greedily, which stops it capturing the terminal period if it exists (further down under Quantifiers). `[.]` stops '.' being a special (match anything) character by placing it in a character class (see Metacharacters). `[.]?` just says zero or one non-special '.' (that's Quantifiers again). `\s` at the end anchors the URL (and optional '.') to the whitespace that follows it (see Character Classes and other Special Escapes). -- Ken
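To see the pieces working together, here is a small sketch. Note one change that is not from the thread: the final `\s` is widened to `(?:\s|\z)` so that a link at the very end of a chomped line still matches. The helper name `link_of` and the second and third sample lines are made up for illustration:

```perl
use strict;
use warnings;

# link_of is a hypothetical helper name used only for this sketch.
sub link_of {
    my ($line) = @_;
    # (https?:\S+?)  capture the link, non-greedily
    # [.]?           an optional literal dot, kept outside the capture
    # (?:\s|\z)      whitespace or end of string (the \z part is an addition)
    my ($link) = $line =~ /(https?:\S+?)[.]?(?:\s|\z)/;
    return $link;
}

my @samples = (
    'Please check https://web.com/fluent/x/JIOUAQ for more details.',  # from the question
    'See http://web.com/fixedbuglink/CR2745. Thanks',                   # made up: dot, then space
    'See http://web.com/fixedbuglink/CR2745.',                          # made up: dot at end of line
);

print link_of($_), "\n" for @samples;
```

The non-greedy `\S+?` grows one character at a time until the rest of the pattern can match, so the dots inside `web.com` are skipped over (a letter follows them, not whitespace) and only a dot sitting right before whitespace or the end of the string is left out of the capture.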
Re^4: Getting only the link in a line by iphone (Beadle) on Nov 04, 2010 at 06:05 UTC
Re^5: Getting only the link in a line by kcott (Archbishop) on Nov 04, 2010 at 06:40 UTC
Re: Getting only the link in a line by poolpi (Hermit) on Nov 04, 2010 at 10:20 UTC
See Regexp::Common::URI::http

  #!/usr/bin/perl
  use strict;
  use warnings;
  use Regexp::Common qw /URI/;

  while(<DATA>){
      chomp;
      /$RE{URI}{HTTP}{-keep}/
          and print "Contains an HTTP URI.\nHost=$3\n";
  }
  __DATA__
  http://192.168.1.10/index.html
  http://www.delicious.com/search?p=perl&chk=&context=main|&fr=del_icio_us&lc=

Output:

  Contains an HTTP URI.
  Host=192.168.1.10
  Contains an HTTP URI.
  Host=www.delicious.com

hth, PooLpi

'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
