I have what's probably a simple question for my fellow monks, but regular expressions are one of my weaknesses; I just can't wrap my head around anything beyond the basics.
my $var =~ m/\w/i;
That brings me to my problem. I need a rather complicated regex: I need to be able to extract a domain name from a string, which could be anything from a full URL:
http://www.perlmonks.org/?node=Seekers%20of%20Perl%20Wisdom
or an email address, or even a bare string:
www.sub.sub2.domain.com
domain.com
ftp.domain.co.uk
adsl-44-33-22-11.dsl.bcvloh.sbcglobal.net
and I would actually like it to return two results, provided the entered string is more than just domain.com. Using the last line as my example, I would need the two results to be:
bcvloh.sbcglobal.net sbcglobal.net
Also, I'd need to make sure that if an international domain name or URL were given, it checked for it and returned:
some.domain.com.au domain.com.au
Again, that is provided the string was more than just domain.com.br. If only the bare minimum was entered, it should simply return that:
domain.com domain.co.uk domain.fm domain.name ..etc, etc..
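To make the behaviour I'm after concrete, here is a rough sketch of it, working on a host name that has already been isolated. The name domain_pair and the tiny %two_label_suffix table are made up for illustration; the table is nowhere near complete, and a real solution would need a proper list of suffixes such as co.uk and com.au.

```perl
use strict;
use warnings;

# Hypothetical and deliberately incomplete table of two-label suffixes.
my %two_label_suffix = map { $_ => 1 } qw(co.uk com.au com.br);

# Returns (subdomain, domain) for a bare host name, or just (domain)
# when the host has no labels in front of the domain.
sub domain_pair {
    my ($host) = @_;
    my @labels = split /\./, lc $host;

    # The domain is the suffix plus one label: two labels for .com,
    # three for suffixes like .co.uk.
    my $keep = 2;
    $keep = 3 if @labels >= 2 && $two_label_suffix{"$labels[-2].$labels[-1]"};
    $keep = @labels if $keep > @labels;

    my $domain = join '.', @labels[ -$keep .. -1 ];
    return ($domain) if @labels == $keep;

    # One extra label in front of the domain gives the first result.
    my $sub = join '.', @labels[ -($keep + 1) .. -1 ];
    return ($sub, $domain);
}

print join(' ', domain_pair($_)), "\n"
    for qw(adsl-44-33-22-11.dsl.bcvloh.sbcglobal.net some.domain.com.au domain.co.uk);
```

The three sample hosts should come back as the pairs listed above, with domain.co.uk returned on its own.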
Now I've searched and read a couple of nodes here that are very similar to this question, but they aren't quite enough for me to work with to achieve my goal. One splits up a domain name such as domain.com to extract domain, and the other focuses on http:// URLs only. I've also searched Google, and the results I've found again don't give me quite enough to work with, as I am rather dense when it comes to regex.
Many Thanks Fellow Monks,
jnbek
=== Update ===
Looks like I have actually been looking at this from the wrong angle, and I have managed to make myself feel like the silly n00b that I am. I only need a regex to strip the extra characters off the front and back, basically keeping what sits between the /'s: strip off http://, ftp://, etc., then strip everything from the first /, ? or # on the right end, then use the pop() function a couple of times with a join to get the domain name. So, be it sub1.sub2.sub3.foo.bar.www.domain.com or domain.com, I get domain.com to work with. I've only got initial test code with the pop() usage:
my $d = "spam.yomama.www.zoelife4u.org";
my @domain = split(/\./, $d);
my $tld = pop(@domain);          # org
my $baredomain = pop(@domain);   # zoelife4u
my @result = ( $baredomain, $tld );
my $maindomain = join(".", @result);
print "End: $maindomain\n";
And I think I've found a useful regex to work with here. Based on this, anyone have any critique?
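For the stripping step itself, this is a minimal sketch of what I have in mind, not tested against anything beyond the examples in this node. The name extract_host is made up, and the patterns are my own guesses at "good enough": they assume the input is a URL with a scheme:// prefix, an email address, or a bare host name, and they do not handle things like mailto: prefixes or IPv6 literals.

```perl
use strict;
use warnings;

# Reduce a URL, an email address, or a bare string to its host name.
sub extract_host {
    my ($input) = @_;
    my $host = $input;
    $host =~ s/^\s+|\s+\z//g;               # trim surrounding whitespace
    $host =~ s{^[a-z][a-z0-9+.-]*://}{}i;   # drop a leading scheme such as http:// or ftp://
    $host =~ s{[/?#].*\z}{}s;               # drop the path, query string and fragment
    $host =~ s/^.*\@//;                     # drop user info or an email local part
    $host =~ s/:\d+\z//;                    # drop a trailing port number
    return lc $host;
}

my $host   = extract_host('http://www.perlmonks.org/?node=Seekers%20of%20Perl%20Wisdom');
my @labels = split /\./, $host;
my $tld    = pop @labels;    # org
my $name   = pop @labels;    # perlmonks
print "$name.$tld\n";        # perlmonks.org
```

Popping exactly two labels is only right for single-label endings like .com or .org; for something like ftp.domain.co.uk it would give co.uk, so the number of labels to keep still has to depend on the suffix.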
In reply to Regex for extracting a domain name from a string. by jnbek