in reply to Creating a regex to match part of a URL's domain name (was: pattern matching)

Well, putting aside your desire to match the "important" part of a domain name, you probably want to use a regex that says "match and save a set of non-. characters that are followed by a ., then non-. and non-/ characters, and then either a / or the end of the string".

Basically, you want to ensure you're a) only looking at the domain name, and b) getting the penultimate .-separated sequence.

It would probably be more intuitive to use two split()s:
($domain) = split '/', $string;
$wanted = (split /\./, $domain)[-2];

# or
$wanted = (split /\./, (split '/', $string)[0])[-2];

Here's the regex approach:
($wanted) = $string =~ m{
  ( [^.]+ )     # save the non-. sequence to $1
  \.            # .
  [^./]+        # the final non-. non-/ sequence
  (?: / | $)    # / or the end of the string
}x;
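
As a quick sanity check, here is a minimal sketch that runs both approaches over a couple of sample strings. The helper names via_split and via_regex and the sample strings are hypothetical, made up for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper names; the sample strings are illustrative only.
sub via_split {
    my ($string) = @_;
    my ($domain) = split '/', $string;     # everything before the first /
    return (split /\./, $domain)[-2];      # penultimate .-separated piece
}

sub via_regex {
    my ($string) = @_;
    my ($wanted) = $string =~ m{
        ( [^.]+ )     # capture the non-. sequence
        \.            # a literal .
        [^./]+        # the final non-. non-/ sequence
        (?: / | $)    # / or the end of the string
    }x;
    return $wanted;
}

for my $string ('www.google.com', 'www.google.com/search/index.html') {
    printf "%-35s split: %s  regex: %s\n",
        $string, via_split($string), via_regex($string);
}
```

Both helpers should report the same piece for these inputs. Neither one checks that the input actually looks like a domain name, so treat this as a starting point rather than a validator.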


japhy -- Perl and Regex Hacker

Re: Re: pattern matching
by strfry() (Monk) on Jun 06, 2001 at 19:50 UTC
    hmm i used part of your code in a subroutine, and it's giving me the error "Use of uninitialized value at ./index.cgi line 29."
    here's the function:
    sub getd {
        my $string = @_;
        my $wanted;
        my $domain;
        ($domain) = split '/', $string;
        $wanted = (split /\./, $domain)[-2];
        return $wanted;
    }

    my $variable = "www.google.com";
    print &getd($variable); # this is line 29.

    any ideas?

    strfry()
      You're not doing any rudimentary data-checking, or you'd see that my $string = @_ was assigning a number to your variable. An array in scalar context yields its element count, so $string ends up as 1 here.
      # try one of these:
      my ($string) = @_;
      my $string = shift;
      my $string = $_[0];
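
Putting that fix back into the subroutine, a minimal sketch might look like the following. The extra return statements are an assumption about what "rudimentary data-checking" could mean here, not part of the original function.

```perl
use strict;
use warnings;

sub getd {
    my ($string) = @_;                     # list assignment takes the first argument
    return unless defined $string and length $string;   # assumed data-checking

    my ($domain) = split '/', $string;     # drop anything after the first /
    return unless defined $domain;

    my @parts = split /\./, $domain;
    return unless @parts >= 2;             # no penultimate piece to return
    return $parts[-2];
}

my $variable = "www.google.com";
print getd($variable), "\n";               # prints "google"
```

With the parentheses around $string, the assignment happens in list context, so the first element of @_ is copied rather than the element count.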


      japhy -- Perl and Regex Hacker
        aha! but what if i want to use
        my $variable = "http://www.google.com"; ? does this subroutine not handle "/"'s?
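
For what it's worth, with "http://www.google.com" the first split on '/' yields "http:", which contains no dots, so the [-2] element is undefined. One possible workaround, sketched below, is to strip a leading scheme before splitting. The name getd_url and the scheme-stripping substitution are hypothetical additions for illustration, not something from the replies above.

```perl
use strict;
use warnings;

# Hypothetical variant of getd; the scheme-stripping step is an assumption.
sub getd_url {
    my ($string) = @_;
    return unless defined $string;

    $string =~ s{^[a-z][a-z0-9+.-]*://}{}i;   # drop a leading scheme such as http://
    my ($domain) = split '/', $string;        # host is everything before the first /
    return unless defined $domain;

    my @parts = split /\./, $domain;
    return unless @parts >= 2;                # no penultimate piece to return
    return $parts[-2];
}

print getd_url("http://www.google.com"), "\n";          # prints "google"
print getd_url("http://www.google.com/search"), "\n";   # prints "google"
```

This sketch does not handle ports, userinfo, or multi-part suffixes such as .co.uk. For anything beyond simple cases, a dedicated URL-parsing module would likely be a safer choice.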
Re: Re: pattern matching
by strfry() (Monk) on Jun 06, 2001 at 18:48 UTC
    yes yes yes yes! that's it! thank you! (:
    now all i have to do is fiddle with it until i understand exactly what's taking place hehe
    gracias

    strfry()