in reply to Re: Regex to match file extension in URL
in thread Regex to match file extension in URL

Sorry if that wasn't clear. What I need is a regex that matches the extension of the remote file, whatever it is, not just .html. The $myvar was just an example.

Re: Re: Re: Regex to match file extension in URL
by Trimbach (Curate) on Sep 09, 2001 at 16:44 UTC
    If that's the case, then how about
    if ($myvar =~ m/\.([^.]+)$/) { print "Matched $1"; }
    Of course, this won't work for URLs with an implicit filename, like "http://www.yahoo.com" or "http://www.somewhere.com/home/". You'll have to catch those bad boys elsewhere in your code.

    Gary Blackburn
    Trained Killer

      Let me break this down and learn :)
      \.      # literal period
      (       # start capture
      [^.]    # any character that's not...any character?
      +       # one or more of them
      )       # stop capture
      $       # end of line
      I don't get [^.]. 'Splain? :)
        You got it right, except for the negated character class. The metacharacter "." loses its special meaning inside a character class (if it kept it, [.] would be the same thing as a bare "."), so a period inside a character class means a literal period. So the regex is actually:
        \.      # literal period
        (       # start capture
        [^.]    # any character that's not a period
        +       # one or more of them
        )       # stop capture
        $       # end of line
        This guarantees that you'll get the last chunk of non-periods at the end of the URL... but beware, because like I said, "http://www.yahoo.com" will capture "com", and "http://www.somewhere.com/home/" will capture "com/home/".
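
        One way to catch those cases might look like the sketch below. It isn't a full URL parser: url_extension is a made-up helper name, and it assumes the input is an absolute URL of the scheme://host/path form. The idea is to throw away the query string, fragment, scheme and host first, and then forbid "/" as well as "." in the capture, so only the last path segment can supply the extension.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the extension of the
# last path segment of an absolute URL, or undef if there isn't one.
sub url_extension {
    my ($url) = @_;

    # Drop any query string or fragment so their dots can't match.
    (my $path = $url) =~ s/[?#].*//s;

    # Drop the scheme and host, so "www.yahoo.com" can't supply "com".
    $path =~ s{^[a-z][a-z0-9+.-]*://[^/]*}{}i;

    # Same regex as above, but the capture may not contain a slash either,
    # so "/home/" and "/a.b/c" yield nothing.
    return $path =~ m{\.([^./]+)$} ? $1 : undef;
}

for my $url (
    "http://www.somewhere.com/index.html",
    "http://www.yahoo.com",
    "http://www.somewhere.com/home/",
) {
    my $ext = url_extension($url);
    print "$url => ", (defined $ext ? $ext : "(no extension)"), "\n";
}
```

        With this, the first URL gives "html" and the other two give no extension at all instead of "com" or "com/home/". If the URI module is available, pulling the path out with that is probably safer than the hand-rolled scheme/host stripping here.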

        Gary Blackburn
        Trained Killer