in reply to Regexes and URIs

I would suggest using the URI module to do this safely. Otherwise, you run the risk of not matching the URL correctly. I don't think a hand-rolled regex like this one will work for all cases:
    ($file) = $URI =~ m{
      ^
      (?: https? | ftp ) ://    # scheme
      [^/]+                     # domain
      (?: / [^/?#]* )*          # directories
      / ( [^/?#]* )             # filename
      (?: $ | [?#] )
    }x;
I do not advocate using the regex I just made. I didn't even test it. I doubt it works reliably.
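
For comparison, here is a minimal sketch of the module-based approach. It assumes the URI module from CPAN is installed (it is not part of core Perl), and `filename_from_uri` is a hypothetical helper name chosen for illustration, not something the module provides.

```perl
use strict;
use warnings;
use URI;    # CPAN module (not core Perl); assumed to be installed

# Hypothetical helper: return the final path segment of a URL,
# or undef when the URL has no hierarchical path at all.
sub filename_from_uri {
    my ($url) = @_;
    my $uri = URI->new($url);

    # Some opaque schemes (mailto:, for example) have no path_segments method.
    return undef unless $uri->can('path_segments');

    my @segments = $uri->path_segments;    # segments come back percent-decoded
    return undef unless @segments;

    my $last = $segments[-1];
    # A segment carrying ";parameters" comes back as an array reference
    # whose first element is the segment name.
    return ref $last ? $last->[0] : $last;
}

# The query string and fragment are not part of the path.
print filename_from_uri('http://www.company.com/whatever/to/get.html?x=1#top'), "\n";    # get.html
```

A URL that ends in a slash yields an empty string here, since its last path segment is empty; whether that should count as "no filename" is a decision left to the caller.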

japhy -- Perl and Regex Hacker

Re (tilly) 2: Regexes and URIs
by tilly (Archbishop) on Mar 30, 2001 at 04:23 UTC
    Indeed. If you make the mistake of using that RE you will have broken code. It will look right to you. It will work in your tests. But if someone like me comes along who knows how to put names and passwords in URLs, it will break and I won't be happy.

    Put names and passwords in URLs? Most people don't know that you can do that. But try it:

    http://name:password@www.company.com/whatever/to/get.html
    Substitute in a name and password you use. Substitute in a protocol like ftp if that is easier. Give it a shot from your browser, LWP::Simple, etc.

    This pattern is in the spec. It will work with any tool that I have ever tried. It will work with every protocol. If it does not work with your tool, then that is a bug.
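
    To see how a parser takes such a URL apart, here is a small sketch using the URI module from CPAN (assumed to be installed; it is not core Perl). The URL is the placeholder from above, and the variable names are only illustrative.

```perl
use strict;
use warnings;
use URI;    # CPAN module (not core Perl); assumed to be installed

my $uri = URI->new('http://name:password@www.company.com/whatever/to/get.html');

my $userinfo = $uri->userinfo;              # everything before the "@" in the authority
my $host     = $uri->host;                  # the host, without the userinfo
my ($user, $password) = split /:/, $userinfo, 2;
my $file     = ($uri->path_segments)[-1];   # last path segment

print "user: $user\n";          # user: name
print "password: $password\n";  # password: password
print "host: $host\n";          # host: www.company.com
print "file: $file\n";          # file: get.html
```

    The module separates the credentials from the host, so the rest of the code never has to know the userinfo syntax exists.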

    This is why japhy would have used the standard library. He doesn't know the spec off the top of his head. He knows he doesn't. And rather than finding it and having to figure out how to do the whole thing correctly, he can just use an existing library and be confident that it will Just Work. By contrast, his off-the-cuff solution will work for 99% of the domain space, but (exactly as he predicted) will break somewhere...

    The goal is to be right with as little work as possible. So use the module.