Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Forgive me, my regex knowledge has greatly faded over these last few years (and I didn't really know anything about them in the first place ;). What I'm trying to do is match filenames with a certain extension (say, .html for example). I also want to make sure there are no nasty character combinations (.. being the obvious one) to worry about. I have the luxury of being strict in this case. So the data I want to match will be in the format:

filename.html

The filename will contain alphanumeric (a-zA-Z0-9) characters, underscore, hyphen, and any other obviously valid filename characters. The ".html" extension in this case can be literal. So this is what I have:

/^([a-zA-Z\d_-]+)[.html]$/

Any input is greatly appreciated (:

Re: Simple filename regex help
by Tomte (Priest) on Jun 21, 2003 at 10:38 UTC

    /^([a-zA-Z\d_-]+)[.html]$/

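    [.html] is a character class, so it matches exactly one character out of ., h, t, m and l rather than the whole file extension. Because that single character must be followed directly by the end of the string ($), a name like index.html can never match, while something like indexh does.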

    /^([a-z0-9_-]+)\.html$/i

    should do the trick. Note the i (ignore case) modifier: it keeps the character class shorter and more readable IMVHO, and it also lets an upper-case extension such as .HTML match.

    Edit: added \ to the regexp.

    regards,
    tomte


    Hlade's Law:

    If you have a difficult task, give it to a lazy person --
    they will find an easier way to do it.

      Thanks for the explanation :)
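
To make the difference between the two patterns above concrete, here is a minimal sketch that runs a few sample names through both. The helper names matches_original and matches_fixed and the sample filenames are made up for this illustration; only the two regexes come from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper names, chosen for this illustration only.

# Original pattern: [.html] matches exactly ONE character out of . h t m l
sub matches_original {
    my ($name) = @_;
    return $name =~ /^([a-zA-Z\d_-]+)[.html]$/ ? 1 : 0;
}

# Corrected pattern: an escaped dot followed by the literal string "html"
sub matches_fixed {
    my ($name) = @_;
    return $name =~ /^([a-z0-9_-]+)\.html$/i ? 1 : 0;
}

# Sample names are assumptions, not taken from the original question.
for my $name ('index.html', 'INDEX.HTML', 'indexh', '../index.html', 'a..html') {
    printf "%-15s original=%d fixed=%d\n",
        $name, matches_original($name), matches_fixed($name);
}
```

Since the dot is not part of the leading character class, names containing .. or a directory separator are rejected by the corrected pattern. If a trailing newline must also be rejected, \z may be a stricter choice than $, which still matches just before a final newline.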

Re: Simple filename regex help
by little (Curate) on Jun 21, 2003 at 10:35 UTC
    /^([a-zA-Z\d_-]+)\.html$/

    An unescaped dot matches any single character (except a newline), not just a literal dot, so it has to be escaped as \. to match only the dot before the extension.

    Have a nice day
    All decision is left to your taste

      That's what I was looking for, thanks :)
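
A short sketch of why the backslash matters, assuming two hypothetical helpers (dot_unescaped and dot_escaped) that differ only in whether the dot is escaped. The sample strings are invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; they differ only in the dot.

# Bare dot: matches any single character except a newline
sub dot_unescaped {
    my ($name) = @_;
    return $name =~ /^([a-zA-Z\d_-]+).html$/ ? 1 : 0;
}

# Escaped dot: matches only a literal "."
sub dot_escaped {
    my ($name) = @_;
    return $name =~ /^([a-zA-Z\d_-]+)\.html$/ ? 1 : 0;
}

# Invented sample inputs: only the first one is a proper .html filename.
for my $name ('index.html', 'indexXhtml', 'index/html') {
    printf "%-12s unescaped=%d escaped=%d\n",
        $name, dot_unescaped($name), dot_escaped($name);
}
```

With the bare dot, strings such as indexXhtml or index/html slip through, which is exactly what a strict filename check is meant to prevent.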

Re: Simple filename regex help
by vek (Prior) on Jun 21, 2003 at 15:41 UTC

    You can also use File::Basename to find the filename, directory, extension:

    use File::Basename;
    my ($name, $dir, $ext) = fileparse($filepath, '\..*');

    -- vek --
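
As a rough illustration of the File::Basename suggestion, the sketch below wraps fileparse in a hypothetical split_path helper and tries it on two invented paths, assuming Unix-style paths. The '\..*' suffix pattern is the one from the reply above; the qr/\.[^.]*/ variant is an assumption added here to show how the choice of pattern changes what counts as the extension.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical wrapper for illustration: returns (name, dir, suffix).
sub split_path {
    my ($path, $suffix_re) = @_;
    my ($name, $dir, $suffix) = fileparse($path, $suffix_re);
    return ($name, $dir, $suffix);
}

# Invented sample paths, not taken from the thread.
for my $path ('/var/www/docs/index.html', 'archive.tar.gz') {
    # '\..*' treats everything from the FIRST dot onward as the suffix
    my ($name, $dir, $suffix) = split_path($path, '\..*');
    print "first dot: name=$name dir=$dir suffix=$suffix\n";

    # qr/\.[^.]*/ (an assumption, not from the reply) takes only the LAST dot part
    ($name, $dir, $suffix) = split_path($path, qr/\.[^.]*/);
    print "last dot:  name=$name dir=$dir suffix=$suffix\n";
}
```

Note that fileparse only splits the path; it does not check which characters the name contains, so a strict validation would still need a regex test on the resulting name.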