coldfingertips has asked for the wisdom of the Perl Monks concerning the following question:

I have a form that asks for the URL of a file ($url). I need to get the extension of the file and then check to see if it's acceptable by seeing if it exists in the array.

What I have so far is

    my @quicktime_ext = qw(
        .sdp .rtsp .rts .mov .qt .smi .sml .smil .avi .vfw .flc.fli .wav .bwf
        .aiff .aif .aifc .cdda .au .snd .ulw .mid .midi .smf .kar .qcp .sd2
        .amr .gsm .mpeg .meg .m2s .m1v .ma1 .m75 .m15 .mpm .mpv .mpa .3gp
        .3gpp .3g2 .3gp2 .mp4 .mpg4 .m4a .m4p .m4b .m4v .sdv .amc .swa .m3u
        .m3url .swf .fpx .fpix .dv .dif
    );
    my $ext = $url;
    $ext =~ /$(\..+)/;
    $ext = $1;

I need help with making sure this is the proper or best regex to go about this. I just need the ext, not the filename. I also don't know how to then check to see if $ext is within the array. I know I could do it by using a hash instead but that seems like overkill. Hopefully a foreach() isn't needed.

Thanks.

Replies are listed 'Best First'.
Re: file extensions
by sh1tn (Priest) on Apr 07, 2005 at 22:57 UTC
    " I know I could do it by using a hash instead but that seems like overkill." Define overkill please.
    my @quicktime_ext = qw(
        .sdp .rtsp .rts .mov .qt .smi .sml .smil .avi .vfw .flc.fli .wav .bwf
        .aiff .aif .aifc .cdda .au .snd .ulw .mid .midi .smf .kar .qcp .sd2
        .amr .gsm .mpeg .meg .m2s .m1v .ma1 .m75 .m15 .mpm .mpv .mpa .3gp
        .3gpp .3g2 .3gp2 .mp4 .mpg4 .m4a .m4p .m4b .m4v .sdv .amc .swa .m3u
        .m3url .swf .fpx .fpix .dv .dif
    );
    my %quicktime_ext = map { $_, 1 } @quicktime_ext;
    (my $ext = $url) =~ s/.+(\..+)$/$1/;
    $quicktime_ext{$ext} and print "ok"


Re: file extensions
by tlm (Prior) on Apr 08, 2005 at 00:21 UTC

    coldfingertips, my friend, you must accept File::Basename into your life:

    use File::Basename;
    #...
    (fileparse($url, @quicktime_ext))[2] or electrocute_user();
    You don't even have to download it from CPAN; it's part of the standard Perl distribution.

    the lowliest monk
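
A minimal sketch of the File::Basename approach, under a few assumptions: the extension list is shortened for illustration, and the helper name quicktime_suffix is hypothetical. As I understand the File::Basename documentation, fileparse treats each suffix as a pattern, so the sketch quotes each entry with \Q...\E to keep the leading dot literal and adds /i to accept upper-case extensions.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Shortened list for illustration; the full list is in the question.
my @quicktime_ext = qw(.mov .qt .avi .wav .mp4);

# Quote each suffix so "." matches only a literal dot, ignoring case.
my @suffix_patterns = map { qr/\Q$_\E/i } @quicktime_ext;

# Hypothetical helper: returns the matched suffix, or '' if none matched.
sub quicktime_suffix {
    my ($url) = @_;
    my (undef, undef, $suffix) = fileparse($url, @suffix_patterns);
    return $suffix;
}

print "suffix: ", quicktime_suffix('http://someserver/some/file.avi'), "\n";
```

Because the third return value is an empty string when nothing matches, it can be used directly in a boolean test, as in the one-liner above.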

Re: file extensions
by shemp (Deacon) on Apr 07, 2005 at 23:00 UTC
    Why would you think that using a hash is overkill? I think that most would agree that it is the way to go. And you can just do:
    my %quicktime_ext = map { $_ => undef } qw(...);
    so you don't really need to rework your list.
    As for using the array, you'll have to check each element until you find the one you're looking for or until you get to the end of the list. Here's a way without foreach:
    my $found = '';
    # overparenthesised for clarity and laziness :)
    map { $found || ($found = ($ext eq $_)) } @quicktime_ext;
    Of course, map is effectively the same as foreach.
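
A minimal sketch of the hash lookup, assuming a shortened extension list and two hypothetical helpers (url_extension and is_quicktime_url) that are not part of the thread. When the hash values are undef, as in the map above, the test has to use exists rather than the truth of the value.

```perl
use strict;
use warnings;

# Shortened list for illustration; the full list is in the question.
my @quicktime_ext = qw(.mov .qt .avi .wav .mp4 .3gp);

# Keys are lower-cased extensions; the values are irrelevant (undef).
my %quicktime_ext = map { ( lc($_) => undef ) } @quicktime_ext;

# Hypothetical helper: returns the lower-cased extension, dot included,
# or undef when the URL does not end in ".something".
sub url_extension {
    my ($url) = @_;
    return $url =~ m{(\.[^./]+)\z} ? lc $1 : undef;
}

# Hypothetical helper: 1 if the URL's extension is in the hash, else 0.
sub is_quicktime_url {
    my ($url) = @_;
    my $ext = url_extension($url);
    return defined $ext && exists $quicktime_ext{$ext} ? 1 : 0;
}

print is_quicktime_url('http://someserver/some/file.avi') ? "ok\n" : "rejected\n";
```

The hash is built once, so each check afterwards is a single lookup instead of a scan of the list.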
Re: file extensions
by crashtest (Curate) on Apr 08, 2005 at 00:58 UTC
    I see "search through a list to see if an item exists" and I immediately think grep:
    use strict;
    use warnings;

    my $url = "http://someserver/some/file.avi";
    my @quicktime_ext = qw(
        .sdp .rtsp .rts .mov .qt .smi .sml .smil .avi .vfw .flc.fli .wav .bwf
        .aiff .aif .aifc .cdda .au .snd .ulw .mid .midi .smf .kar .qcp .sd2
        .amr .gsm .mpeg .meg .m2s .m1v .ma1 .m75 .m15 .mpm .mpv .mpa .3gp
        .3gpp .3g2 .3gp2 .mp4 .mpg4 .m4a .m4p .m4b .m4v .sdv .amc .swa .m3u
        .m3url .swf .fpx .fpix .dv .dif
    );

    # \Q...\E quotes the extension so its leading "." matches a literal dot
    my ($ext) = grep { $url =~ /\Q$_\E$/i } @quicktime_ext
        and print "ok\n";
    print "Extension is $ext" if ($ext);
    If you are unfamiliar with grep (the Perl function, not the UNIX tool!), then by all means check the documentation. Both grep and map prove to be extremely useful once you understand them.

    In short, grep sets $_ to each element of the list in turn and executes the code in curly braces. The code in question is a simple regular expression match that asks: "Does the url end with $_?" ($_ being set to one of the extensions from your list, like ".avi"). Note that grep isn't necessarily more efficient than a foreach loop: it implicitly loops over every element of the list, even after a match has been found.
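
To make the looping behaviour concrete, here is a small sketch that counts how many elements each approach tests. It assumes a shortened extension list and a sample URL, and it compares grep with first from List::Util (a core module), which stops at the first match. The counter names are made up for the illustration.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Shortened list and sample URL, both chosen only for illustration.
my @quicktime_ext = qw(.mov .qt .avi .wav .mp4);
my $url = 'http://someserver/some/file.qt';

my ($grep_tests, $first_tests) = (0, 0);

# grep evaluates the block for every element, even after a match.
my ($by_grep) = grep  { $grep_tests++;  $url =~ /\Q$_\E\z/i } @quicktime_ext;

# first stops evaluating as soon as the block returns true.
my $by_first  = first { $first_tests++; $url =~ /\Q$_\E\z/i } @quicktime_ext;

print "grep tested $grep_tests elements, first tested $first_tests\n";
```

With ".qt" in second position, grep still tests all five entries while first stops after two. For a list this short the difference hardly matters.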
Re: file extensions
by TedPride (Priest) on Apr 08, 2005 at 03:31 UTC
    You don't need to run through an array (or hash) here. You're only matching once per script run, and an index call will work much more efficiently:
    use strict;
    use warnings;

    my $url = "http://somes.erver/some/file.avi";
    my ($ext) = $url =~ /\.(\w+)$/;
    my $quicktime_ext =
        '.sdp .rtsp .rts .mov .qt .smi .sml .smil .avi .vfw ' .
        '.flc.fli .wav .bwf .aiff .aif .aifc .cdda .au .snd ' .
        '.ulw .mid .midi .smf .kar .qcp .sd2 .amr .gsm .mpeg ' .
        '.meg .m2s .m1v .ma1 .m75 .m15 .mpm .mpv .mpa .3gp ' .
        '.3gpp .3g2 .3gp2 .mp4 .mpg4 .m4a .m4p .m4b .m4v ' .
        '.sdv .amc .swa .m3u .m3url .swf .fpx .fpix .dv .dif ';
    print 'Valid ext' if $ext && index($quicktime_ext, ".$ext ") != -1;
Re: file extensions
by holli (Abbot) on Apr 08, 2005 at 09:04 UTC
    If you don't want a hash:
    my @quicktime_ext = qw(
        sdp rtsp rts mov qt smi sml smil avi vfw flcfli wav bwf aiff aif aifc
        cdda au snd ulw mid midi smf kar qcp sd2 amr gsm mpeg meg m2s m1v ma1
        m75 m15 mpm mpv mpa 3gp 3gpp 3g2 3gp2 mp4 mpg4 m4a m4p m4b m4v sdv amc
        swa m3u m3url swf fpx fpix dv dif
    );
    my $re_quicktime = '\.(' . join ('|', @quicktime_ext) . ')$';
    $re_quicktime = qr/$re_quicktime/i;
    print "OK" if "file.wav" =~ $re_quicktime;


    holli, /regexed monk/
      Or even better:
      use Regexp::List;
      my $l = Regexp::List->new;
      my $re = $l->list2re(qw(
          sdp rtsp rts mov qt smi sml smil avi vfw flcfli wav bwf aiff aif aifc
          cdda au snd ulw mid midi smf kar qcp sd2 amr gsm mpeg meg m2s m1v ma1
          m75 m15 mpm mpv mpa 3gp 3gpp 3g2 3gp2 mp4 mpg4 m4a m4p m4b m4v sdv amc
          swa m3u m3url swf fpx fpix dv dif
      ));
      print "OK" if "file.wav" =~ $re;

      Great! Now the sed programmer breaks out in me and offers another possibility:

      ($name . "\n sdp rtsp rts mov qt smi sml smil avi whatever ") =~ /\.(\S*)\n.* \1 /
          or die "wrong extension"
Re: file extensions
by ambrus (Abbot) on Apr 08, 2005 at 08:34 UTC

    I'd do something like this:

    use List::Util;
    $name =~ /\.([^.]*)\z/
        and List::Util::first { $1 eq $_ } qw"sdp rtsp rts mov whatever"
        or die "wrong extension"
    and insert an lc if you want to accept upper-case extensions too.
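
One way the lc variant might look, as a sketch only: the helper name check_extension and the shortened list are assumptions made for the example. The captured extension is copied into a lexical and lower-cased before the comparison, so "FILE.AVI" and "file.avi" are treated alike.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper: returns the lower-cased extension if it is in the
# (shortened, illustrative) list, and undef otherwise.
sub check_extension {
    my ($name) = @_;

    # Capture whatever follows the last dot; give up if there is no dot.
    my ($ext) = $name =~ /\.([^.]*)\z/ or return;

    $ext = lc $ext;
    return first { $ext eq $_ } qw(sdp rtsp rts mov avi);
}

my $ok = check_extension('clip.MOV');
print defined $ok ? "ok\n" : "wrong extension\n";
```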

    Btw, there seems to be a space missing here in your code:

    .vfw .flc.fli .wav