BalochDude has asked for the wisdom of the Perl Monks concerning the following question:

Q: How do I remove file extensions from an array of file names?
There could be files with any extension (.html, .jpg, .psd, .txt, ...).

Description of the problem:
E.g.
Suppose I have a list of all the files in the following array:
@files = ("one.zip","twotwo.doc","three3.ppt");
and I need the following:
@onlyNames = ("one","twotwo","three3");
@onlyExt = ("zip","doc","ppt");

jdporter - added code formatting

Re: Removing File Extensions
by dragonchild (Archbishop) on May 01, 2004 at 19:31 UTC
    This is a solved problem, so use the provided solution. This should work on any system. File::Basename is part of the core.
    use File::Basename;

    my @files = ("one.zip","twotwo.doc","three3.ppt");
    my @onlyNames;
    my @onlyExt;
    foreach my $file (@files) {
        my ($name, $path, $suffix) = fileparse( $file, qr/\.[^.]+$/ );
        $suffix =~ s/\.//;
        push @onlyNames, $name;
        push @onlyExt, $suffix;
    }

    Update: Fixed typo. (Thanks, Not_a_Number!)

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

      I applaud your use of a prepackaged module, but I personally think it's a bit overkill in this situation.

      my @files = qw( foo.zip twotwo.doc three.one.four.ppt four );
      my ( @only_names, @only_exts );
      foreach ( @files ) {
          if ( m/^ (.*?) (?:\.([^.]+))? $/x ) {
              push @only_names, $1;
              push @only_exts, $2;   # might be undef
          }
          else {
              warn "couldn't parse '$_'";
          }
      }

      # only for debugging output
      foreach ( @only_exts ) { $_ = 'undef' unless defined $_ }
      print "names = [ @only_names ]\n", "exts = [ @only_exts ]\n";

      Which produced:

      names = [ foo twotwo three.one.four four ]
      exts = [ zip doc ppt undef ]

      I guess I feel that, since your solution even with a standard module also requires crafting a regex, I'd just as soon write a regex for the whole thing. *shrug*

      My main concerns with my solution would be whether the regex is efficient enough (the non-greedy filename part worries me a bit) and the usual "well, you might be able to craft such a regex, but the average programmer..." complaints.
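If the cost of the non-greedy match is a concern, one possible sketch avoids the regex engine entirely by finding the last dot with rindex. The helper name split_ext is hypothetical. The sketch assumes bare file names with no directory part, and it treats a leading dot (as in .bashrc) as part of the name, which is a choice of this sketch and differs from what the regex above does.

```perl
use strict;
use warnings;

# Hypothetical helper: split a bare file name at its last dot.
# Returns (name, ext); ext is undef when there is no extension.
sub split_ext {
    my ($file) = @_;
    my $dot = rindex $file, '.';

    # No dot, a leading dot (".bashrc"), or a trailing dot ("foo.")
    # is treated as "no extension" (an assumption of this sketch).
    return ($file, undef) if $dot <= 0 || $dot == length($file) - 1;

    return (substr($file, 0, $dot), substr($file, $dot + 1));
}

my @files = qw( foo.zip twotwo.doc three.one.four.ppt four );
my (@only_names, @only_exts);
for my $file (@files) {
    my ($name, $ext) = split_ext($file);
    push @only_names, $name;
    push @only_exts,  $ext;    # might be undef
}
```

Whether this is measurably faster than the regex would need to be checked with the core Benchmark module on real data.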

        The point here isn't that either solution would require a regex or not. The goal is to teach someone that the issues one generally runs into have been run into many times before. There is often a wheel that's round enough out there.


Re: Removing File Extensions
by pizza_milkshake (Monk) on May 01, 2004 at 19:47 UTC
    #!perl -wl
    use strict;
    my (@f, @e, %f, %e);
    map { $f{$1} = $e{$2} = !undef if /(.*)\.(.*)/ } glob("*");
    @f = keys %f;
    @e = keys %e;
    print "filenames: @f";
    print "extensions: @e";

    perl -e'$_="nwdd\x7F^n\x7Flm{{llql0}qs\x14";s/./chr(ord$&^30)/ge;print'

Re: Removing File Extensions
by Anonymous Monk on May 01, 2004 at 18:58 UTC
    my @files = qw( one.zip twotwo.doc three3.ppt );
    my (@only_names, @only_ext);
    for (@files) {
        my ($name, $ext) = $_ =~ m!\A(.*)\.([^.]+)\z!s;
        push( @only_names, $name );
        push( @only_ext, $ext );
    }
      This fails if the file has no extension. Probably a simple test for a match failure would fix this:
      my($name,$ext);
      ($name, $ext) = m!\A(.*)\.([^.]+)\z!s or $name = $_,$ext='';
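A minimal sketch of that fallback applied to the whole list. The array names follow the post above; the extra file name README is an assumption, added only to exercise the no-extension case.

```perl
use strict;
use warnings;

my @files = qw( one.zip twotwo.doc three3.ppt README );
my (@only_names, @only_ext);

for my $file (@files) {
    my ($name, $ext) = $file =~ m!\A(.*)\.([^.]+)\z!s;

    # On a failed match both variables are undef, so fall back to
    # the whole string as the name and an empty extension.
    ($name, $ext) = ($file, '') unless defined $name;

    push @only_names, $name;
    push @only_ext,   $ext;
}

print "names = [ @only_names ]\n";
print "exts  = [ @only_ext ]\n";
```

Using an empty string rather than undef for a missing extension keeps the later print free of "uninitialized value" warnings.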
      Q1: If I only want to get the name, is the following code correct?
      Replace
      my ($name, $ext) = $_ =~ m!\A(.*)\.([^.]+)\z!s;
      with
      my ($name) = $_ =~ m!\A(.*)\z!s;


      Q2: Could you please explain this line?
      ($name, $ext) = $_ =~ m!\A(.*)\.([^.]+)\z!s;
      Not in detail; I just want a rough idea.
      I am really impressed with how quickly I got a reply.
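As a rough answer to Q2, the same pattern can be spelled out with the /x modifier so that each piece carries a comment. This is an annotated sketch of the regex quoted above, not a different technique; the sample file name is an assumption. On Q1: a pattern of just \A(.*)\z captures the whole string, extension included, so it would not strip anything.

```perl
use strict;
use warnings;

my $file = 'three3.ppt';

my ($name, $ext) = $file =~ m!
    \A          # start of the string
    (.*)        # capture 1: as much as possible (greedy), so it
                #   runs up to the LAST dot
    \.          # a literal dot
    ([^.]+)     # capture 2: one or more non-dot characters
    \z          # end of the string
!xs;            # /x allows the comments; /s lets . match newlines

# In list context the match returns its captures, so the two
# variables receive capture 1 and capture 2 respectively.
print "$name / $ext\n";
```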

      jdporter - added code formatting

Re: Removing File Extensions
by jdporter (Paladin) on May 03, 2004 at 03:53 UTC
    Assuming all filenames in the list have extensions, and that it isn't necessary to preserve the order of items:
    my %h = map /(.*)\.(.*)/, @files;
    @onlyNames = keys %h;
    @onlyExt = @h{@onlyNames};
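If order does matter, or if two files share a base name (which a hash would collapse into a single key), one possible order-preserving sketch collects the pairs in an array instead. It still assumes every name contains a dot; the duplicate base name in the sample data is an assumption added for illustration.

```perl
use strict;
use warnings;

my @files = qw( one.zip twotwo.doc three3.ppt one.txt );

# Each match yields a (name, ext) pair; wrapping it in an array ref
# keeps the pairs in input order and lets duplicates survive.
my @pairs = map { [ /\A(.*)\.([^.]+)\z/s ] } @files;

my @onlyNames = map { $_->[0] } @pairs;
my @onlyExt   = map { $_->[1] } @pairs;

print "names = [ @onlyNames ]\n";
print "exts  = [ @onlyExt ]\n";
```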
Re: Removing File Extensions
by Wassercrats (Initiate) on May 02, 2004 at 00:25 UTC
    @files = ("one.zip","twotwo.doc","three3.ppt");
    map {$_ =~ /(.*)\./; push (@onlyNames, $1); push (@onlyExt, $');} @files;
      The above also fails if there's no extension.

      This works:

      @files = ("one.zip","twotwo.doc","three3.ppt","testfour");
      map {$_ =~ /([^.]*)\.?/; push (@onlyNames, $1); push (@onlyExt, $');} @files;
        This works:

        @files = ("one.zip","twotwo.doc","three3.ppt","testfour");
        map {$_ =~ /([^.]*)\.?/; push (@onlyNames, $1); push (@onlyExt, $');} @files;

        No, it doesn't always work. It fails if the filename is part of a complete path.

        Consider the following input, which will cause a complete path instead of a filename to be pushed into your @onlyNames array:

        /usr/bin/perl
        /home/users/d/davido/text.txt
        C:\Perl\scripts\mytest.pl

        In each of those cases, you'll capture into @onlyNames the following:

        /usr/bin/perl
        /home/users/d/davido/text
        C:\Perl\scripts\mytest

        It gets even worse if one of the directory names in the path contains a dot (.).

        Consider what happens if the input looks like this:

        C:\Perl\scripts.old\mytest.pl

        In that case, @onlyNames will contain:

        C:\Perl\scripts

        And clearly that's not a filename.
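For comparison, here is a sketch of how File::Basename (suggested earlier in the thread) handles these inputs. fileparse and fileparse_set_fstype are functions exported by that core module; setting the path syntax explicitly for each input is an assumption made so that the example behaves the same on any platform.

```perl
use strict;
use warnings;
use File::Basename qw( fileparse fileparse_set_fstype );

# Each entry pairs a path with the path syntax it is written in.
my @inputs = (
    [ 'Unix',    '/usr/bin/perl' ],
    [ 'Unix',    '/home/users/d/davido/text.txt' ],
    [ 'MSWin32', 'C:\Perl\scripts.old\mytest.pl' ],
);

my (@onlyNames, @onlyExt);
for my $input (@inputs) {
    my ($fstype, $path) = @$input;
    fileparse_set_fstype($fstype);

    # The suffix pattern strips only the final ".ext" of the base name;
    # the directory part is returned separately and ignored here.
    my ($name, $dir, $suffix) = fileparse($path, qr/\.[^.]+$/);
    $suffix =~ s/^\.//;

    push @onlyNames, $name;
    push @onlyExt,   $suffix;
}

print "names = [ @onlyNames ]\n";
print "exts  = [ @onlyExt ]\n";
```

Because the directory is split off before the suffix pattern is applied, a dot inside a directory name such as scripts.old no longer affects the result.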

        Update: Just thought of another situation that may reveal a bug: What if the filename has multiple extensions? In this case, it would be up to the OP to define what constitutes an extension, and what constitutes a filename. But by way of example, your solution would turn:

        bigfile.tar.gz

        ...into...

        bigfile

        The danger here is that while .gz represents the extension, .tar represents a second layer of 'extension'. But the name of the .gz file really is bigfile.tar, since we're dealing with multiple layers of processing on the said file.
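The two readings of "extension" can be sketched as two patterns. Which one is right depends on what the OP wants, so neither is presented here as the answer. The variable names are hypothetical.

```perl
use strict;
use warnings;

my $file = 'bigfile.tar.gz';

# Policy A: the extension is whatever follows the LAST dot.
my ($name_last, $ext_last) = $file =~ /\A(.*)\.([^.]+)\z/s;

# Policy B: the extension is everything after the FIRST dot.
my ($name_first, $ext_first) = $file =~ /\A([^.]*)\.(.*)\z/s;

print "last dot:  $name_last + $ext_last\n";     # bigfile.tar + gz
print "first dot: $name_first + $ext_first\n";   # bigfile + tar.gz
```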


        Dave
