rupesh has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!
Here's the picture: I have a very large file with a lot of file names in it. Like this:
cln2-test.pl labsearch.txt listprojects.pl server1-test.pl (17).SWW (18).SWW (19).SWW (2).SWW (20).SWW (21).SWW (22).SWW (23).SWW (24).SWW (25).SWW (26).SWW (3).SWW (4).SWW (5).SWW (6).SWW (7).SWW (8).SWW (9).SWW 2689previous.gif 2690next.gif a a.out bbb book_details c1 c2 c3 charcut check_dupvar club common cpgm_check ctest ctest1 ctest2 activepj.pl autopro.exe AutoPro.pl getbylabel.exe getbylabel.pl getbylabel2.exe getbylabel2.pl . . . . . . ..
As you can see, most of the files end with extensions. What I want is to produce a report like this:
Extension:   No. of files
.txt         3
.pl          10
.exe         17
.            4
. . .
The file extensions are not pre-determined. They are updated in the log as they are read from the file.
Thanks for your time

we're born with our eyes closed and our mouths wide open, and we spend our entire life trying to rectify that mistake of nature. - anonymous.

Replies are listed 'Best First'.
Re: Hash ref and file extensions
by valdez (Monsignor) on Aug 16, 2003 at 11:52 UTC

    This code should do what you need:

    open(F, '<', './files.txt') or die "open: $!";
    while ($filename = <F>) {
        chomp($filename);
        $dot = rindex($filename, '.');
        if ($dot > -1) {
            $extension = substr($filename, $dot+1);
            $extensions{$extension}++;
        }
        else {
            $extensions{'_without_extension'}++;
        }
    }
    close(F);

    while (($key, $value) = each %extensions) {
        print "$key -> $value\n";
    }

    Ciao, Valerio

    update: thanks liz!

      One small nit. I would replace this code:
      if ($dot > -1) {
          $extension = substr($filename, $dot+1);
          $extensions{$extension}++;
      }
      else {
          $extensions{'_without_extension'}++;
      }
      by:
      $extensions{$dot == -1 ? '' : substr( $filename,$dot )}++;
      for two reasons:
      1. it's more compact
      2. it allows you to differentiate between filenames without extension (no . found) and with an empty extension (a . at the end).
      Liz
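
To illustrate Liz's second point, here is a minimal sketch of how the one-liner keys the hash. The helper name ext_key and the sample filenames are made up for this illustration; they are not part of the original reply.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the hash key Liz's one-liner would use.
sub ext_key {
    my ($filename) = @_;
    my $dot = rindex($filename, '.');
    # '' means no dot at all; otherwise keep everything from the last dot on.
    return $dot == -1 ? '' : substr($filename, $dot);
}

my %extensions;
$extensions{ ext_key($_) }++ for qw(README notes. a.pl b.pl);

# Keys are now '' (no extension), '.' (empty extension) and '.pl'.
printf "%-5s -> %d\n", "'$_'", $extensions{$_} for sort keys %extensions;
```

Because the dot is kept as part of the key, a file such as "notes." is counted under '.' instead of being lumped together with files that have no dot at all.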
Re: Hash ref and file extensions
by submersible_toaster (Chaplain) on Aug 17, 2003 at 01:32 UTC

    Depending on how much of the work you really need to do yourself, File::Basename is an option.

    use File::Basename;
    use strict;

    my %exts;
    open FILE, '<', '/some/big/file' or die "Screaming $!";
    while ( <FILE> ) {
        chomp;
        my $file = $_;
        # The '\..*' pattern captures everything from the first dot onward.
        my ($name, $path, $suffix) = fileparse( $file, '\..*' );
        $exts{$suffix}++;
    }
    # Dump your hash here.
    Of course that is a regex being performed by File::Basename, YMMV.

    Update: Totally untested , looks OK to me.
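
Since fileparse treats each suffix argument as a regex anchored at the end of the name, the choice of pattern decides what counts as "the extension". The sketch below compares the '\..*' pattern from the reply with a last-dot-only alternative; the helper name suffixes, the second pattern, and the sample filenames are assumptions added for illustration.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: returns the suffix found by two different patterns.
sub suffixes {
    my ($file) = @_;
    # Pattern from the reply: everything from the first dot onward.
    my (undef, undef, $from_first_dot) = fileparse($file, qr/\..*/);
    # Alternative pattern: only the part after the last dot.
    my (undef, undef, $from_last_dot)  = fileparse($file, qr/\.[^.]*/);
    return ($from_first_dot, $from_last_dot);
}

for my $file (qw(getbylabel2.pl archive.tar.gz ctest)) {
    my ($first, $last) = suffixes($file);
    print "$file: first-dot='$first' last-dot='$last'\n";
}
```

For names with a single dot the two patterns agree; they differ only for names such as archive.tar.gz, so which one to use depends on how multi-dot names should be reported.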


    I can't believe it's not psellchecked
Re: Hash ref and file extensions
by demerphq (Chancellor) on Aug 17, 2003 at 09:40 UTC

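    my %ext;
    while (<>) {
        next unless /\S/;    # skip blank lines
        chomp;               # otherwise [^.]+ would swallow the trailing newline
        /(\.[^.]+)?$/;       # $1 is the last ".ext", or undef if there is none
        $ext{lc($1||"")}++;
    }
    printf "%-6s %d\n", $_||'<NONE>', $ext{$_} for sort keys %ext;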

    Remove the lc() if you are on a case sensitive file system.


    ---
    demerphq

    <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...