This is a file line counter. It recursively searches for files matching given extensions and counts the total number of matched files and lines.

By default it searches using the current directory as the root and matches .pl and .pm files. If given, the first command line argument is the root folder and any following arguments are the file extensions to match (excluding the .).

Note that it will report one line too few for each file that doesn't have a terminating newline.

use warnings;
use strict;
use File::Find;
use File::Spec::Functions qw(rel2abs);

my $root = rel2abs (shift || '.');
my @extList = @ARGV;
@extList = ('pl', 'pm') if ! exists $extList[0];
my %exts;
@exts{@extList} = ();
my $lines = 0;
my $files = 0;

find (\&count, $root);
print "$lines lines, $files files";

sub count {
    my $name = $File::Find::name;

    return if -d $name;

    my ($ext) = $name =~ /\.([^.]*)$/;
    return if ! defined $ext or ! exists $exts{$ext};
    return if ! open inFile, '<', $name;
    ++$files;
    ++$lines while (<inFile>);
    close inFile;
}

Update: fixed ugly assignment per jdporter's reply.


Perl is Huffman encoded by design.

Replies are listed 'Best First'.
Re: Count file lines in a directory tree
by jdporter (Paladin) on Nov 09, 2005 at 00:31 UTC

    my @extList; push @extList, $_ for @ARGV;
    How is that different from  my @extList = @ARGV; (other than being less efficient)?

    I'd do

    use Getopt::Long;

    my @extList;
    my $root;
    GetOptions(
        'ext=s'      => \@extList,
        'root|dir=s' => \$root,
    );
    @extList = @extList
        ? map { split /,/ } @extList   # in case of --ext foo,bar
        : qw( pl pm );
    $root ||= '.';
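
A minimal, self-contained sketch of the same option handling, for trying it out without touching @ARGV. It assumes Getopt::Long's GetOptionsFromArray is available (it ships with reasonably recent versions of the module); the parse_args helper is a hypothetical name, not part of the original script.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: turn an argument list into (root, extensions).
sub parse_args {
    my @args = @_;
    my ( @ext, $root );
    GetOptionsFromArray(
        \@args,
        'ext=s'      => \@ext,     # may be repeated: --ext pl --ext pm
        'root|dir=s' => \$root,
    ) or die "bad options\n";
    @ext = @ext
        ? map { split /,/ } @ext   # also accept --ext pl,pm
        : qw( pl pm );             # defaults when no --ext was given
    $root = '.' unless defined $root;
    return ( $root, @ext );
}

my ( $root, @ext ) = parse_args( '--ext', 'pl,pm', '--ext', 't' );
print "root=$root ext=@ext\n";
```

Repeated --ext options and comma-separated lists end up in the same flat list, which is the point of the map/split step.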

    We're building the house of the future together.

      Brain fart! :( Thanks for the sanity check.


      Perl is Huffman encoded by design.
Re: Count file lines in a directory tree
by tinita (Parson) on Nov 09, 2005 at 10:01 UTC
    ++$lines while (<inFile>);
    why not just use $.?
    1 while <inFile>; $lines += $.;
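
A small sketch of the $. approach, using an in-memory filehandle so it runs without any files on disk. The count_with_dot name is made up for illustration. One thing worth keeping in mind: $. is only reset when the handle is explicitly closed, which the original script does.

```perl
use strict;
use warnings;

# Hypothetical helper: count the lines in a string using $. as the counter.
sub count_with_dot {
    my ($text) = @_;
    open my $fh, '<', \$text or die "open: $!";
    1 while <$fh>;    # read to EOF; $. now holds the number of the last line read
    my $n = $.;
    close $fh;        # an explicit close resets $. for the next file
    return $n;
}

print count_with_dot("a\nb\nc\n"), "\n";    # prints 3
```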

      Too little coffee early in the day :)


      Perl is Huffman encoded by design.
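      {
          local $/ = \131072;
          my $last;
          while (<inFile>) {
              $lines += tr/\n//;
              $last = $_;
          }
          # $_ is undef once the loop ends, so test the last block read
          $lines++ if defined $last and $last !~ /\n\z/;
      }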
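
For experimenting with the block-reading idea, here is a hedged, self-contained sketch. Setting $/ to a reference to a number makes readline return fixed-size records, and tr/\n// counts the newlines in each one. The count_by_blocks helper and the tiny block size are assumptions chosen for illustration. Since $_ (or any loop variable) is undefined after the loop has hit EOF, the sketch keeps the last block in its own variable for the trailing-newline check.

```perl
use strict;
use warnings;

# Hypothetical helper: count lines on an open handle by reading fixed-size blocks.
sub count_by_blocks {
    my ( $fh, $block_size ) = @_;
    local $/ = \$block_size;    # reference to a number: read records of that size
    my $lines = 0;
    my $last;
    while ( my $chunk = <$fh> ) {
        $lines += ( $chunk =~ tr/\n// );    # count newlines in this block
        $last = $chunk;
    }
    # A final line without a newline still counts as a line.
    $lines++ if defined $last and $last !~ /\n\z/;
    return $lines;
}

my $text = "one\ntwo\nthree";    # no trailing newline
open my $fh, '<', \$text or die "open: $!";
print count_by_blocks( $fh, 4 ), "\n";    # prints 3
```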

      Makeshifts last the longest.

Re: Count file lines in a directory tree
by Aristotle (Chancellor) on Nov 09, 2005 at 00:44 UTC

    You mean this?

    find -name '*.p[lm]' -print0 | xargs -r0 cat | wc -l

    :-)

    Update: I was using the unnecessarily longwinded find \( -name '*.pm' -o -name '*.pl' \)

    Update: to make this output the number of files, the easiest approach is a bit of Perl:

    find -name '*.p[lm]' -print0 \
      | perl -00000pe'++$a;END{print STDERR "Files: ",$a||0,", lines: "}' \
      | xargs -r0 cat | wc -l

    Makeshifts last the longest.

      That's a useless use of... xargs.

      find -name '*.p[lm]' -exec cat {} \; | wc -l
      :-)

      -sauoq
      "My two cents aren't worth a dime.";
      

        Congratulations, if you have 10,000 matching files, that line makes you a cat herder. ;-) And a herder of useless cats, to boot…

        If you’re going to do that, you can just do

        find -name '*.p[lm]' -exec wc -l {} \; | awk '{ sum+=$1 } END { print +sum }'

        Moves a bit less data around. It also makes it trivial to count the number of files, as GrandFather’s code does:

        find -name '*.p[lm]' -exec wc -l {} \; \
          | awk '{ sum+=$1; ++num } END { print sum " lines in " num " files" }'

        Nice catch on the glob. :-)

        The xargs in mine is quite justified: it keeps me from running one process per matched file. The cat is the easiest solution to the problem that, if more files are matched than fit on a single command line, xargs wc would run wc several times and report several separate totals. This way, xargs runs multiple cats (but very few in total), while wc always runs exactly once.

        Makeshifts last the longest.

Re: Count file lines in a directory tree
by jdporter (Paladin) on Nov 10, 2005 at 03:41 UTC
    if ! exists $extList[0];

    That's unusual. defined would probably be better. But I'd be inclined to write it as

    if ! @extList;
    or rather,
    unless @extList;
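
To make the difference between the three tests concrete, here is a small sketch; probe is a hypothetical helper. For a list taken from @ARGV the tests agree, since command-line arguments are never undef, but they diverge once a list holds an undefined first element. Note also that perlfunc discourages calling exists on array elements.

```perl
use strict;
use warnings;

# Hypothetical helper: report how each emptiness test sees a list.
sub probe {
    my @list = @_;
    return {
        'exists'   => ( exists $list[0]  ? 1 : 0 ),
        'defined'  => ( defined $list[0] ? 1 : 0 ),
        'nonempty' => ( @list            ? 1 : 0 ),
    };
}

for my $case ( [], [undef], ['pl'] ) {
    my $r = probe(@$case);
    print join( ' ', map { sprintf '%s=%d', $_, $r->{$_} } qw(exists defined nonempty) ), "\n";
}
```

Only the array-in-boolean-context test asks the question the script actually cares about: were any extensions given at all?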

    We're building the house of the future together.
Re: Count file lines in a directory tree
by Aristotle (Chancellor) on Nov 10, 2005 at 03:58 UTC

    Okay, since we’re all nitpicking, how about using a better file system iterator module? And some more gravy?

    #!/usr/bin/perl
    use strict;
    use warnings;

    =head1 NAME

    countln - recurse directories and count lines in matching files

    =head1 SYNOPSIS

    F<countln> S<B<[ -e ext1,ext2 ]>> S<B<[ -r rootdir ]>>

    =head1 OPTIONS

    =over

    =item B<-e>, B<--ext>

    What extensions to match. Can be given multiple extensions separated by commas, and can be given multiple times. If none given, defaults to F<.pm> and F<.pl>.

    =item B<-r>, B<--root>, B<--dir>

    Which directory to start recursing in. Defaults to the current directory.

    =back

    =head1 SEE ALSO

    find(1), wc(1)

    =head1 BUGS

    None known.

    =head1 AUTHORS

    ...

    =head1 COPYRIGHT AND LICENCE

    ...

    =cut

    use Getopt::Long;
    use Pod::Usage;
    use File::Find::Rule;

    GetOptions(
        'h|help'       => sub { pod2usage( -verbose => 1 ) },
        'man'          => sub { pod2usage( -verbose => 2 ) },
        'ext|e=s'      => \my @opt_ext,    # array ref collects repeated --ext values
        'root|dir|r=s' => \( my $opt_root = "." ),
    ) or pod2usage();

    @opt_ext = @opt_ext ? map { split /,/ } @opt_ext : qw( pl pm );

    my @file = File::Find::Rule
        ->file()
        ->name( map "*.$_", @opt_ext )
        ->in( $opt_root );

    my $lines = 0;

    for my $fname ( @file ) {
        open my $fh, '<', $fname
            or warn( "Couldn't open $fname: $!\n" ), next;
        local $/ = \131072;
        my $last;
        while ( <$fh> ) {
            $lines += tr/\n//;
            $last = $_;
        }
        # $_ is undef once the loop ends, so test the last block read
        $lines++ if defined $last and $last !~ /\n\z/;
    }

    print "$lines lines in " . @file . " files\n";

    Makeshifts last the longest.

      It was worth posting some, on reflection, slightly silly code just to learn about Pod::Usage and, to a lesser extent, File::Find::Rule. Thank you.


      Perl is Huffman encoded by design.
Re: Count file lines in a directory tree
by blazar (Canon) on Nov 10, 2005 at 13:23 UTC
    my $root = rel2abs (shift || '.');

    I wouldn't use rel2abs: if I give '.' as an argument, I expect it to be honored. Unless there were some specific option to explicitly instruct the program to do otherwise, that is.

    my @extList = @ARGV; @extList = ('pl', 'pm') if ! exists $extList[0];

    How 'bout

    my @extList = @ARGV ? @ARGV : qw/pl pm/;

    instead?

    my $lines = 0; my $files = 0;

    No need for the initializations. Maybe you want them anyway for (your) clarity. For me,

    my ($lines, $files); # is clear enough

    sub count {
        my $name = $File::Find::name;
        return if -d $name;

    Maybe in this case you would prefer to use the no_chdir => 1 option to find() which seems more appropriate...
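
A hedged sketch of what that could look like. By default find() changes into each directory before calling the wanted function, so a file test on $File::Find::name is resolved from the wrong place when the root is relative (presumably the reason for the rel2abs call). With no_chdir => 1 the path stays valid as given. The count_tree helper is a hypothetical name, and the extension regex here additionally refuses to match across a slash.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: count files and lines under $root for the given extensions.
sub count_tree {
    my ( $root, @ext ) = @_;
    my %want = map { $_ => 1 } @ext;
    my ( $files, $lines ) = ( 0, 0 );
    find(
        {
            no_chdir => 1,    # stay put, so paths relative to the start dir keep working
            wanted   => sub {
                my $name = $File::Find::name;    # path including $root, as given
                return if -d $name;
                my ($ext) = $name =~ /\.([^.\/]*)$/;
                return unless defined $ext and $want{$ext};
                open my $in, '<', $name
                    or do { warn "Can't open $name: $!\n"; return };
                ++$files;
                while ( my $line = <$in> ) { ++$lines }
            },
        },
        $root
    );
    return ( $files, $lines );
}
```

Calling count_tree( '.', qw(pl pm) ) returns the file count and the line count as a two-element list, and '.' is honored as given.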

    my ($ext) = $name =~ /\.([^.]*)$/; return if ! defined $ext or ! exists $exts{$ext};

    How 'bout

    return unless grep $name =~ /\.\Q$_\E$/, @extList;

    ?

    return if ! open inFile, '<', $name;

    open my $in, '<', $name or # and I don't need close()
      (warn "Ouch: $name => $!\n"), return;

    ++$lines while (<inFile>);

    Hmmm, I always recommend against slurping in whole files (if unnecessary), but perhaps

    $lines += () = <$in>;

    ...