Count file lines in a directory tree

Replies are listed 'Best First'.
Re: Count file lines in a directory tree by jdporter (Paladin) on Nov 09, 2005 at 00:31 UTC
`my @extList; push @extList, $_ for @ARGV;` How is that different from `my @extList = @ARGV;` (other than being less efficient)? I'd do `use Getopt::Long; my @extList; my $root; GetOptions( 'ext=s' => \@extList, 'root|dir=s' => \$root ); @extList = @extList ? map { split /,/ } @extList : qw( pl pm ); $root ||= '.';` (the `split` is there in case of `--ext foo,bar`). We're building the house of the future together.
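The option handling suggested above can be sketched as a small, self-contained script. This is only an illustration: the helper name `parse_args` is hypothetical, and `GetOptionsFromArray` is used instead of `GetOptions` purely so the parsing can be exercised without touching `@ARGV`.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper (not from the thread): parse an argument list and
# return ( \@extensions, $root ).
sub parse_args {
    my @args = @_;
    my ( @ext, $root );
    GetOptionsFromArray(
        \@args,
        'ext=s'      => \@ext,     # may be repeated: --ext pl --ext pm
        'root|dir=s' => \$root,    # --root and --dir are aliases
    ) or die "usage: $0 [--ext e1,e2] [--root dir]\n";

    # Accept --ext foo,bar as well as repeated --ext; default to pl and pm.
    @ext  = @ext ? map { split /,/ } @ext : qw( pl pm );
    $root = '.' unless defined $root && length $root;
    return ( \@ext, $root );
}

my ( $ext, $root ) = parse_args( qw( --ext pl,pm --ext t --dir lib ) );
print "extensions: @$ext; root: $root\n";    # extensions: pl pm t; root: lib
```

Calling `parse_args()` with no arguments falls back to the `pl`/`pm` extensions and the current directory.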
Re^2: Count file lines in a directory tree by GrandFather (Saint) on Nov 09, 2005 at 00:50 UTC
Brain fart! :( Thanks for the sanity check. Perl is Huffman encoded by design.
Re: Count file lines in a directory tree by tinita (Parson) on Nov 09, 2005 at 10:01 UTC
`++$lines while (<inFile>);` why not just use `$.`? `1 while <inFile>; $lines += $.;`
Re^2: Count file lines in a directory tree by GrandFather (Saint) on Nov 09, 2005 at 10:06 UTC
Too little coffee early in the day :) Perl is Huffman encoded by design.
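For anyone trying the `$.` suggestion across several files, here is a minimal sketch. The helper name `count_with_dollar_dot` is hypothetical. It assumes a lexical filehandle and an explicit `close` per file, because `$.` is reset on `close` but not when a still-open handle is simply reopened.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): total the lines of the named
# files using $. instead of a hand-rolled counter.
sub count_with_dollar_dot {
    my @names = @_;
    my $lines = 0;
    for my $name (@names) {
        open my $fh, '<', $name
            or do { warn "Can't open $name: $!\n"; next };
        1 while <$fh>;    # read to EOF; $. ends up as the last line number
        $lines += $.;
        close $fh;        # explicit close resets $. before the next file
    }
    return $lines;
}

print count_with_dollar_dot(@ARGV), "\n";
```

Note that `$.` counts a final line that lacks a trailing newline, whereas `wc -l` counts newline characters only.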
Re^2: Count file lines in a directory tree by Aristotle (Chancellor) on Nov 10, 2005 at 03:00 UTC
`{ local $/ = \131072; my $last = "\n"; while (<inFile>) { $lines += tr/\n//; $last = $_ } $lines++ if $last !~ /\n\z/; }` Makeshifts last the longest.
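A self-contained version of the block-read approach might look like the sketch below. The helper name `count_lines_blockwise` is hypothetical. The sketch keeps the most recent block in its own variable, since `$_` is undefined once a `while (<FH>)` loop has run to end of file and so cannot be used to test for a missing final newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count the lines of one file by
# reading fixed-size blocks and counting newline characters.
sub count_lines_blockwise {
    my ($name) = @_;
    open my $fh, '<', $name or die "Can't open $name: $!\n";
    local $/ = \131072;    # a reference to an integer selects fixed-size reads
    my ( $lines, $last ) = ( 0, "\n" );    # "\n" so an empty file counts as 0
    while ( my $block = <$fh> ) {
        $lines += $block =~ tr/\n//;       # tr with an empty replacement only counts
        $last = $block;
    }
    $lines++ if $last !~ /\n\z/;           # final line without a trailing newline
    close $fh;
    return $lines;
}

print "$_: ", count_lines_blockwise($_), "\n" for @ARGV;
```

Reading in 128 KiB blocks avoids splitting the input into one scalar per line, which is the point of the original suggestion.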
Re: Count file lines in a directory tree by Aristotle (Chancellor) on Nov 09, 2005 at 00:44 UTC
You mean this? `find -name '*.p[lm]' -print0 | xargs -r0 cat | wc -l` `:-)`
Update: I was using the unnecessarily longwinded `find \( -name '*.pm' -o -name '*.pl' \)`.
Update: to make this output the number of files, the easiest approach is a bit of Perl: `find -name '*.p[lm]' -print0 | perl -00000pe'++$a;END{print STDERR "Files: ",$a||0,", lines: "}' | xargs -r0 cat | wc -l` Makeshifts last the longest.
Re^2: Count file lines in a directory tree by sauoq (Abbot) on Nov 09, 2005 at 02:12 UTC
That's a useless use of... xargs. `find -name '*.p[lm]' -exec cat {} \; | wc -l` :-) -sauoq "My two cents aren't worth a dime.";
Re^3: Count file lines in a directory tree by Aristotle (Chancellor) on Nov 09, 2005 at 02:25 UTC
Congratulations, if you have 10,000 matching files, that line makes you a `cat` herder. `;-)` And a herder of useless `cat`s, to boot… If you’re going to do that, you can just do `find -name '*.p[lm]' -exec wc -l {} \; | awk '{ sum+=$1 } END { print +sum }'` Moves a bit less data around. It also makes it trivial to count the number of files, as GrandFather’s code does: `find -name '*.p[lm]' -exec wc -l {} \; | awk '{ sum+=$1; ++num } END { print sum " lines in " num " files" }'` Nice catch on the glob. `:-)`
The `xargs` in mine is quite justified, so I don’t run one process per matched file. The `cat` is the easiest solution to the problem that if too many files for a single command line are matched, `xargs wc` would run `wc` multiple times, reporting multiple separate totals. This way, `xargs` runs multiple `cat`s (but very few in total), but `wc` always runs exactly once. Makeshifts last the longest.
Re: Count file lines in a directory tree by jdporter (Paladin) on Nov 10, 2005 at 03:41 UTC
`if ! exists $extList[0];` That's unusual. `defined` would probably be better. But I'd be inclined to write it as `if ! @extList;` or rather, `unless @extList;` We're building the house of the future together.
Re: Count file lines in a directory tree by Aristotle (Chancellor) on Nov 10, 2005 at 03:58 UTC
Okay, since we’re all nitpicking, how about using a better file system iterator module? And some more gravy?

    #!/usr/bin/perl
    use strict;
    use warnings;

    =head1 NAME

    countln - recurses directories and counts lines in matching files

    =head1 SYNOPSIS

    F<countln> S<B<[ -e ext1,ext2 ]>> S<B<[ -r rootdir ]>>

    =head1 OPTIONS

    =over

    =item B<-e>, B<--ext>

    What extensions to match. Can be given multiple extensions separated by
    commata, and can be given multiple times. If none given, defaults to
    F<.pm> and F<.pl>.

    =item B<-r>, B<--root>, B<--dir>

    Which directory to start recursing in. Defaults to the current directory.

    =back

    =head1 SEE ALSO

    find(1), wc(1)

    =head1 BUGS

    None known.

    =head1 AUTHORS

    ...

    =head1 COPYRIGHT AND LICENCE

    ...

    =cut

    use Getopt::Long;
    use Pod::Usage;
    use File::Find::Rule;

    GetOptions(
        'h|help'       => sub { pod2usage( -verbose => 1 ) },
        'man'          => sub { pod2usage( -verbose => 2 ) },
        'ext|e=s'      => \( my @opt_ext ),
        'root|dir|r=s' => \( my $opt_root = "." ),
    ) or pod2usage();

    # allow both repeated --ext and comma-separated lists
    @opt_ext = @opt_ext ? map { split /,/ } @opt_ext : qw( pl pm );

    my @file = File::Find::Rule
        ->file()
        ->name( map "*.$_", @opt_ext )
        ->in( $opt_root );

    my $lines = 0;
    for my $fname ( @file ) {
        open my $fh, '<', $fname
            or warn( "Couldn't open $fname: $!\n" ), next;
        local $/ = \131072;    # read in 128 KiB blocks
        my $last = "\n";       # last block read; $_ is undef after the loop
        while ( <$fh> ) { $lines += tr/\n//; $last = $_ }
        $lines++ if $last !~ /\n\z/;    # final line without a newline
    }

    print "$lines lines in " . @file . " files\n";

Makeshifts last the longest.
Re^2: Count file lines in a directory tree by GrandFather (Saint) on Nov 10, 2005 at 04:13 UTC
It was worth posting some, on reflection, slightly silly code just to learn about Pod::Usage and, to a lesser extent, File::Find::Rule. Thank you. Perl is Huffman encoded by design.
Re^3: Count file lines in a directory tree by Aristotle (Chancellor) on Nov 10, 2005 at 04:18 UTC
My pleasure. `:-)` You probably want to check out "The Dynamic Duo --or-- Holy Getopt::Long, Pod::UsageMan!" and "GetOpt::Long usage style" for some prose on the structure I used for the code. Makeshifts last the longest.
Re: Count file lines in a directory tree by blazar (Canon) on Nov 10, 2005 at 13:23 UTC
`my $root = rel2abs (shift || '.');` I wouldn't use `rel2abs`: if I give `'.'` as an argument, I expect it to be honored. Unless there were some specific option to explicitly instruct the program to do otherwise, that is.
`my @extList = @ARGV; @extList = ('pl', 'pm') if ! exists $extList[0];` How 'bout `my @extList = @ARGV ? @ARGV : qw/pl pm/;` instead?
`my $lines = 0; my $files = 0;` No need for the initializations. Maybe you want them anyway for (your) clarity. For me, `my ($lines, $files);` is clear enough.
`sub count { my $name = $File::Find::name; return if -d $name;` Maybe in this case you would prefer to use the `no_chdir => 1` option to `find()`, which seems more appropriate...
`my ($ext) = $name =~ /\.([^.]*)$/; return if ! defined $ext or ! exists $exts{$ext};` How 'bout `return unless grep $name =~ /\.\Q$_\E$/, @extList;`?
`return if ! open inFile, '<', $name;` could become `open my $in, '<', $name or (warn "Ouch: $name => $!\n"), return;` (and with a lexical handle I don't need `close()`).
`++$lines while (<inFile>);` Hmmm, I always recommend against slurping in whole files (if unnecessary), but perhaps `$lines += () = <$in>;` ...
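Several of the suggestions above (`no_chdir => 1`, matching extensions with `grep`, a lexical filehandle with a warning on failure) can be combined into one sketch. The helper name `count_tree` is hypothetical and this is not the original poster's script, just one way the pieces could fit together using the core `File::Find` module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not from the thread): return ( $lines, $files ) for
# the files under $root whose names end in one of @exts.
sub count_tree {
    my ( $root, @exts ) = @_;
    @exts = qw( pl pm ) unless @exts;
    my ( $lines, $files ) = ( 0, 0 );
    find(
        {
            no_chdir => 1,    # $_ is the full path; find() does not chdir
            wanted   => sub {
                my $name = $_;
                return if -d $name;
                return unless grep { $name =~ /\.\Q$_\E\z/ } @exts;
                open my $in, '<', $name
                    or do { warn "Ouch: $name => $!\n"; return };
                # a lexical loop variable leaves File::Find's $_ alone
                while ( defined( my $line = <$in> ) ) { $lines++ }
                $files++;
            },
        },
        $root
    );
    return ( $lines, $files );
}

# Usage: perl countln.pl some/dir   (the script name is illustrative)
printf "%d lines in %d files\n", count_tree( $ARGV[0], qw( pl pm ) ) if @ARGV;
```

Counting line by line keeps memory use flat, which sidesteps the slurping concern raised at the end of the post above.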