Re: Count file lines in a directory tree
by jdporter (Paladin) on Nov 09, 2005 at 00:31 UTC
my @extList;
push @extList, $_ for @ARGV;
How is that different from my @extList = @ARGV; (other than being less efficient)?
I'd do
use Getopt::Long;
my @extList;
my $root;
GetOptions(
'ext=s' => \@extList,
'root|dir=s' => \$root,
);
@extList = @extList
? map { split /,/ } @extList # in case of --ext foo,bar
: qw( pl pm );
$root ||= '.';
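For anyone who wants to poke at the comma-splitting without touching @ARGV, here is a minimal sketch. The helper name parse_exts is made up for this example; GetOptionsFromArray is a standard Getopt::Long export.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# parse_exts is a hypothetical helper, not part of the original script.
sub parse_exts {
    my @args = @_;
    my @exts;
    my $root;
    GetOptionsFromArray(
        \@args,
        'ext=s'      => \@exts,
        'root|dir=s' => \$root,
    ) or die "bad options\n";

    # Accept both "--ext pl --ext pm" and "--ext pl,pm".
    @exts = @exts ? map { split /,/ } @exts : qw( pl pm );
    $root = '.' unless defined $root;
    return ( $root, @exts );
}

my ( $root, @exts ) = parse_exts(qw( --ext t,pl --ext pm --dir lib ));
print "root=$root exts=@exts\n";
```

Called with no arguments, the helper falls back to the current directory and the pl/pm defaults.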
We're building the house of the future together.
Re: Count file lines in a directory tree
by tinita (Parson) on Nov 09, 2005 at 10:01 UTC
++$lines while (<inFile>);
why not just use $.?
1 while <inFile>; $lines += $.;
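A small sketch of the $. approach, assuming a lexical filehandle rather than the bareword inFile. The helper name count_lines is invented for illustration. The one thing to watch is that close resets $., so the count has to be read first.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# count_lines is an illustrative name. It relies on $. holding the number
# of the last line read from the most recently read filehandle.
sub count_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!\n";
    1 while <$fh>;    # read and discard every line
    my $n = $.;       # grab the count before close() resets $.
    close $fh;
    return $n;
}

# Demo on a throwaway three-line file.
my ( $tmp, $tmpname ) = tempfile( UNLINK => 1 );
print {$tmp} "one\ntwo\nthree\n";
close $tmp;
print count_lines($tmpname), "\n";
```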
|
|
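{
local $/ = \131072;
my $last = '';
$lines += tr/\n//, $last = $_ while <inFile>;
# $_ is undef once the loop ends, so test the saved final chunk
$lines++ if length $last and $last !~ /\n\z/;
}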
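Here is a self-contained sketch of the chunked approach, assuming a lexical filehandle and keeping a copy of the final chunk so the check for a missing trailing newline has something to look at. The helper name count_lines_chunked is hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# count_lines_chunked is a made-up helper name for this sketch.
sub count_lines_chunked {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!\n";
    local $/ = \131072;    # read fixed-size records of up to 128 KiB
    my ( $lines, $last ) = ( 0, '' );
    while ( my $chunk = <$fh> ) {
        $lines += $chunk =~ tr/\n//;    # tr returns the number of newlines
        $last = $chunk;
    }
    # A non-empty file whose last byte isn't "\n" has one more line.
    $lines++ if length $last and $last !~ /\n\z/;
    close $fh;
    return $lines;
}

# Demo: the last line has no terminating newline.
my ( $tmp, $tmpname ) = tempfile( UNLINK => 1 );
print {$tmp} "a\nb\nc";
close $tmp;
print count_lines_chunked($tmpname), "\n";
```

Whether the unterminated last line should count is a matter of taste; wc -l would not count it.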
Makeshifts last the longest.
Re: Count file lines in a directory tree
by Aristotle (Chancellor) on Nov 09, 2005 at 00:44 UTC
find -name '*.p[lm]' -print0 | xargs -r0 cat | wc -l
:-)
Update: I was using the unnecessarily longwinded find \( -name '*.pm' -o -name '*.pl' \)
Update: to make this output the number of files, the easiest approach is a bit of Perl:
find -name '*.p[lm]' -print0 \
| perl -00000pe'++$a;END{print STDERR "Files: ",$a||0,", lines: "}' \
| xargs -r0 cat | wc -l
Makeshifts last the longest.
find -name '*.p[lm]' -exec cat {} \; | wc -l
:-)
-sauoq
"My two cents aren't worth a dime.";
Congratulations, if you have 10,000 matching files, that line makes you a cat herder. ;-) And a herder of useless cats, to boot…
If you’re going to do that, you can just do
find -name '*.p[lm]' -exec wc -l {} \; | awk '{ sum+=$1 } END { print sum }'
Moves a bit less data around. It also makes it trivial to count the number of files, as GrandFather’s code does:
find -name '*.p[lm]' -exec wc -l {} \; \
| awk '{ sum+=$1; ++num } END { print sum " lines in " num " files" }'
Nice catch on the glob. :-)
The xargs in mine is quite justified, so I don’t run one process per matched file. The cat is the easiest solution to the problem that if too many files for a single command line are matched, xargs wc would run wc several times, each reporting its own separate total. This way, xargs may run several cats (but very few in total), but wc always runs exactly once.
Makeshifts last the longest.
Re: Count file lines in a directory tree
by jdporter (Paladin) on Nov 10, 2005 at 03:41 UTC
if ! exists $extList[0];
That's unusual. defined would probably be better. But I'd be inclined to write it as
if ! @extList;
or rather,
unless @extList;
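To make the difference concrete, here is a small sketch. The helper name default_exts is made up. The point is that an array in boolean context answers "is it empty?" directly, while exists and defined only look at a single element.

```perl
use strict;
use warnings;

# default_exts is a hypothetical helper name for illustration.
sub default_exts {
    my @exts = @_;
    # An array in boolean context is true exactly when it has elements.
    @exts = qw( pl pm ) unless @exts;
    return @exts;
}

print join( ' ', default_exts() ),    "\n";    # nothing given, defaults apply
print join( ' ', default_exts('t') ), "\n";    # explicit extension is kept

# exists/defined inspect one element, not the array as a whole:
my @odd = ( undef, 'pm' );
print "first element exists\n"       if exists $odd[0];
print "first element is undefined\n" unless defined $odd[0];
print "array is non-empty\n"         if @odd;
```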
We're building the house of the future together.
Re: Count file lines in a directory tree
by Aristotle (Chancellor) on Nov 10, 2005 at 03:58 UTC
Okay, since we’re all nitpicking, how about using a better file system iterator module? And some more gravy?
#!/usr/bin/perl
use strict;
use warnings;

=head1 NAME

countln - recurses directories and counts lines in matching files

=head1 SYNOPSIS

F<countln>
S<B<[ -e ext1,ext2 ]>>
S<B<[ -r rootdir ]>>

=head1 OPTIONS

=over

=item B<-e>, B<--ext>

What extensions to match. Can be given multiple extensions separated by commata, and can be given multiple times. If none given, defaults to F<.pm> and F<.pl>.

=item B<-r>, B<--root>, B<--dir>

Which directory to start recursing in. Defaults to the current directory.

=back

=head1 SEE ALSO

find(1), wc(1)

=head1 BUGS

None known.

=head1 AUTHORS

...

=head1 COPYRIGHT AND LICENCE

...

=cut
use Getopt::Long;
use Pod::Usage;
use File::Find::Rule;
GetOptions(
'h|help' => sub { pod2usage( -verbose => 1 ) },
'man' => sub { pod2usage( -verbose => 2 ) },
'ext|e=s' => \my @opt_ext, # \( @array ) would reference the elements, not the array
'root|dir|r=s' => \( my $opt_root = "." ),
) or pod2usage();
@opt_ext = @opt_ext ? map { split /,/ } @opt_ext : qw( pl pm );
my @file = File::Find::Rule
->file()
->name( map "*.$_", @opt_ext )
->in( $opt_root );
my $lines = 0;
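for my $fname ( @file ) {
open my $fh, '<', $fname
or warn( "Couldn't open $fname: $!\n" ), next;
local $/ = \131072;
my $last = '';
$lines += tr/\n//, $last = $_ while <$fh>;
# $_ is undef after the loop, so check the saved final chunk instead
$lines++ if length $last and $last !~ /\n\z/;
}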
print "$lines lines in " . @file . " files\n";
Makeshifts last the longest.
It was worth posting some, on reflection, slightly silly code just to learn about Pod::Usage and, to a lesser extent, File::Find::Rule. Thank you.
Perl is Huffman encoded by design.
Re: Count file lines in a directory tree
by blazar (Canon) on Nov 10, 2005 at 13:23 UTC
my $root = rel2abs (shift || '.');
I wouldn't use rel2abs: if I give '.' as an argument, I expect it to be honored. Unless there were some specific option to explicitly instruct the program to do otherwise, that is.
my @extList = @ARGV;
@extList = ('pl', 'pm') if ! exists $extList[0];
How 'bout
my @extList = @ARGV ? @ARGV : qw/pl pm/;
instead?
my $lines = 0;
my $files = 0;
No need for the initializations. Maybe you want them anyway for (your) clarity. For me,
my ($lines, $files); # is clear enough
sub count
{
my $name = $File::Find::name;
return if -d $name;
Maybe in this case you would prefer to use the no_chdir => 1 option to find() which seems more appropriate...
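A rough sketch of what that looks like with the core File::Find. The helper name list_files is invented here. With no_chdir set, find() does not chdir into each directory, and $_ holds the same path as $File::Find::name, so file tests can be applied to it as-is.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# list_files is an illustrative helper name.
sub list_files {
    my ($root) = @_;
    my @found;
    find(
        {
            no_chdir => 1,    # stay in the starting directory
            wanted   => sub {
                # With no_chdir, $_ equals $File::Find::name.
                push @found, $_ if -f $_;
            },
        },
        $root,
    );
    my @sorted = sort @found;
    return @sorted;
}

# Demo on a throwaway tree containing one file.
my $dir = tempdir( CLEANUP => 1 );
mkdir "$dir/lib" or die "mkdir: $!\n";
open my $out, '>', "$dir/lib/Foo.pm" or die "open: $!\n";
print {$out} "1;\n";
close $out;
print "$_\n" for list_files($dir);
```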
my ($ext) = $name =~ /\.([^.]*)$/;
return if ! defined $ext or ! exists $exts{$ext};
How 'bout
return unless grep $name =~ /\.\Q$_\E$/, @extList;
?
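A quick sketch of the grep test in isolation, assuming the intent is to skip any file whose extension is not in the list. The helper name has_wanted_ext is hypothetical. Note the \E before the anchor: without it, \Q would also quote the trailing $.

```perl
use strict;
use warnings;

# has_wanted_ext is a made-up name for this sketch.
sub has_wanted_ext {
    my ( $name, @exts ) = @_;
    # \Q...\E quotes metacharacters in the extension; the $ after \E
    # stays a real end-of-string anchor.
    return scalar grep { $name =~ /\.\Q$_\E$/ } @exts;
}

for my $name (qw( Foo.pm script.pl notes.txt Makefile.plx )) {
    printf "%-14s %s\n", $name,
        has_wanted_ext( $name, qw( pl pm ) ) ? 'count' : 'skip';
}
```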
return if ! open inFile, '<', $name;
open my $in, '<', $name or # and I don't need close()
(warn "Ouch: $name => $!\n"), return;
++$lines while (<inFile>);
Hmmm, I always recommend against slurping in whole files (if unnecessary), but perhaps
$lines += () = <$in>;
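One way to count by slurping, as a sketch: read the handle in list context and assign to the empty list, which in scalar context yields the number of lines read. The helper name count_by_slurp is made up, and this does hold the whole file in memory for a moment.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# count_by_slurp is an illustrative helper name.
sub count_by_slurp {
    my ($path) = @_;
    open my $in, '<', $path or die "Ouch: $path => $!\n";
    # <$in> in list context returns every line; assigning that to the
    # empty list, in scalar context, gives how many there were.
    my $n = () = <$in>;
    return $n;
}

# Demo on a throwaway two-line file.
my ( $tmp, $tmpname ) = tempfile( UNLINK => 1 );
print {$tmp} "first\nsecond\n";
close $tmp;
print count_by_slurp($tmpname), "\n";
```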
...