Here is a handy command-line tool to quickly view installed
Perl modules whose name matches a specified regular expression.
Features
- Perl regular expression syntax, with separate case-sensitive switch.
- Optional initialization file for faster look-ups.
- Option to print the module name or the full directory path to the module file.
- Option to display duplicate modules and other statistics.
- Uses only core modules.
Other well-known methods
So... why another way to do it?
Simply put: I could not easily convince these other tools to Do What I Want,
as quickly as I want,
in (what I consider) a bug-free manner. Obviously, this is not a new idea;
it is merely a different implementation.
There are many threads here at the Monastery, as a
Super Search would reveal, and I have probably read every node of every thread on the topic.
I believe
HTML::Perlinfo
does everything this script does (and much, much more), except that I could
not easily figure out how to generate output as simple text, rather than HTML.
I consider
HTML::Perlinfo to be a valuable companion to this script.
I run a daily cronjob to dump out recent versions of both HTML and text.
Impatience
In my opinion, the biggest advantage here is the fast look-up capability.
No matter how you slice it, you have to search through the
@INC
directories via some variant of
find, which can take
a whole minute or so -- I just do not have the patience
to wait that long! Maintaining the initialization file avoids that nonsense.
The code
use warnings;
use strict;
use Getopt::Long;
use Pod::Usage;
use File::Find;
my $print_path;
my $report;
my $re;
parse_args();
# Clean up @INC
my @dirs;
for my $dirname (@INC) {
if (-d $dirname) {
next if $dirname eq '.';
$dirname =~ s{/+}{/}g;
$dirname =~ s{/$}{};
push @dirs, $dirname;
}
}
@dirs = uniq(@dirs);
# For quicker operation, use init file, if it exists
my @files;
my $use_find = 1;
my $message;
my $init_file = exists $ENV{HOME} ? "$ENV{HOME}/.findpm" : '';
if (-e $init_file) {
if (open my $fh, '<', $init_file) {
@files = <$fh>;
close $fh;
chomp @files;
my $days = 1;
if (-M $init_file > $days) {
$message = "Warning: $init_file is older than $days day\n"
+;
}
die "Error: $init_file is empty" if -z $init_file;
$use_find = 0;
}
else {
$message = "Warning: $init_file exists, but can not be opened:
+ $!";
}
}
# Otherwise, use the slower find command
if ($use_find) {
# Find all .pm files under @INC dirs
my @find_dirs = reduce_dirs(@dirs);
find(
{
wanted => sub { push @files, $_ if -f $_ and /\.pm$/ },
no_chdir => 1,
},
@find_dirs
);
@files = uniq(@files);
}
# Print those modules/files which match the regex
my %mods;
for my $file (@files) {
my @ds;
for my $dir (@dirs) {
if (index($file, $dir) == 0) {
#print "$d2 is a substring of $d1, starting at pos 0\n"
push @ds, $dir;
}
}
my $d = (sort {length($b) <=> length($a)} @ds)[0];
my $rel = substr($file, (length($d)+1));
my $name = $rel;
$name =~ s/\.pm$//;
next unless $name =~ /$re/;
push @{ $mods{$rel} }, $d;
if ($print_path) {
print "$file\n";
}
else {
$rel =~ s/\.pm$//;
$rel =~ s{/}{::}g;
print "$rel\n";
}
}
if ($report) {
my $num_dups = 0;
for (keys %mods) {
$num_dups++ if (scalar(@{$mods{$_}}) > 1);
}
if ($num_dups) {
print "\nDUPLICATES\n";
for my $rel (keys %mods) {
if (scalar(@{$mods{$rel}}) > 1) {
print "$rel\n";
for my $dir (@{$mods{$rel}}) {
print " $dir/$rel\n";
}
}
}
}
print "\nSUMMARY\n";
print " regex = $re\n";
print " Used '$init_file' init file instead of 'find'\n" unless
+ $use_find;
print " INC dirs:\n";
print " $_\n" for @dirs;
print ' Total ".pm" files = ', scalar @files, "\n"
+;
print ' Matching unique ".pm" files = ', scalar keys %mods,
+"\n";
print ' Matching duplicate ".pm" files = ', $num_dups, "\n";
}
warn $message if $message;
exit;
sub reduce_dirs {
# Reduce a list of directory names by eliminating
# names which contain other names. For example,
# if the input array contains (/a/b/c/d /a/b/c /a/b),
# return an array containing (/a/b).
my @dirs = @_;
my %substring_count = map { $_ => 0 } @dirs;
for my $x (@dirs) {
for my $y (@dirs) {
next if $x eq $y;
if (index($x, $y) == 0) {
# if y is substring of x, starting at position 0
$substring_count{$x}++;
}
}
}
my @dsubs;
for (keys %substring_count) {
push @dsubs, $_ if $substring_count{$_} == 0;
}
return @dsubs;
}
sub uniq {
# From List::MoreUtils, $VERSION = '0.22'
my %h;
map { $h{$_}++ == 0 ? $_ : () } @_;
}
sub parse_args {
my ($help, $sens);
GetOptions(
'sens' => \$sens,
'path' => \$print_path,
'report' => \$report,
'help' => \$help
) or pod2usage();
$help and pod2usage(-verbose => 2);
my $pat = (@ARGV) ? shift @ARGV : '.';
$pat =~ s{::}{/}g;
$re = ($sens) ? qr/$pat/ : qr/$pat/i;
#print "pat=$pat\n";
#print "re=$re\n";#exit;
@ARGV and pod2usage("Error: unexpected args: @ARGV");
}
=head1 NAME
B<findpm> - Find installed Perl modules
=head1 SYNOPSIS
findpm [options] [regex]
Options:
-help verbose help
-path print out full directory paths also
-report print out detailed report
-sens case-sensitive [default is case-insensitive]
=head1 DESCRIPTION
Search through the directories in the Perl C<@INC> variable
for Perl module files (all files with a C<.pm> extension) matching
a specified regular expression.
The names of all the modules which match will be printed to STDOUT.
Any directories listed in C<@INC> which do not exist will be silently
+ignored.
Excludes the current directory (.).
If you are impatient (like I am) you can optionally use an initializat
+ion
file instead of letting the script search through all the C<@INC>
directories every time you run the script. The file must be in your h
+ome
directory and must be named C<.findpm>. You must create this file you
+rself
(see EXAMPLES below), and you should keep it up to date. Since you wi
+ll
get a warning if the init file is more than a day old, I recommend
creating the file using a cron job that runs once a day. If the init
+file
does not exist, the script will proceed to search C<@INC>.
=head1 ARGUMENTS
=over 4
=item regex
An optional regular expression may be given. The regex may be a simpl
+e
string, such as C<foo>, or it may be a more complicated expression, su
+ch as
C<^foo.*bar\d>. The regex syntax is Perl; it should not be confused
with shell wilcard syntax or the syntax for other common Unix utilitie
+s,
such as I<sed> or I<grep>. It is best to quote the regex to prevent
interaction with the shell. Do not include the C<.pm> extension as par
+t of the
regex. If no regex is given, find all modules.
=back
=head1 OPTIONS
All options can be abbreviated.
=over 4
=item sens
By default, the regular expression is case-insensitive. So, if the inp
+ut
regex is C<foo>, it will match C<foo> as well as C<FOO> and C<Foo>, et
+c.
To use case-sensitive, use the C<-sens> option.
findpm -sens foo
=item path
By default, only the module name is printed. To instead print the full
directory path to the module file, use the C<-path> option.
findpm -path foo
=item report
To print out additional statistics, use the C<-report> option.
This will show the total number of matching modules, duplicate modules
+, etc.
findpm -report
=item help
Show verbose usage information.
=back
=head1 EXAMPLES
Find xml modules:
findpm xml
Find modules with case-sensitive "Ext":
findpm -sens Ext
Find modules like File::Find. The following are equivalent because
C<::> will be converted to C</> (similar to I<perldoc>):
findpm 'file::find'
findpm 'file/find'
Find all modules in all C<@INC> directories:
findpm
Create init file:
rm -f ~/.findpm; findpm -path > /tmp/.findpm; mv /tmp/.findpm ~/.f
+indpm
=head1 CONFIGURATION AND ENVIRONMENT
Searches for an optional initialization file in the directory specifie
+d
by the C<HOME> environment variable:
${HOME}/.findpm
=head1 LIMITATIONS
The initialization file is only supported for Unix-type operating syst
+ems.
=cut
Constructive criticism, suggestions for improvements
and bug reports are welcome.
Update: Now only uses core modules.
Update: Avoid potential warning; small change to POD.
Update: find is more portable.