Re: find module name from the module archive (in @INC) (updated)
by LanX (Saint) on Dec 09, 2016 at 19:04 UTC
hmm just a quick remark:
Calling use will also execute the import function which might do unexpected things.
You might prefer require (or even do), which are just parts of what use does.
(mnemonic use > require > do > eval with > read as "extends")
Furthermore you should probably check the error code to avoid false negatives.
There are many reasons why a code execution might fail.
That said, a function that checks @INC without eval-ing the located code would be safer°, especially if you don't want to activate the whole dependency chain with consecutive use calls.
Update:
°) The page for require has sample code showing how @INC is searched; just skip everything from do($realfilename); onward to avoid compilation, and add the logic for ref($prefix) if needed. (Alternatively, see also these modules.)
You might want to use this as a first step and do your eval only as a second step, if the module isn't found (in order to play safe). That way you will only execute module code if the lookup mechanism fails (which should never be the case and deserves a warning).
NB: This approach answers the question "Can a module be found in @INC" which is slightly different to "Is a module distribution installed" since:
- @INC could have been manipulated and
- distributions often include multiple modules.
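A minimal sketch of such a lookup, loosely following the sample in the require documentation but stopping before the do(): it only reports where a module file would be found and never compiles it. The name find_in_inc is made up for this example, and hooks in @INC (code references or objects) are simply skipped rather than handled.

```perl
use strict;
use warnings;

# Hypothetical helper: return the path of the first file in @INC that
# "require $module" would pick up, or nothing if there is none.
# Nothing is compiled or executed.
sub find_in_inc {
    my ($module) = @_;
    (my $rel = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    for my $prefix (@INC) {
        next if ref $prefix;                  # @INC hooks are not handled here
        my $path = "$prefix/$rel";
        return $path if -f $path;
    }
    return;
}

# Check every module name given on the command line.
for my $module (@ARGV) {
    my $path = find_in_inc($module);
    print defined $path
        ? "$module: $path\n"
        : "$module: not found in \@INC\n";
}
```

Called as, say, perl find_in_inc.pl LWP File::Spec, it prints one line per module. It answers only "can this file be found in @INC", with the caveats listed above.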
Re: find module name from the module archive
by Corion (Patriarch) on Dec 09, 2016 at 19:08 UTC
Re: find module name from the module archive (CPAN Module::XXX )
by LanX (Saint) on Dec 09, 2016 at 19:57 UTC
Re: find module name from the module archive
by haukex (Archbishop) on Dec 09, 2016 at 18:57 UTC
Hi Lotus1,
I am not a CPAN expert, but based on what I've seen I'm not sure there is necessarily a relationship between the name of the CPAN package and the modules contained within. Take for example LWP: it's in libwww-perl. The only two things I can think of at the moment are to scan the lib/ subdirectory within the .tar.gz, but that might not be perfectly reliable, or to look at http://www.cpan.org/modules/02packages.details.txt (gzipped) to figure out which modules are in which packages. Neither "feels" particularly clean, so since it's Friday evening it's very possible there's something better I'm missing ;-) (Update: Yep, see Corion's post)
Update: I briefly checked on how CPAN tests for whether a module is installed or not, and from what I can tell it simply scans @INC for the file (using the usual pattern that Foo::Bar::Quz becomes Foo/Bar/Quz.pm).
Regards, -- Hauke D
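For what it's worth, here is a rough sketch of the 02packages.details.txt idea, assuming an already unzipped copy of the index. It relies on the index layout of a short header, a blank line, then one "module version path" line per module. The function name modules_by_dist and the sample lines (including the A/AU/AUTHOR directory) are made up for illustration.

```perl
use strict;
use warnings;

# Read an (unzipped) 02packages.details.txt from a filehandle and return
# a hash reference mapping each distribution file to the modules it provides.
sub modules_by_dist {
    my ($fh) = @_;
    my %provides;
    my $in_header = 1;
    while (my $line = <$fh>) {
        chomp $line;
        if ($in_header) {                     # the header ends at the first blank line
            $in_header = 0 if $line =~ /^\s*$/;
            next;
        }
        my ($module, $version, $path) = split ' ', $line;
        next unless defined $path;
        (my $dist = $path) =~ s{.*/}{};       # drop the author directory part
        push @{ $provides{$dist} }, $module;
    }
    return \%provides;
}

# Made-up sample in the index's layout; the author directory is a placeholder.
my $sample = <<'END';
File:         02packages.details.txt
Description:  sample for illustration only

LWP                       6.15   A/AU/AUTHOR/libwww-perl-6.15.tar.gz
LWP::UserAgent            6.15   A/AU/AUTHOR/libwww-perl-6.15.tar.gz
Authen::SASL::Perl::NTLM  0.003  A/AU/AUTHOR/Authen-SASL-Perl-NTLM-0.003.tar.gz
END

open my $fh, '<', \$sample or die "Can't open sample: $!";
my $provides = modules_by_dist($fh);
print "$_: @{ $provides->{$_} }\n" for sort keys %$provides;
```

With a real index you would open the downloaded file instead of the in-memory sample.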
|
|
Thanks for the suggestion about 02packages.details.txt. I forgot to mention that I'm behind a firewall where I can't use CPAN, so I've written an install script to run through the Makefile.PL and dmake steps for me. But after a few years and a mix of old and new servers I need to check which modules are already installed. In the near future I plan to add my own local CPAN mirror.
Re: find module name from the module archive
by shmem (Chancellor) on Dec 09, 2016 at 20:07 UTC
Since the *.pm files live in the subdirectory lib of a CPAN package, I'd do something like
#!/usr/bin/perl
use strict; use warnings;
use File::Temp qw(tempdir);
use File::Find;
use File::Spec;

my $module_tgz = shift || die "usage: $0 pkgfile\n";
my $dir = tempdir(CLEANUP => 1);
# list form of system: the file name never passes through a shell
system('tar', '-C', $dir, '-xzf', $module_tgz) == 0
    or die "untarring $module_tgz failed\n";
# the tarball is expected to unpack into a single top-level directory
my ($pkg) = do {
    opendir my($dh), $dir or die "Can't open $dir: $!\n";
    grep !/^\.\.?$/, readdir $dh
};
my $libdir   = File::Spec->catdir($dir, $pkg, 'lib');
my $makefile = File::Spec->catfile($dir, $pkg, 'Makefile.PL');
my $module;
if (-e $makefile) {
    open my $fh, '<', $makefile or die "Can't read $makefile: $!\n";
    FH: while (<$fh>) {
        if (/WriteMakefile/ .. /\);/) {
            # accepts NAME => 'Foo::Bar', "Foo::Bar", q(Foo::Bar) or qq{Foo::Bar}
            /NAME\S?\s*=>\s*(?:["']|qq?\W)([\w:]+)/
                and print "Module: ", ($module = $1), "\n"
                and last FH;
        }
    }
}
print "Module ", ($module || '(unknown)'), " provides:\n";
find(\&get_mods, $libdir) if -d $libdir;
# print every *.pm below lib/ as a module name
sub get_mods {
    if ((my $file = $File::Find::name) =~ s/\.pm$//) {
        $file =~ s{.*lib/}{};
        $file =~ s{/}{::}g;
        print " $file\n";
    }
}
...but yes, extracting the module name from the named parameters passed to WriteMakefile probably fails except in the most common case, in which WriteMakefile is passed a named parameter list with the key NAME and its value on a single line, as in:
NAME => q(Foo::Bar),
If extraction of the module name fails, then the module name is probably the shortest one displayed by find(). Or else...
But then, for Debian based Linux distros, there is the debhelper suite, which includes code to extract the package name from a CPAN package. I'm just too lazy to poke around there (not for whipping up a script, though :-)
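The "shortest name" fallback mentioned above could be sketched like this. It is only a heuristic, and guess_main_module is a name invented for the example; it takes the module names found under lib/ and picks the shortest, breaking ties alphabetically.

```perl
use strict;
use warnings;

# Hypothetical helper: guess a distribution's main module as the shortest
# module name found under lib/. Returns undef for an empty list.
sub guess_main_module {
    my @modules = @_;
    my ($guess) = sort { length($a) <=> length($b) or $a cmp $b } @modules;
    return $guess;
}

print guess_main_module(qw(LWP::UserAgent LWP LWP::Simple)), "\n";   # prints "LWP"
```

For a distribution such as Qt4, where no module carries the distribution's name, the guess is of course meaningless.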
perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: find module name from the module archive
by RonW (Parson) on Dec 09, 2016 at 23:32 UTC
Since a properly created package has a MANIFEST file, you could extract just that, then use Module::Locate to test if the modules in the package have been installed.
Disclaimer: Not tested. YMMV
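#!perl
use strict;
use warnings;
use Archive::Tar;
use Module::Locate qw/ locate /;

my $tar = Archive::Tar->new;
# load only the MANIFEST, which normally sits inside the distribution's
# top-level directory (e.g. Foo-Bar-1.00/MANIFEST)
$tar->read('origin.tgz', COMPRESS_GZIP, {filter => qr{^[^/]*/?MANIFEST$}})
    or die "Can't read origin.tgz: ", $tar->error, "\n";
my ($manifest) = $tar->list_files;
my @files = split /[\r\n]+/, $tar->get_content($manifest);
# keep only the *.pm files under lib/, without the prefix and the extension
my @modfiles = map { m{^lib/(\S+)\.pm(?:\s|$)} ? $1 : () } @files;
# Foo/Bar -> Foo::Bar
my @modules = map { (my $m = $_) =~ s{/}{::}g; $m } @modfiles;
for (@modules)
{
    my $p = locate($_);
    if ($p)
    {
        print "Found '$_' at '$p'\n";
    }
    else
    {
        print "Not installed: '$_'\n";
    }
}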
Re: find module name from the module archive
by kcott (Archbishop) on Dec 11, 2016 at 02:36 UTC
G'day Lotus1,
[Firstly, I'm a bit late to the party with this.
I started to write some code yesterday, was interrupted, and only got back to it about 24 hours later.]
"This seems likely to break in future use."
There are a number of issues, especially if you were hoping to turn this into a general solution.
Here are the main ones as I see them.
I acknowledge that, in some cases, I'm repeating what others have already said.
- CPAN tarballs often contain more than one module.
- The tarball name does not necessarily reflect the name(s) of the module(s) therein.
- Not all tarballs are *.tar.gz files; for instance, some are *.tar.bz2 files.
- The path to the *.pm file(s) is not necessarily found under a common, top-level lib directory.
- There may be multiple lib directories.
- The lib directories may contain files other than *.pm files.
For testing, I chose the following distributions:
- libwww-perl-6.15
  This was already mentioned. It contains multiple modules. The tarball name (libwww-perl-6.15.tar.gz) does not reflect the module names (all of which are in the LWP namespace). This does have a single, common, top-level lib directory containing only *.pm files. See its MANIFEST file for details.
- Qt4-0.99.0
  This was useful as it demonstrated all the points I made above: bzip tarball (Qt4-0.99.0.tar.bz2); multiple modules (which are neither called Qt4 nor in a Qt4 namespace); there's no common, top-level lib directory; there are multiple lib subdirectories which contain *.pm and other files. See its MANIFEST file for details.
- Authen-SASL-Perl-NTLM-0.003
  This was what you were using. It was mainly for comparison: it doesn't demonstrate any of the points I made above. See its MANIFEST file for details.
For additional testing, I made .tar.bz2, .tgz and .tar versions of libwww-perl-6.15
and .tar.gz, .tgz and .tar versions of Qt4-0.99.0.
Here's my test code:
#!/usr/bin/env perl
use 5.014;
use strict;
use warnings;
use Archive::Tar;
use List::Util qw{sum0};
use Test::More;

my @test_data = (
    {
        tarball_name_base => 'libwww-perl-6.15',
        filename_extensions => [qw{.tar.gz .tar.bz2 .tgz .tar}],
        expected_modules => [qw{
            LWP LWP::Authen::Basic LWP::Authen::Digest LWP::Authen::Ntlm
            LWP::ConnCache LWP::Debug LWP::DebugFile LWP::MemberMixin
            LWP::Protocol LWP::Protocol::GHTTP LWP::Protocol::cpan
            LWP::Protocol::data LWP::Protocol::file LWP::Protocol::ftp
            LWP::Protocol::gopher LWP::Protocol::http
            LWP::Protocol::loopback LWP::Protocol::mailto
            LWP::Protocol::nntp LWP::Protocol::nogo
            LWP::RobotUA LWP::Simple LWP::UserAgent
        }],
    },
    {
        tarball_name_base => 'Qt4-0.99.0',
        filename_extensions => [qw{.tar.bz2 .tar.gz .tgz .tar}],
        expected_modules => [qw{
            Phonon QImageBlitz Qsci Qt3Support4 QtCore4 QtCore4::classinfo
            QtCore4::debug QtCore4::isa QtCore4::signals QtCore4::slots
            QtDBus4 QtDeclarative4 QtGui4 QtHelp4 QtMultimedia4 QtNetwork4
            QtOpenGL4 QtScript4 QtSql4 QtSvg4 QtTest4 QtUiTools4 QtWebKit4
            QtXml4 QtXmlPatterns4 Qwt
        }],
    },
    {
        tarball_name_base => 'Authen-SASL-Perl-NTLM-0.003',
        filename_extensions => [qw{.tar.gz}],
        expected_modules => [qw{Authen::SASL::Perl::NTLM}],
    },
);

plan tests => sum0 map { scalar @{$_->{filename_extensions}} } @test_data;

my $map_re = qr{(?x: ^ .*? \b lib / ( [^.]+ ) )};
my $grep_re = qr{(?x: ^ .*? \b lib / [^.]+ [.] pm )};

for my $tarball_test (@test_data) {
    for my $filename_extension (@{$tarball_test->{filename_extensions}}) {
        my $tarball = $tarball_test->{tarball_name_base} . $filename_extension;
        my $tar = Archive::Tar::->new();
        $tar->read($tarball, '', {filter => qr{(?x: ^ [^/]* /? MANIFEST $ )}});
        my @manifest_lines = split /\R+/, $tar->get_content($tar->list_files());
        my @modules = map { (/$map_re/)[0] =~ s{/}{::}gr }
                      grep { /$grep_re/ } @manifest_lines;
        is("@{[sort @modules]}",
           "@{[sort @{$tarball_test->{expected_modules}}]}",
           "Testing: $tarball");
    }
}
All of the expected_modules lists were taken directly from the distribution pages already linked to.
Here are the results:
1..9
ok 1 - Testing: libwww-perl-6.15.tar.gz
ok 2 - Testing: libwww-perl-6.15.tar.bz2
ok 3 - Testing: libwww-perl-6.15.tgz
ok 4 - Testing: libwww-perl-6.15.tar
ok 5 - Testing: Qt4-0.99.0.tar.bz2
ok 6 - Testing: Qt4-0.99.0.tar.gz
ok 7 - Testing: Qt4-0.99.0.tgz
ok 8 - Testing: Qt4-0.99.0.tar
ok 9 - Testing: Authen-SASL-Perl-NTLM-0.003.tar.gz
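Stripped of the test harness, the step from MANIFEST text to module names might be sketched as below. This is a simplified variant rather than the exact regexes used above: it takes any *.pm file below a directory named lib, at the top level or deeper, and ignores trailing MANIFEST comments. The function name manifest_modules and the sample MANIFEST lines are made up for illustration.

```perl
use 5.010;    # for \R
use strict;
use warnings;

# Hypothetical helper: turn the text of a MANIFEST into module names.
# A file counts as a module if it ends in .pm and sits below a directory
# named lib, wherever that directory is in the distribution.
sub manifest_modules {
    my ($manifest_text) = @_;
    my @modules;
    for my $line (split /\R+/, $manifest_text) {
        next unless $line =~ m{^(?:.*?/)?lib/([^.\s]+)\.pm(?:\s|$)};
        (my $module = $1) =~ s{/}{::}g;       # Foo/Bar -> Foo::Bar
        push @modules, $module;
    }
    return @modules;
}

# Made-up MANIFEST excerpt: two lib directories, non-module files, a comment.
my $manifest = join "\n",
    'MANIFEST',
    'README',
    'lib/Foo.pm',
    'lib/Foo/Bar.pm    the Bar part',
    'sub/lib/Baz.pm',
    'sub/lib/Baz/notes.txt',
    'mylib/Not/Counted.pm';

say for manifest_modules($manifest);
```

Here the sample yields Foo, Foo::Bar and Baz; the file under mylib is skipped because its directory is not named exactly lib.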
Re: find module name from the module archive -- CPAN namespace navigator
by Discipulus (Canon) on Dec 10, 2016 at 13:49 UTC
hello Lotus1
you might be interested in my craziest, most controversial and unwise CUFP: CPAN Namespace Navigator
It offers you the biggest string eval around here: 13 lines of hubris!
The program downloads 02packages.details.txt and evals it, offering you a console to browse all namespaces present on CPAN.
Despite all the criticism, and after two years, I still find it cute, funny and coool!
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: find module name from the module archive
by shawnhcorey (Friar) on Dec 10, 2016 at 14:49 UTC
Not to answer your question but to comment on your regular expression: $module =~ s/(.*)-.*/$1/; This is hard to read (and not very efficient). Try this instead:
$module =~ s/\-[^-]*$//;
This will remove everything from the last minus sign inclusively to the end of the line.
Rule of thumb: Try to make all your patterns start with a character and not a wildcard.
|
|
Thanks for the suggestion. I do prefer to avoid having to use $1 for just trimming some characters. My regex worked for what I needed to do and since I'm only running this on 15 to 20 strings I wasn't worried about optimization. I find that other people's regexes are usually hard to read at first. I'm curious why you say mine is inefficient. Did you do some benchmarking?
In your regex ( $module =~ s/\-[^-]*$//; ) the backslash isn't needed for a '-'.
Rule of thumb: Try to make all your patterns start with a character and not a wildcard.
I was just reviewing the regex documentation and I didn't see that one listed. There are a lot of examples however where they don't start with a character. Are you including character classes as characters?
|
|
A rule of thumb would not be in the official documentation. Yes, character classes are better than wildcards.
The regex you have starts with .*, which will match the entire string. It will then look for the next part of the pattern, a minus sign. Since it's at the end of the string, that will fail. It will back up one character and look for the minus sign there. Not finding it, it will back up one more, etc. Not very efficient.
The backslash before the minus is not needed but I tend to program defensively. A backslash before any ASCII non-word character (anything other than a letter, digit or underscore) makes it literal. This is in case a new meta-character is added in the future.
You may be correct in that the regex I gave may not be more efficient in that it too has backtracking. A more efficient way would be:
$module =~ s/\-[^-]*+$//;
The extra plus sign stops backtracking. This regex would scan until a minus sign, scan non-minus-signs, and look for the end of the string. If it's not at the end, the entire pattern will fail and it will have to start over. That is, it will start the pattern from the beginning but from where it currently is in the string. It will scan the string in one pass without any backtracking.
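Since benchmarking was asked about, here is a small sketch that first checks that the three substitutions from this subthread agree, and only then (optionally) times them with the core Benchmark module. The labels greedy, negated and possessive are just names chosen for this example, and no timings are claimed here; results will depend on the Perl version and on the input.

```perl
use 5.010;    # possessive quantifiers and say need Perl 5.10+
use strict;
use warnings;
use Benchmark qw(cmpthese);

# The three substitutions discussed above, each applied to a copy of its argument.
my %strip = (
    greedy     => sub { my $m = shift; $m =~ s/(.*)-.*/$1/; return $m },
    negated    => sub { my $m = shift; $m =~ s/-[^-]*$//;   return $m },
    possessive => sub { my $m = shift; $m =~ s/-[^-]*+$//;  return $m },
);

my $name = 'Authen-SASL-Perl-NTLM-0.003';
say "$_: ", $strip{$_}->($name) for sort keys %strip;

# Timing is opt-in: run the script with the argument "bench".
if (@ARGV and $ARGV[0] eq 'bench') {
    my %timed;
    for my $label (keys %strip) {
        my $code = $strip{$label};
        $timed{$label} = sub { $code->($name) };
    }
    cmpthese(-1, \%timed);    # run each for at least one CPU second
}
```

On 15 to 20 short strings any difference is unlikely to matter in practice.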