The last day or two I've been learning how to prepare a module for CPAN distribution. There doesn't seem to be a good tutorial anywhere explaining how this all works. The two tutorials I found on Perl Monks are quite old. One was written in 2002 and the other in 2005. Neither discusses newer tools like Module::Build. Nor do they explain how CPAN works, let alone discuss how this process can be safely and reliably customized.
To answer that question, one needs to know (a) what happens only on the developer's machine, (b) what happens on the target machine and (c) how CPAN itself uses the data in a distribution package.
There is no shortage of information, of course, but it all seems to be scattered here and there. I thought I'd write up my learnings and practical observations in case they would help others. Once one becomes familiar with something it is all too easy to forget what was hard.
I'd very much appreciate feedback from experienced CPAN developers. Are there practical tips I've missed? Did I form a misimpression or jump to a false conclusion about something because of my brief experience?
Many thanks in advance for feedback. -- beth
A CPAN package is a tarball that is expected to have the following contents.
A JSON or YAML file describing the package. For the specification of that file, see CPAN::Meta::Spec. The JSON file is named META.json. The YAML file is named META.yml. CPAN uses the information in this file to index the package and decide how and where to display it in CPAN.
A MANIFEST file. This file contains a list of all of the files in the distribution tarball.
An optional signature file calculated using the list of files in the distribution's manifest file.
Build.PL and Makefile.PL. These are both Perl scripts that customize build instructions to work on the target machine. Ideally both should be present.
The organization of the source files depends on Build.PL/Makefile.PL. Both of these scripts generate files based on some rather rigid expectations about how files are organized. For example, if Build.PL is in directory "foo", it expects all Perl source files to be in "foo/lib" and all tests to be in "foo/t".
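To make the role of the metadata file a little more concrete, here is a minimal sketch of reading the name and version out of META.json text with the core JSON::PP module. The sample JSON is hand-written and heavily abbreviated, the abstract is a placeholder, and dist_identity is a hypothetical helper name, not something CPAN itself provides.

```perl
use strict;
use warnings;
use JSON::PP;    # ships with Perl since 5.14

# A hand-written, abbreviated META.json. Real files carry more fields.
my $meta_json = <<'END_JSON';
{
   "name"      : "Exception-Lite",
   "version"   : "0.099_001",
   "license"   : [ "perl_5" ],
   "abstract"  : "placeholder abstract for illustration",
   "meta-spec" : { "version" : 2 }
}
END_JSON

# Hypothetical helper: return (name, version) from META.json text.
sub dist_identity {
    my ($json_text) = @_;
    my $meta = JSON::PP->new->decode($json_text);
    return ( $meta->{name}, $meta->{version} );
}

my ( $name, $version ) = dist_identity($meta_json);
print "$name $version\n";
```

An indexer would read far more than these two fields, but name and version are the pair that identifies an upload.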
When a distribution file is downloaded from CPAN, the installation process includes seven steps:
Steps 1&2 are handled by a CPAN client. Steps 3-6 are handled by either ./Build test or make test. The final seventh step is handled by either ./Build install or make install.
To get a sense of how to install a CPAN tarball without benefit of a CPAN client, see perlmodinstall.
Build.PL generates a Perl script named "Build" but only works with newer versions of the CPAN client. Makefile.PL generates a make file that can be used even with older versions of the client. However, it is less portable because it assumes that make (or some related tool like nmake/dmake) is installed.
To complete the installation using the makefile generated by Makefile.PL, CPAN runs the commands make test and make install. This means, of course, that the installation process will fail if the new machine doesn't have make installed. This is one of the reasons why newer versions of CPAN use Build.PL if available. Since it is a Perl script it can run on any machine where Perl is installed. No third party software is needed. Some systems, like Microsoft Windows, do not have make installed as a matter of course.
Even on systems that do have make, make's use of the command shell can cause problems. Each operating system has a different preferred implementation of the command shell: C, Korn, Bourne, Bash, Ash, to name a few. There are subtle syntax differences between these shells and it is quite possible that a make file that works well on one flavor of Linux/Unix will fail on another because it relies on a different flavor of Linux/Unix shell.
These files are not magic. Both the Perl Build script and the make file can contain any instructions imaginable as long as they know how to understand the commands 'test' and 'install'. Thus the Perl script generated by Build.PL must be able to be called like this: ./Build test and ./Build install. The generated make file must support make test and make install.
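To illustrate the "not magic" point, here is a toy sketch of a script that understands 'test' and 'install' through a simple dispatch table. It is not what Module::Build generates. The action bodies are placeholders and run_action is a hypothetical name chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical action table: each supported command maps to a code reference.
my %ACTIONS = (
    test    => sub { return "would run the test files under t/" },
    install => sub { return "would copy files from blib/ to the install dirs" },
);

# Look up the requested action and run it, or die on an unknown command.
sub run_action {
    my ($action) = @_;
    my $code = defined $action ? $ACTIONS{$action} : undef;
    die "Unknown action '" . ( defined $action ? $action : '' ) . "'\n"
        unless $code;
    return $code->();
}

# Only act when a command is given, e.g.: perl Build test
print run_action( $ARGV[0] ), "\n" if @ARGV;
```

A real Build script does a great deal more, but a CPAN client only depends on those two commands being answered.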
However, handcrafting the meta files (META.json, META.yml) and writing a build script/make file generator requires a great deal of domain knowledge. Most developers therefore rely on one of four main tool kits to package up their modules:
ExtUtils::MakeMaker - the oldest of these four tools generates make files. It is used to write Makefile.PL files, often with the help of h2xs, ExtUtils::ModuleMaker
Module::Build - generates a Build script. It is used to write Build.PL files. It will automatically generate a Makefile.PL file with the help of Module::Build::Compat, if requested, as well.
Module::Build expects that you will be developing your code in a project directory that looks like this:
The directories listed above should contain only the files that belong to your project. Module::Build doesn't have a good way of extracting files from a single common source tree shared by multiple projects. It assumes that all files in the lib directory belong in your project unless you specifically exclude them via a regular expression in the MANIFEST.SKIP file.
It is also essential that .pl files be placed in scripts/ and not lib/. When Module::Build sees .PL (or .pl in a case insensitive system) in the lib/ directory, it assumes that the file is meant to generate a module rather than be used as a script. It will run the script and put the output of the script into a file that has the same name as the script file, less the .PL suffix. Thus lib/foobar.pm.PL would be expected to generate lib/foobar.pm.
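As a rough sketch of what such a generator script might look like, the following could be saved as lib/foobar.pm.PL. It assumes, as I understand the Module::Build documentation on PL_files, that the name of the file to generate is passed as the first command line argument. The generated module body and the generate_module helper are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: write a tiny generated module to the given path.
sub generate_module {
    my ($target) = @_;
    open my $out, '>', $target or die "Cannot write $target: $!";
    print {$out} "package foobar;\n",
                 "our \$VERSION = '0.01';\n",
                 "1;\n";
    close $out or die "Cannot close $target: $!";
    return $target;
}

# Assumption: the build tool passes the output file name as the first argument.
generate_module( $ARGV[0] ) if @ARGV;
```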
For portability reasons, each module name component should be 11 or fewer characters. The first 8 of these must be different from any other module on CPAN. This ensures that the module will behave well on operating systems that allow only very short file names.
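The length half of that guideline is easy to check mechanically. Here is a small sketch; overlong_components is a hypothetical helper, and it does not attempt the uniqueness check on the first 8 characters, which would need the CPAN index.

```perl
use strict;
use warnings;

# Return the components of a module name that exceed $max characters
# (default 11, per the naming guideline discussed above).
sub overlong_components {
    my ( $module_name, $max ) = @_;
    $max = 11 unless defined $max;
    return grep { length($_) > $max } split /::/, $module_name;
}

for my $name ( 'Exception::Lite', 'Exception::ExtraordinarilyLong' ) {
    my @long = overlong_components($name);
    print "$name: ", ( @long ? "too long: @long" : "ok" ), "\n";
}
```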
The PAUSE documents recommend informative names over "cool" or poetic names. For more information, see the following links:
If you have an alternate arrangement of files, for example, storing all source code in a common tree rather than in per-project directories, you will have to move the files into place before beginning the build process. There are ways to automate this process, but it requires subclassing Module::Build and adding an extra action, called 'makeproject' or 'import'.
Build.PL is a file you write. At a minimum it contains three basic instructions: (a) loading Module::Build or a subclass (b) initializing a new builder object with project specific property values and (c) generating a Perl script named "Build".
    use 5.008008; # NOT 5.8.8 - needed by CPAN testers
    use Module::Build;

    my $builder = Module::Build->new(
        module_name        => 'Exception::Lite',
        license            => 'perl',
        requires           => { perl => '>= 5.8.8' },
        dist_version       => '0.099_001',
        create_makefile_pl => 'traditional',
    );
    $builder->create_build_script; # generate Build
You can get a full list of parameters to pass new in Module::Build::API.
The page http://wiki.cpantesters.org/wiki/CPANAuthorNotes has some helpful pointers for making it easier for CPAN testers to work with your distribution. The key points related to Build.PL and Makefile.PL are:
The dist_version property identifies the version number for your distribution package. All distributions MUST have a version number.
If you omit the dist_version property, Module::Build will try to guess the version number by looking for a variable named $VERSION in the 'module_name' module. For the example above, had 'dist_version' been omitted, Module::Build would have looked for $VERSION in 'lib/Exception/Lite.pm'.
The version number is an especially important parameter because CPAN uses it to track distribution files. It consists of three components: a major number indicating a collection of binary compatible releases; a 3 digit minor version number indicating feature enhancements within that binary compatible group, and a 3 digit patch or development release number.
If the third component is preceded by '_', CPAN counts the upload as a development release. The intended features for the minor version may be only partially implemented. Thus '0.099_001' would be the first development release for feature set '0.099'. It is meant to be available for testing but not as a published download.
This intention is enforced softly. The CPAN distribution page marks it with a label in big red letters saying "DEVELOPMENT RELEASE". CPAN clients are encouraged not to install it as the default version even if its version number is higher than any others. It should be downloaded only if the user requests that specific version, presumably for testing purposes.
If the patch number is preceded by a '.' then it will be published and available for downloading via CPAN. For more information, see Perl::Version.
No two uploads may have exactly the same version number. If you mess up and need to reupload a distribution file, you must change the patch or development release number.
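The underscore rule described above can be sketched in a couple of lines. This is only an illustration of the convention, with a hypothetical helper name, not the code that PAUSE or any CPAN client actually runs.

```perl
use strict;
use warnings;

# Treat any version string containing an underscore as a development release.
sub is_development_release {
    my ($version) = @_;
    return $version =~ /_/ ? 1 : 0;
}

for my $version ( '0.099_001', '0.100' ) {
    printf "%-10s %s\n", $version,
        is_development_release($version) ? 'development' : 'published';
}
```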
Unfortunately, there don't seem to be many options to control this process. For HTML generation there is only one user definable option, html_css:
    my $oBuilder = Module::Build->new(....);
    $oBuilder->html_css('MyLayout.css');
See Pod::Html and Module::Build::API for more information about setting css.
Another related issue concerns the content of pod files. The syntax and handling of the L<link_descriptor> has changed over time. Two changes in particular may cause problems:
The script generated by this simple file contains a number of default commands. In addition to the test and install commands, there are several that are generally used only by developers preparing their code for packaging.
For a list of commands, see Module::Build
You can also have much more elaborate scripts for generating Build.PL. This one subclasses Module::Build on the fly and adds a routine that imports project files from a single codebase source tree. The routine is very simple and would benefit from many improvements (portable path name construction, checking for deleted files, validating the copy). It is meant only for illustration purposes:
    use 5.008008; # NOT 5.8.8
    use Module::Build;

    # Defined outside the here document because new() below needs them.
    my $MODULE_NAME = 'Exception::Lite';
    my $MODULE_ROOT = 'lib/Exception/Lite';

    # Note: in the real file the closing EOF must start at the left margin.
    my $sClass = Module::Build->subclass(code => <<'EOF');
    my $MODULE_BASE = 'Exception/Lite';
    my @LIB_SOURCES = ('.pm', '.pod', '.t');

    sub ACTION_makeproject {
        my $oBuilder = Module::Build->current();
        my $sProjectRoot = $oBuilder->args('srctree');
        if (!defined($sProjectRoot)) {
            warn "No source tree root defined\n";
            return;
        }
        $sProjectRoot .= '/' unless ($sProjectRoot =~ m{/$});
        my $sModuleSrc = $sProjectRoot . $MODULE_BASE;

        require File::Copy;
        if (! -d 'lib') { mkdir 'lib' or die $!; }
        if (! -d 'lib/Exception') { mkdir 'lib/Exception' or die $!; }
        my $sModuleLib = 'lib/' . $MODULE_BASE;

        print STDERR "Making a project <$MODULE_BASE> from <$sProjectRoot>\n"
            . "Copying files to lib ... ";
        foreach my $sSrc (@LIB_SOURCES) {
            File::Copy::copy($sModuleSrc.$sSrc, $sModuleLib.$sSrc);
        }
        print STDERR "lib/ is built\n";
    }
    EOF

    my $builder = $sClass->new(
        # command line options to hard-code data needed by
        # makeproject action defined above
        get_options        => { srctree => { type => '=s' } },
        module_name        => $MODULE_NAME,
        license            => 'perl',
        requires           => { perl => '>= 5.8.8' },
        test_files         => [ $MODULE_ROOT.'.t' ],
        dist_version       => '0.099_001',
        create_makefile_pl => 'traditional',
    );
    $builder->create_build_script; # generate Build

    # called on command line like this
    #    perl Build.PL --srctree='/X/Y/Z/'
    # makeproject command run like this
    #    ./Build makeproject
Building a subclass with Module::Build->subclass(code=>...) is only practical for very short snippets of code. Code defined via the code property is compiled without benefit of strict or warnings so it is especially easy for variable name misspellings to slip through. Also syntax highlighting doesn't necessarily work in here documents (on Xemacs it all gets colored as a string) so the probability of mistakes is increased even further.
If you do choose to use Module::Build->subclass(code=>...), everything you plan to use must be placed within the here document assigned to the code property. The Build.PL file and the code that is part of it are never used after Build.PL runs. In fact the code snippet that you define in the here document is simply used to generate a subclass definition file that is placed in the _build directory. Anything outside of that snippet will never make it into the generated subclass file. That is why you cannot do something like this in your Build.PL file:
    {
        package MyBuilder;
        our @ISA = qw(Module::Build);
        ... my code here ...
    }
    my $builder = MyBuilder->new(...various arguments...);
If you need to define extensive amounts of code you are better off defining your specialist code in a dedicated subclass file and placing that file in the inc directory of your project directory. See Module::Build::Authoring for more information.
As a developer there are two reasons you will want to run the Build.PL command. First, the generated Build file defines many commands that are useful to developers. Second, you will want to test your installation process and generating Build from Build.PL is part of that installation process.
To generate Build you simply type perl Build.PL in the top level of the project directory.
The Build.PL command must be run from the top level of the project directory. The script generation routines in Module::Build simply assume that "lib/", "inc/", etc. are in the current directory where the script was launched. It will complain about not being able to find modules if run from any other directory.
If you want to generate both the build script and the makefile your Build.PL file can set the create_makefile_pl property in the parameter list to Module::Build->new(...).
Setting this parameter is the easiest way to generate a makefile and it will work for most simple installations. However, if your installation process is complex, you may need to take more control over this process. For details, see Module::Build::Compat and Module::Build::API's documentation on the create_makefile_pl parameter.
Running Build.PL adds two items to the top level of the project directory:
You can completely remove the Build script and the _build directory, by running the command ./Build realclean. The name of this action is a bit of a misnomer. It always removes the build script and the _build/ directory. It sometimes removes the blib/ directory, the distribution staging area, and temporary files produced during the html generation process. What determines when things are removed and when they are not is not at all clear.
It appears to never remove the following files:
If you want to regenerate these from scratch, you must manually remove them.
To package your module you must run the following commands in sequence:
    ./Build manifest
    ./Build disttest
    ./Build dist
The build script generated by Build.PL does not accept more than one action at a time so you can't combine the commands into one single action, such as "./Build manifest disttest dist". Only the first command will be run.
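Since the actions have to be issued one at a time, a small wrapper can run them in sequence and stop at the first failure. The sketch below is one way to do that. run_build_actions is a hypothetical helper, and the default runner assumes a Unix-like system where ./Build is directly executable from the top of the project directory.

```perl
use strict;
use warnings;

# Run each Build action in order and stop at the first failure.
# $runner is an optional code ref that returns true on success; by default
# it shells out to ./Build (an assumption about the local setup).
sub run_build_actions {
    my ( $actions, $runner ) = @_;
    $runner ||= sub { system( './Build', $_[0] ) == 0 };
    my @done;
    for my $action (@$actions) {
        $runner->($action) or die "Action '$action' failed; stopping\n";
        push @done, $action;
    }
    return @done;
}

# Usage, after running perl Build.PL in the project directory:
#   run_build_actions( [qw(manifest disttest dist)] );
```

Passing the runner in as a parameter also makes it possible to dry-run the sequence without touching the project.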
Please note, if you decide that certain files are no longer needed by your project and you remove them from the project directory, the manifest action will not remove them from the manifest file. It will merely warn you about the missing files. You must delete them from the manifest file manually. Alternatively, you can manually delete the file and regenerate MANIFEST from scratch. Also note, the realclean action does not remove the MANIFEST or MANIFEST.SKIP files. If you want to regenerate them from scratch you must remove them manually.
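Before editing MANIFEST by hand it can help to list exactly which entries no longer exist on disk. Here is a minimal sketch of such a check. missing_manifest_entries is a hypothetical helper, and it makes the simplifying assumption that file names contain no spaces and that anything after the first token on a line is a comment.

```perl
use strict;
use warnings;
use File::Spec;

# Return the MANIFEST entries under $root that have no matching file on disk.
sub missing_manifest_entries {
    my ($root) = @_;
    my $manifest = File::Spec->catfile( $root, 'MANIFEST' );
    open my $in, '<', $manifest or die "Cannot read $manifest: $!";
    my @missing;
    while ( my $line = <$in> ) {
        next if $line =~ /^\s*(?:#|$)/;        # skip comments and blank lines
        my ($file) = $line =~ /^\s*(\S+)/;     # first token is the file name
        next unless defined $file;
        my $path = File::Spec->catfile( $root, split( m{/}, $file ) );
        push @missing, $file unless -e $path;
    }
    close $in;
    return @missing;
}

# Usage from the top of the project directory:
#   print "$_\n" for missing_manifest_entries('.');
```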
This method will complain if it can't find a MANIFEST file so you must run the "manifest" action before running this action. It will not run it automatically for you.
The staging directory name is just the module name with each :: replaced by '-' and '-version' tacked onto the end. Thus Exception::Lite gets a distribution directory named "Exception-Lite-0.099_001".
During the creation process, the directory will be removed and in its place you will see a tarball. Thus the directory Exception-Lite-0.099_001 is replaced by the tarball Exception-Lite-0.099_001.tar.gz.
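The naming rule for the staging directory and tarball can be written out as a short sketch. dist_dir_name is a hypothetical helper that only mirrors the rule as described here; it is not Module::Build's own code.

```perl
use strict;
use warnings;

# Build the staging directory name: '::' becomes '-', then '-version' is appended.
sub dist_dir_name {
    my ( $module_name, $version ) = @_;
    ( my $dist = $module_name ) =~ s/::/-/g;
    return "$dist-$version";
}

my $dir = dist_dir_name( 'Exception::Lite', '0.099_001' );
print "$dir\n";           # Exception-Lite-0.099_001
print "$dir.tar.gz\n";    # Exception-Lite-0.099_001.tar.gz
```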
The disttest routine only verifies that the module has the files needed to upload the module to CPAN, download it and run its tests. To make sure your module installs properly you will need to run additional tests. Additional testing may also be required to make sure that the released code fits your quality control standards.
To emulate what happens after the tarball is unpacked, you can run the following sets of commands:
    ./Build test
    ./Build fakeinstall

-or-

    perl Build.PL --destdir /tmp/foo/
    ./Build test
    ./Build install
The first set of commands builds blib/ as normal, tests the files and generates documentation as normal. However, instead of copying the files to their final destination it merely reports on what files it would have copied and to which locations.
The second set of commands performs a real installation, but to a directory other than the normal site directory. In this case the files are installed to /tmp/foo. You can verify this by running ./Build fakeinstall. Instead of the normal site locations, the copy destinations will all be in /tmp/foo/.
Please note that the second method requires rebuilding the Build script. The destination directory is hard coded into the script and there is no option for changing the destination directory on the build script itself.
To clean out generated files and start all over you can use the clean action, shown below. In theory this should clean out the blib/ directory generated by the 'test' action. It is best to double check that the directory was in fact removed. For some reason, from time to time, the "blib/" directory won't go away even when this command is run.
    ./Build clean
There is very limited support for this. If you want to test the generation of documentation that would not normally be generated on your system you can use the following two commands:
Note: the 'html' action complains about being unable to resolve links to documentation pages and modules that only have a top level name (for example, the documentation pages for UNIVERSAL and Exception generate complaints) even though these can be found on CPAN and have man pages visible via the perldoc command.
You can control the locations where files will be installed by using the --install_path and --installdirs options. See Module::Build for details.
However, this only begins to touch on the portability issues that can affect a module. By far and away the best option is to get your module working well on your own system and then upload it to CPAN where users of other systems can download and test it. See CPAN Author Notes for more information.
Module::Build's generated Build script also contains several tools for checking the quality of code, tests, and documentation. Among them:
If you are particularly excited about quality metrics you might also want to consider using the Module::Build::Kwalitee subclass of Module::Build. For a description of the Kwalitee metrics and why they are important, see http://cpants.perl.org/kwalitee.html. Kwalitee metrics are tracked by CPANTS, an alternate testing service that should not be confused with CPAN testers.
Module::Build was designed for subclassing and fortunately many developers have taken advantage of that and shared their work.
A number of extensions to Module::Build have been created to handle special application types: applications with embedded C/C++, applications with databases, applications with a web front end and so on. For a list of available modules, search CPAN.
To upload a module to CPAN, you need an account on PAUSE. For more information, see About PAUSE.
The Build script generated by Module::Build also supports packaging for software distribution channels other than CPAN:
Updates:
2010-12-29, 7:11am IST: moved section on extensions to Module::Build into the section on building modules with Module::Build - I plan to add a top level section on Dist::Zilla recommended by several below so this doesn't make sense as a top level section.
2010-12-29, 12:30pm IST: added subsection numbers to section 1; replaced "the CPAN client" with "a CPAN client" (there is more than one); removed the word "inherently" from the phrase "inherently less portable" in section 1.3 (Build.PL vs. Makefile.PL); fixed wording in section on packaging tools (1.4) and added mention of Module::Install and Dist::Zilla.
2010-12-29, 3:00pm IST: updating discussion of development releases to include moritz's comments on development release below.
2010-12-29, 5:15pm IST: added links to perlmodinstall in 1.3 (what a CPAN client does), as an example of installing modules without the benefit of a client, and another link to a document that explains the Kwalitee metrics mentioned in the section on additional testing.