The last day or two I've been learning how to prepare a module for CPAN distribution. There doesn't seem to be a good tutorial anywhere explaining how this all works. The two tutorials I found on Perl Monks are quite old. One was written in 2002 and the other in 2005. Neither discusses newer tools like Module::Build. Nor do they explain how CPAN works, let alone discuss how this process can be safely and reliably customized.

To customize the process safely, one needs to know (a) what happens only on the developer's machine, (b) what happens on the target machine, and (c) how CPAN itself uses the data in a distribution package.

There is no shortage of information, of course, but it all seems to be scattered here and there. I thought I'd write up my learnings and practical observations in case they would help others. Once one becomes familiar with something it is all too easy to forget what was hard.

I'd very much appreciate feedback from experienced CPAN developers. Are there practical tips I've missed? Did I form a misimpression or jump to a false conclusion about something because of my brief experience?

Many thanks in advance for feedback. -- beth

1. CPAN Distribution - Technical Overview

1.1 Contents of a CPAN package

A CPAN package is a tarball that is expected to have the following contents.

META.json or META.yml

A JSON or YAML file describing the package. For the specification of that file, see CPAN::Meta::Spec. The JSON file is named META.json. The YAML file is named META.yml. CPAN uses the information in this file to index the package and decide how and where to display it in CPAN.

MANIFEST

This file contains a list of all of the files in the distribution tarball.

SIGNATURE

An optional signature file calculated using the list of files in the distribution's manifest file.

Build.PL and/or Makefile.PL

These are both Perl scripts that customize build instructions to work on the target machine. Ideally both should be present.

the actual files listed in MANIFEST

The organization of the source files depends on Build.PL/Makefile.PL. Both these scripts generate files based on some rather rigid expectations about how files are organized. For example, if Build.PL is in directory "foo", it expects all Perl source files to be in "foo/lib" and all tests to be in "foo/t".
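As a rough illustration of the list above, here is a minimal sketch that checks an unpacked distribution directory for the expected pieces. The function name check_dist_dir is made up for this example; the only library call it relies on is maniread from the core ExtUtils::Manifest module. It does not look at SIGNATURE, since that file is optional.

```perl
use strict;
use warnings;
use File::Spec;
use ExtUtils::Manifest ();

# Return a list of human-readable problems found in an unpacked
# distribution directory; an empty list means the layout looks complete.
# (check_dist_dir is a hypothetical helper, not part of any CPAN tool.)
sub check_dist_dir {
    my ($dir) = @_;
    my @problems;

    my $has = sub { -e File::Spec->catfile($dir, $_[0]) };

    push @problems, 'no META.json or META.yml'
        unless $has->('META.json') || $has->('META.yml');
    push @problems, 'no Build.PL or Makefile.PL'
        unless $has->('Build.PL') || $has->('Makefile.PL');

    if ($has->('MANIFEST')) {
        # maniread returns a hash reference keyed by the listed file names
        my $listed = ExtUtils::Manifest::maniread(
            File::Spec->catfile($dir, 'MANIFEST'));
        for my $file (sort keys %$listed) {
            push @problems, "listed in MANIFEST but missing: $file"
                unless $has->($file);
        }
    }
    else {
        push @problems, 'no MANIFEST';
    }
    return @problems;
}

# Check the directory named on the command line (default: current directory)
print "$_\n" for check_dist_dir(@ARGV ? $ARGV[0] : '.');
```

Run against a freshly unpacked tarball, an empty report suggests the package has the files a CPAN client will look for.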

1.2 Downloading and installing using a CPAN client

When a distribution file is downloaded from CPAN, the installation process includes seven steps:

  1. Unpack the tarball into a directory
  2. Generate the Build script and/or makefile by running Build.PL or Makefile.PL. The choice depends on the version of CPAN installed on the target machine. Older versions of CPAN only know how to work with Makefile.PL. If the version knows how to use Build.PL it will use that. Otherwise it will use Makefile.PL.
  3. Generate source code files using scripts marked for that purpose.
  4. Generate documentation files. Either man pages, HTML, or both will be generated depending on the target system's Config.pm file. The Config.pm file is part of a Perl installation's configuration. On Debian it is found in /usr/lib/perl/5.N/Config.pm
  5. Place all source code files, raw and generated, into a staging area (by convention, called "./blib/")
  6. Run tests
  7. Copy files to their final locations and perform custom installation actions.

Steps 1&2 are handled by a CPAN client. Steps 3-6 are handled by either ./Build test or make test. The final seventh step is handled by either ./Build install or make install.

To get a sense of how to install a CPAN tarball without benefit of a CPAN client, see perlmodinstall.
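The choice between the two build paths described above can be sketched as a small function. This is only an approximation of what a client does after unpacking, under the assumption that a Build.PL-aware client always prefers Build.PL; install_commands is a name invented for this example, and "make" stands in for whichever make variant the system provides.

```perl
use strict;
use warnings;

# Given the names of the files found in an unpacked distribution, return
# the commands that would be run, preferring Build.PL when it is present.
# (install_commands is a hypothetical helper for illustration only.)
sub install_commands {
    my %present = map { $_ => 1 } @_;
    if ($present{'Build.PL'}) {
        return (
            [ $^X, 'Build.PL' ],           # step 2: generate the Build script
            [ $^X, 'Build', 'test' ],      # steps 3-6
            [ $^X, 'Build', 'install' ],   # step 7
        );
    }
    if ($present{'Makefile.PL'}) {
        return (
            [ $^X, 'Makefile.PL' ],        # step 2: generate the makefile
            [ 'make', 'test' ],            # steps 3-6
            [ 'make', 'install' ],         # step 7
        );
    }
    die "neither Build.PL nor Makefile.PL found\n";
}

# Show the commands for a distribution that ships both scripts
print join(' ', @$_), "\n" for install_commands('Build.PL', 'Makefile.PL');
```

Running the Build script through the perl interpreter ($^X) rather than as ./Build avoids depending on the script being executable.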

1.3 Build.PL vs. Makefile.PL

Build.PL generates a Perl script named "Build" but only works with newer versions of the CPAN client. Makefile.PL generates a make file that can be used even with older versions of the client. However, it is less portable because it assumes that make (or some related tool like nmake/dmake) is installed.

To complete the installation using the makefile generated by Makefile.PL, CPAN runs the commands make test and make install. This means, of course, that the installation process will fail if the new machine doesn't have make installed. This is one of the reasons why newer versions of CPAN use Build.PL if available. Since it is a Perl script it can run on any machine where Perl is installed. No third party software is needed. Some systems, like Microsoft Windows, do not have make installed as a matter of course.

Even on systems that do have make, make's use of the command shell can cause problems. Each operating system has a different preferred implementation of the command shell: C, Korn, Bourne, Bash, Ash, to name a few. There are subtle syntax differences between these shells and it is quite possible that a make file that works well on one flavor of Linux/Unix will fail on another because it relies on a different flavor of Linux/Unix shell.

1.4 Choosing packaging tools

These files are not magic. Both the Perl Build script and the make file can contain any instructions imaginable as long as they know how to understand the commands 'test' and 'install'. Thus the Perl script generated by Build.PL must be able to be called like this: ./Build test and ./Build install. The generated make file must support make test and make install.

However, handcrafting the meta files (META.json, META.yml) and writing a build script/make file generator requires a great deal of domain knowledge. Most developers therefore rely on one of four main tool kits to package up their modules:

2. Building a module with Module::Build

2.1 Arranging your files

Module::Build expects that you will be developing your code in a project directory that looks like this:

Build.PL
instructions for generating the Build command. This is a file you write. See below for details.
script/
stores your Perl scripts, i.e. your .pl files
lib/
stores your Perl modules, i.e. your .pm files
t/
stores your test scripts, i.e. your .t files
test.pl
A script responsible for running all tests. If missing, the tests in t/ will be run via TAP::Harness or Test::Harness depending on how you configure the Build.PL and Makefile.PL files. If present, the test.pl determines how to run the tests and in what order. Build test will run test.pl instead of trying to run the tests in t/ on its own.
inc/
supplemental files used by your packaging and installation process. They will be included in the tarball, but the meta data file will be set up so that they will be ignored by CPAN's indexing mechanism. For a practical use of this directory, see Module::Build::Cookbook's discussion of how to bundle Module::Build with your package.
MANIFEST.SKIP
a set of regular expressions matching files and directories to ignore in /lib, /script, /t and /inc. These files will be excluded from the tarball even though they are within the project directory.

The directories listed above should contain only the files that belong to your project. Module::Build doesn't have a good way of extracting files from a single common source tree shared by multiple projects. It assumes that all files in the lib directory belong in your project unless you specifically exclude them via a regular expression in the MANIFEST.SKIP file.

It is also essential that .pl files be placed in script/ and not lib/. When Module::Build sees .PL (or .pl in a case insensitive system) in the lib/ directory, it assumes that the file is meant to generate a module rather than be used as a script. It will run the script and put the output of the script into a file that has the same name as the script file, less the .PL suffix. Thus lib/foobar.pm.PL would be expected to generate lib/foobar.pm.
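To make the .PL convention concrete, here is a minimal sketch of what a file like lib/foobar.pm.PL might contain. It writes the target file itself and assumes that the target name either arrives as the first command line argument or can be derived by stripping the .PL suffix from the script's own name; check the Module::Build documentation for the exact calling convention. The helper name generate_module and the generated module body are made up for this example.

```perl
use strict;
use warnings;

# Write a trivial module to $target. In a real lib/foobar.pm.PL this
# body would hold whatever logic decides the generated code.
# (generate_module is a hypothetical helper for illustration only.)
sub generate_module {
    my ($target, $package) = @_;
    open my $out, '>', $target or die "Cannot write $target: $!";
    print {$out} "package $package;\n",
                 "our \$VERSION = '0.001';\n",
                 "1;\n";
    close $out or die "Cannot close $target: $!";
    return $target;
}

# Only generate when this file is really being run as a .PL script.
# Assumption: the target name is the first argument; otherwise fall
# back to this script's own name with the .PL suffix removed.
if ($0 =~ /\.PL\z/) {
    my $target = shift @ARGV;
    ($target = $0) =~ s/\.PL\z// unless defined $target;
    generate_module($target, 'foobar');
}
```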

Module names

For portability reasons, each module name component should be 11 or fewer characters. The first 8 of these must be different from any other module on CPAN. This ensures that the module will behave well on operating systems that have very short file names.
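The length rule is easy to check mechanically. The sketch below (the function name overlong_components is invented here) flags name components longer than 11 characters. It cannot check the second rule, that the first 8 characters differ from every other module on CPAN, since that needs the CPAN index.

```perl
use strict;
use warnings;

# Return the components of a module name that exceed $max characters
# (default 11). An empty list means every component is short enough.
# (overlong_components is a hypothetical helper for illustration only.)
sub overlong_components {
    my ($module, $max) = @_;
    $max = 11 unless defined $max;
    return grep { length($_) > $max } split /::/, $module;
}

for my $name ('Exception::Lite', 'Module::Build::Kwalitee') {
    my @long = overlong_components($name);
    print "$name: ", (@long ? "too long: @long" : 'ok'), "\n";
}
```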

The PAUSE documents recommend informative names over "cool" or poetic names. For more information, see the following links:

If you use an alternate organization for your projects

If you have an alternate arrangement of files, for example, storing all source code in a common tree rather than in per-project directories, you will have to move the files into place before beginning the build process. There are ways to automate this process, but they require subclassing Module::Build and adding an extra action, called 'makeproject' or 'import'.

2.2 Writing Build.PL

Build.PL is a file you write. At a minimum it contains three basic instructions: (a) loading Module::Build or a subclass (b) initializing a new builder object with project specific property values and (c) generating a Perl script named "Build".

use 5.008008; # NOT 5.8.8 - needed by CPAN testers
use Module::Build;

my $builder = Module::Build
  ->new( module_name => 'Exception::Lite'
       , license => 'perl'
       , requires => { perl => '>= 5.8.8' }
       , dist_version => '0.099_001'
       , create_makefile_pl => 'traditional'
       );
$builder->create_build_script; #generate Build

You can get a full list of parameters to pass new in Module::Build::API.

Making your Build.PL file CPAN testers friendly

The page http://wiki.cpantesters.org/wiki/CPANAuthorNotes has some helpful pointers for making it easier for CPAN testers to work with your distribution. The key points related to Build.PL and Makefile.PL are:

  • put use VERSION at the top of your Build.PL file. Although the constructor for Module::Build allows one to specify a required version of Perl, older versions of the CPAN client don't know how to read this and may try to test packages not designed for them.
  • When you specify the version number for use VERSION, use the old style version format M.mmmppp where mmm is a 3 digit 0 padded placeholder for the minor version and ppp is a three digit 0 padded placeholder for the patch/development release number. Thus 5.008008 rather than 5.8.8.
  • If your system only supports a specific set of operating systems, the Build.PL script should begin with code that dies with one of the following messages "No support for OS" or "OS unsupported". The CPAN testing tools know to look for this message and will consider the platform not applicable for any distribution that generates this message.
  • If you need threads your tests should be configured so that those tests are skipped if threads are not installed.
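The points above can be sketched as three small helpers of the kind one might put at the top of a Build.PL or a test file. All three function names are invented for this example, and the list of supported operating systems is hypothetical. The threads check uses the useithreads entry of the core Config module.

```perl
use strict;
use warnings;
use Config;

# Turn (5, 8, 8) into the old-style version string '5.008008'.
sub old_style_version {
    my ($major, $minor, $patch) = @_;
    return sprintf '%d.%03d%03d', $major, $minor, $patch || 0;
}

# Return the message CPAN testers look for, or nothing if $os is
# in the supported list.
sub unsupported_message {
    my ($os, @supported) = @_;
    return if grep { $_ eq $os } @supported;
    return "OS unsupported\n";
}

# True if this perl was built with interpreter threads. A test file
# could use this to skip thread tests instead of failing.
sub threads_available {
    return $Config{useithreads} ? 1 : 0;
}

# Hypothetical supported list; $^O is appended only so that this
# sketch runs on whatever system you try it on.
my @supported = ('linux', 'darwin', 'MSWin32', $^O);
if (defined(my $msg = unsupported_message($^O, @supported))) {
    die $msg;
}
print 'use ', old_style_version(5, 8, 8), ";\n";
print 'threads: ', (threads_available() ? 'yes' : 'no'), "\n";
```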
Distribution version numbers

The dist_version property identifies the version number for your distribution package. All distributions MUST have a version number.

If you omit the dist_version property, Module::Build will try to guess the version number by looking for a variable named $VERSION in the 'module_name' module. For the example above, had 'dist_version' been omitted, Module::Build would have looked for $VERSION in 'lib/Exception/Lite.pm'

The version number is an especially important parameter because CPAN uses it to track distribution files. It consists of three components: a major number indicating a collection of binary compatible releases; a 3 digit minor version number indicating feature enhancements within that binary compatible group, and a 3 digit patch or development release number.

If the third component is preceded by '_', CPAN counts the upload as a development release. The intended features for the minor version may be partially implemented as well. Thus '0.099_001' would be the first development release for feature set '0.099'. It is meant to be available for testing but not as a published download.

This intention is enforced softly. The CPAN distribution page marks it with a label in big red letters saying "DEVELOPMENT RELEASE". CPAN clients are encouraged not to install it as the default version even if its version number is higher than any others. It should be downloaded only if the user requests that specific version, presumably for testing purposes.

If the patch number is preceded by a '.' then it will be published and available for downloading via CPAN. For more information, see Perl::Version.

No two uploads may have exactly the same version number. If you mess up and need to reupload a distribution file, you must change the patch or development release number.
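The version scheme described in this section can be sketched as a small parser. It accepts only the two shapes discussed here, M.mmm_ppp and M.mmm.ppp, and the function name split_version is invented for this example; real CPAN version handling is more permissive than this.

```perl
use strict;
use warnings;

# Split a version string of the form M.mmm_ppp or M.mmm.ppp into parts.
# Returns a hash reference, or nothing if the string has another shape.
# (split_version is a hypothetical helper for illustration only.)
sub split_version {
    my ($version) = @_;
    my ($major, $minor, $sep, $patch) =
        $version =~ /\A(\d+)\.(\d{3})([._])(\d{3})\z/
        or return;
    return {
        major       => $major,
        minor       => $minor,
        patch       => $patch,
        development => ($sep eq '_' ? 1 : 0),   # '_' marks a development release
    };
}

my $info = split_version('0.099_001');
print "feature set $info->{major}.$info->{minor}, ",
      ($info->{development} ? 'development' : 'published'), " release\n";
```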

Configuring documentation generation

Unfortunately, there don't seem to be many options to control this process. For HTML generation there is only one user definable option, html_css:

my $oBuilder = Module::Build->new (....);
$oBuilder->html_css('MyLayout.css');

See Pod::Html and Module::Build::API for more information about setting css.

Another related issue concerns the content of pod files. The syntax and handling of the L<link_descriptor> has changed over time. Two changes in particular may cause problems:

  • Links without text fields: Some older generators assumed that any non-URL-style link without explicitly specified text was a man page. Instead of rendering the link text literally, they would replace L<foo> with "the foo man page" or "the foo documentation". As tedious as it may be, if your distribution is meant to work on older Perl installations, you may prefer to explicitly provide text for each link. In other words, your pod should use L<Module::Foo|Module::Foo> rather than just plain L<Module::Foo>.
  • URL style links cannot have link text prior to Perl 5.12. You cannot do L<foo|http://example.com/foo.html> but rather must do L<http://example.com/foo.html> without the link text.
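If you want to find the links affected by the first point, a crude scan is enough to get started. The sketch below uses a plain regular expression rather than a real pod parser, so it ignores nested formatting codes and may report false positives. The function name links_without_text is invented for this example.

```perl
use strict;
use warnings;

# Return the targets of L<...> links that have no explicit text part.
# URL-style links are left out, since they must not carry text on
# older perls. (links_without_text is a hypothetical helper.)
sub links_without_text {
    my ($pod) = @_;
    my @found;
    while ($pod =~ /L<([^<>|]+)>/g) {
        my $target = $1;
        next if $target =~ m{\A\w+://};   # skip URL-style links
        push @found, $target;
    }
    return @found;
}

my $pod = 'See L<Module::Foo>, L<Module::Bar|Module::Bar> '
        . 'and L<http://example.com/foo.html>.';
print "needs explicit text: $_\n" for links_without_text($pod);
```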
Output of Build.PL

The script generated by this simple file contains a number of default commands. In addition to the test and install commands, there are several that are generally used only by developers preparing their code for packaging.

For a list of commands, see Module::Build

Advanced Build.PL files

You can also write much more elaborate Build.PL files. This one subclasses Module::Build on the fly and adds a routine that imports project files from a single codebase source tree. The routine is very simple and would benefit from many improvements (portable path name construction, checking for deleted files, validating the copy). It is meant only for illustration purposes:

use 5.008008; # NOT 5.8.8
use Module::Build;

# Not defined in the original listing; the values are assumed from
# the paths that makeproject copies files to.
my $MODULE_NAME = 'Exception::Lite';
my $MODULE_ROOT = 'lib/Exception/Lite';

my $sClass = Module::Build->subclass(code => <<'EOF');
  my $MODULE_BASE = 'Exception/Lite';
  my @LIB_SOURCES = ('.pm', '.pod', '.t');

  sub ACTION_makeproject {
    my $oBuilder = Module::Build->current();
    my $sProjectRoot=$oBuilder->args('srctree');
    if (!defined($sProjectRoot)) {
      warn "No source tree root defined\n";
      return;
    }
    $sProjectRoot .= '/' unless ($sProjectRoot =~ m{/$});
    my $sModuleSrc = $sProjectRoot . $MODULE_BASE;

    require File::Copy;
    if (! -d 'lib') { mkdir 'lib' or die $!; }
    if (! -d 'lib/Exception') { mkdir 'lib/Exception' or die $! }
    my $sModuleLib = 'lib/' . $MODULE_BASE;

    print STDERR "Making a project <$MODULE_BASE> from <$sProjectRoot>\n"
      . "Copying files to lib ... ";
    foreach my $sSrc (@LIB_SOURCES) {
      File::Copy::copy($sModuleSrc.$sSrc, $sModuleLib.$sSrc);
    }
    print STDERR "lib/ is built\n";
  }
EOF

my $builder = $sClass
  ->new( # command line options to hard-code data needed by
         # makeproject action defined above
         get_options => {srctree => { type => '=s' }}
       , module_name => $MODULE_NAME
       , license => 'perl'
       , requires => { perl => '>= 5.8.8' }
       , test_files => [ $MODULE_ROOT.'.t']
       , dist_version => '0.099_001'
       , create_makefile_pl => 'traditional'
       );
$builder->create_build_script; #generate Build

# called on command line like this
#   perl Build.PL --srctree='/X/Y/Z/';
# makeproject command run like this
#   ./Build makeproject

Building a subclass with Module::Build->subclass(code=>...) is only practical for very short snippets of code. Code defined via the code property is compiled without benefit of strict or warnings so it is especially easy for variable name misspellings to slip through. Also syntax highlighting doesn't necessarily work in here documents (on Xemacs it all gets colored as a string) so the probability of mistakes is increased even further.

If you do choose to use Module::Build->subclass(code=>...), everything you plan to use must be placed within the here document assigned to the code property. The Build.PL file and code that is part of it is never used after Build.PL runs. In fact the code snippet that you define in the here document is simply used to generate a subclass definition file that is placed in the _build directory. Anything outside of that snippet will never make it into the generated subclass file. That is why you cannot do something like this in your Build.PL file:

{
  package MyBuilder;
  our @ISA=qw(Module::Build);
  ... my code here ...
}
my $builder = MyBuilder->new(...various arguments...);

If you need to define extensive amounts of code you are better off defining your specialist code in a dedicated subclass file and placing that file in the inc directory of your project directory. See Module::Build::Authoring for more information.

2.3 Running the Build.PL Command

As a developer there are two reasons you will want to run the Build.PL command. First, the generated Build file defines many commands that are useful to developers. Second, you will want to test your installation process and generating Build from Build.PL is part of that installation process.

To generate Build you simply type perl Build.PL in the top level of the project directory.

The Build.PL command must be run from the top level of the project directory. The script generation routines in Module::Build simply assume that "lib/", "inc/", etc are in the current directory where the script was launched. It will complain about not being able to find modules if run from any other directory.

Generating both a build script and makefile

If you want to generate both the build script and the makefile your Build.PL file can set the create_makefile_pl property in the parameter list to Module::Build->new(...).

Setting this parameter is the easiest way to generate a makefile and it will work for most simple installations. However, if your installation process is complex, you may need to take more control over this process. For details, see Module::Build::Compat and Module::Build::API's documentation on the create_makefile_pl parameter.

Deleting the generated script and starting over

Running Build.PL adds two items to the top level of the project directory:

Build
A script defining commands for use by developers and CPAN's automated installation process. This file will be regenerated each time Build.PL is run.
_build
Data files used by the Build script.

You can completely remove the Build script and the _build directory, by running the command ./Build realclean. The name of this action is a bit of a misnomer. It always removes the build script and the _build/ directory. It sometimes removes the blib/ directory, the distribution staging area, and temporary files produced during the html generation process. What determines when things are removed and when they are not is not at all clear.

It appears to never remove the following files:

  • META.yml
  • MANIFEST
  • MANIFEST.SKIP
  • Makefile.PL
  • tarballs generated by the dist action

If you want to regenerate these from scratch, you must manually remove them.

2.4 Packaging up your module for distribution

To package your module you must run the following commands in sequence:

./Build manifest
./Build disttest
./Build dist

The build script generated by Build.PL does not accept more than one action at a time so you can't combine the commands into one single action, such as "./Build manifest disttest dist". Only the first command will be run.
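Since the actions have to be run one at a time, a tiny driver script can save some typing. The following is a minimal sketch, assuming the Build script has already been generated in the current directory. The function name run_actions is invented for this example, and the runner is passed in as a code reference so the sequencing logic can be tried without actually invoking Build.

```perl
use strict;
use warnings;

# Run each action in order through $runner, stopping at the first
# failure. Returns the list of actions that completed.
# (run_actions is a hypothetical helper for illustration only.)
sub run_actions {
    my ($runner, @actions) = @_;
    my @done;
    for my $action (@actions) {
        $runner->($action) or die "Build $action failed\n";
        push @done, $action;
    }
    return @done;
}

# The real runner invokes the generated Build script once per action.
my $real_runner = sub { system($^X, 'Build', $_[0]) == 0 };

# Only attempt the real thing when a Build script is present.
run_actions($real_runner, qw(manifest disttest dist)) if -e 'Build';
```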

manifest
generates the MANIFEST file and creates a MANIFEST.SKIP file if that is missing. If the MANIFEST file exists already, it will update it.

Please note, if you decide that certain files are no longer needed by your project and you remove them from the project directory, the manifest action will not remove them from the manifest file. It will merely warn you about the missing files. You must delete them from the manifest file manually. Alternatively, you can manually delete the file and regenerate MANIFEST from scratch. Also note, the realclean action does not remove the MANIFEST or MANIFEST.SKIP files. If you want to regenerate them from scratch you must remove them manually.

disttest
collects all the files that will be placed in the tarball into a staging directory. If there is no META.yml file, it will generate it and copy it to the staging area. Then it verifies that Build.PL can be run, followed by Build test. It does not install anything.

This method will complain if it can't find a MANIFEST file so you must run the "manifest" action before running this action. It will not run it automatically for you.

The staging directory name is just the module name with each :: replaced by '-' and '-version' tacked onto the end. Thus Exception::Lite gets a distribution directory named "Exception-Lite-0.099_001".

dist
converts the staging area directory into a tarball. If the sign property is set when calling Module::Build->new and your system has Module::Signature installed, the tarball will also be signed and the results stored in a file called SIGNATURE.

During the creation process, the directory will be removed and in its place you will see a tarball. Thus the directory Exception-Lite-0.099_001 is replaced by the tarball Exception-Lite-0.099_001.tar.gz
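The naming rule for the staging directory and tarball is simple enough to write down. This sketch just restates the rule described above; both function names are invented for this example.

```perl
use strict;
use warnings;

# Module name with each '::' replaced by '-', plus '-version'.
sub dist_dir_name {
    my ($module, $version) = @_;
    (my $name = $module) =~ s/::/-/g;
    return "$name-$version";
}

# The tarball is the staging directory name plus '.tar.gz'.
sub tarball_name {
    my ($module, $version) = @_;
    return dist_dir_name($module, $version) . '.tar.gz';
}

print dist_dir_name('Exception::Lite', '0.099_001'), "\n";
print tarball_name('Exception::Lite', '0.099_001'), "\n";
```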

2.5 Additional testing options

The disttest routine only verifies that the module has the files needed to upload the module to CPAN, download it and run its tests. To make sure your module installs properly you will need to run additional tests. Additional testing may also be required to make sure that the released code fits your quality control standards.

Emulating what happens after the tarball is unpacked

To emulate what happens after the tarball is unpacked, you can run the following sets of commands:

./Build test
./Build fakeinstall

-or-

perl Build.PL --destdir /tmp/foo/
./Build test
./Build install

The first set of commands builds blib/ as normal, tests the files and generates documentation as normal. However, instead of copying the files to their final destination it merely reports on what files it would have copied and to which locations.

The second set of commands performs a real installation, but to a directory other than the normal site directory. In this case the files are installed under /tmp/foo. You can verify this by running ./Build fakeinstall. Instead of the normal site locations, the copy destinations will all be in /tmp/foo/.

Please note that the second method requires rebuilding the Build script. The destination directory is hard coded into the script and there is no option for changing the destination directory on the build script itself.

To clean out generated files and start all over you can use ./Build clean. In theory this should clean out the blib/ directory generated by the 'test' action. It is best to double check that the directory was in fact removed. For some reason, from time to time, the "blib/" directory won't go away even when this command is run.

Testing installation on systems other than your own

There is very limited support for this. If you want to test the generation of documentation that would not normally be generated on your system you can use the following two commands:

html
Generates the html documentation from pod files, printing out error messages about unresolvable links and other difficulties. This is guaranteed to generate html even if the current system does not normally request it.

Note: the 'html' action complains about being unable to resolve links to documentation pages and modules that only have a top level name. For example, links to the documentation pages for UNIVERSAL and Exception generate complaints even though these can be found on CPAN and have man pages visible via the perldoc command.

manpages
Generates the man page documentation from pod files even if the current system is not configured to request it. This action is available in version 0.28 and up only - earlier versions relied on the fact that nearly all systems were configured to request man pages.

You can control the locations where files will be installed by using the --install_path and --installdirs options. See Module::Build for details.

However, this only begins to touch on the portability issues that can affect a module. By far and away the best option is to get your module working well on your own system and then upload it to CPAN where users of other systems can download and test it. See CPAN Author Notes for more information.

Quality control testing

Module::Build's generated Build script also contains several tools for checking the quality of code, tests, and documentation. Among them:

skipcheck
prints out the files that were omitted based on rules in MANIFEST.SKIP. You can use this to eyeball the list of excluded files and verify that nothing was unintentionally excluded from your distribution by a malformed regex.
testpod
finds all of the pod files and makes sure that they are well formed.
testcover
runs the test action using Devel::Cover and generates a code coverage report. To use this, Devel::Cover must be downloaded from CPAN. It isn't part of the Perl core.
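To see why a malformed regex in MANIFEST.SKIP can silently drop files, it helps to apply the patterns by hand. The sketch below is a simplified stand-in for what skipcheck reports: it treats each non-blank, non-comment line as a regular expression and lists the files that match any of them. The function name skipped_files is invented here, and the real MANIFEST.SKIP handling in ExtUtils::Manifest has more features than this.

```perl
use strict;
use warnings;

# Return the files that match at least one pattern. Blank lines and
# lines starting with '#' in the pattern list are ignored.
# (skipped_files is a hypothetical helper for illustration only.)
sub skipped_files {
    my ($patterns, @files) = @_;
    my @regexes =
        map  { qr/$_/ }
        map  { my $p = $_; $p =~ s/\s+\z//; $p }   # trim trailing whitespace
        grep { /\S/ && !/^\s*#/ } @$patterns;
    return grep { my $file = $_; grep { $file =~ $_ } @regexes } @files;
}

my @patterns = ('# editor backups', '\.bak$', '', '^_build', '^Build$');
my @files    = qw(Build Build.PL lib/Foo.pm lib/Foo.pm.bak _build/notes);
print "skipped: $_\n" for skipped_files(\@patterns, @files);
```

Note how the anchored pattern '^Build$' excludes the generated Build script without also excluding Build.PL. An unanchored 'Build' would have dropped both.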

If you are particularly excited about quality metrics you might also want to consider using the Module::Build::Kwalitee subclass of Module::Build. For a description of the Kwalitee metrics and why they are important, see http://cpants.perl.org/kwalitee.html. Kwalitee metrics are tracked by CPANTS, an alternate testing service that should not be confused with CPAN testers.

2.6 Extensions to Module::Build

Module::Build was designed for subclassing and fortunately many developers have taken advantage of that and shared their work.

A number of extensions to Module::Build have been created to handle special application types: applications with embedded C/C++, applications with databases, applications with a web front end and so on. For a list of available modules, search CPAN.

3. Uploading your package to CPAN

To upload a module to CPAN, you need an account on PAUSE. For more information, see About Pause

4. Alternative distribution channels

The Build script generated by Module::Build also supports packaging for software distribution channels other than CPAN:

  • PAR files are the Perl analog to Java's JAR files. They bundle Perl scripts with a loader and perl interpreter. See the 'pardist' action in Module::Build, the PAR wiki and Features list from 2005 for more information.
  • PPM is the package management system for ActiveState Perl. For tools to create packages distributed via PPM, see the 'ppmdist' and 'ppd' actions.
  • Support for distributing modules as .deb packages for Debian Linux is available through Module::Build::Debian.

Updates:

  1. 2010-12-29, 7:11am IST: moved section on extensions to Module::Build into the section on building modules with Module::Build - I plan to add a top level section on Dist::Zilla recommended by several below so this doesn't make sense as a top level section.

  2. 2010-12-29, 12:30pm IST: added subsection numbers to section 1; replaced "the CPAN client" with "a CPAN client" (there is more than one); removed the word "inherently" from the phrase "inherently less portable" in section 1.3 (Build.PL vs. Makefile.PL); fixed wording in section on packaging tools (1.4) and added mention of Module::Install and Dist::Zilla.

  3. 2010-12-29, 3:00pm IST: updating discussion of development releases to include moritz's comments on development release below.

  4. 2010-12-29, 5:15pm IST: added links to perlmodinstall in 1.3 (what a CPAN client does), as an example of installing modules without the benefit of a client and another link to a document that explains the Kwalitee metrics mentioned in the section on additional testing.


In reply to RFC: How to Release Modules on CPAN in 2011 by ELISHEVA
