jacques has asked for the wisdom of the Perl Monks concerning the following question:

I've got a Perl module that has a bunch of subroutines that I want to cut out and put into a separate file. Then the main module can use those routines from that file. Simple enough.

And here's what I think is a simple question: should I place that "helper" file in the same directory as the module? Does it matter? I have noticed on CPAN that many people put additional files in a directory structure similar to this:

Some/Module.pm
Some/Module/Helper.pm

Notice that the directory has the same name as the pm file.

Should I even name the "helper" file with .pm at the end? All it holds are those routines that the main pm file uses. I will probably require the file in the module.

Re: Separate files for CPAN dist
by Joost (Canon) on Jan 06, 2005 at 20:30 UTC
    The eventual location of the file is only interesting because it is coupled with namespaces in the use and require statements:
    use Some::Module; # searches for Some/Module.pm in @INC
    The reason to put "sub-modules" in directories named after the "main" module is that it doesn't pollute other namespaces. See my CPAN module, for instance: all modules in that distribution are located (when installed) in the $some_INC_path/Audio/LADSPA directory, except for the main Audio/LADSPA.pm module. This means that using the module shouldn't have any effect on other modules in the Audio namespace, or outside of it, and can't accidentally overwrite files from other modules.

    Using namespaces "outside" of your distribution's main module's namespace is probably not a good idea, as it makes it difficult for other CPAN authors to check if that namespace is already taken.
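
A minimal sketch of the lookup described above: a package name is turned into a relative path, which is then searched for in each directory of @INC. The helper names module_to_path and find_in_inc are made up for illustration; Perl does this internally when it sees use or require.

```perl
use strict;
use warnings;
use File::Spec;

# Convert a package name to the relative path that use/require looks for.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

# Return the first matching file under @INC, or undef if there is none.
sub find_in_inc {
    my ($module) = @_;
    my $rel = module_to_path($module);
    for my $dir (@INC) {
        next if ref $dir;    # skip code-ref hooks that may live in @INC
        my $full = File::Spec->catfile($dir, $rel);
        return $full if -f $full;
    }
    return;
}

print module_to_path('Some::Module'), "\n";            # Some/Module.pm
print module_to_path('Some::Module::Helper'), "\n";    # Some/Module/Helper.pm
print find_in_inc('File::Spec') // 'not found', "\n";
```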

    update:

    Should I even name the "helper" file with .pm at the end? All it holds are those routines that the main pm file uses. I will probably require the file in the module.
    I would use a .pm file, because it's easiest to use or require, and I would really recommend putting it in a subdirectory named after the first module, for the reasons stated above.
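
A rough sketch of that layout, under stated assumptions: the script below writes a throwaway Some/Module/Helper.pm into a temporary directory, adds that directory to @INC, and then requires the helper the way the main module would. The trim routine and the temporary directory are purely illustrative; in a real distribution the file would simply live at lib/Some/Module/Helper.pm.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a throwaway lib tree: <tmp>/Some/Module/Helper.pm
my $lib = tempdir(CLEANUP => 1);
make_path("$lib/Some/Module");

open my $fh, '>', "$lib/Some/Module/Helper.pm" or die "Cannot write helper: $!";
print {$fh} <<'END_HELPER';
package Some::Module::Helper;
use strict;
use warnings;

# Hypothetical helper routine, for illustration only.
sub trim {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

1;    # a required file conventionally ends with a true value
END_HELPER
close $fh or die "Cannot close helper: $!";

# An installed distribution would already be on @INC; here we add the temp dir.
unshift @INC, $lib;

# This is what the main module would do internally.
require Some::Module::Helper;

print '[', Some::Module::Helper::trim("  hello  "), "]\n";    # [hello]
print $INC{'Some/Module/Helper.pm'}, "\n";
```

Because the helper declares its own package under Some::Module, its routines stay out of every other namespace.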

Re: Separate files for CPAN dist
by xdg (Monsignor) on Jan 06, 2005 at 22:03 UTC

    My answer is "it depends". Are the subroutines tightly coupled to your main module or are they more general utility functions that might be useful elsewhere? If the subroutines are valuable/reusable in their own right, then it might make sense to release them either in a more logical namespace, or even in a separate distribution.

    For example, consider Mail::Box. This comes in the distribution "Mail-Box", but the distribution contains modules Mail::Box and Mail::Message (and a host of others). Just because they're part of the same distribution doesn't mean they need to be in a hierarchical namespace if another namespace would make more sense.

    As another example, in the course of writing a simulation module, I found myself needing a consistent random number user interface. Rather than creating them as "My::Sim::Random::Uniform", etc., I factored out the random number code into a standalone module, Math::Random::OO, and released it separately. My simulation code (not yet released) will just list that module as a prerequisite. That keeps both modules distinct and reusable by others in a more granular way than if they were one monolithic distribution.

    So, if your helper is just breaking up your code into more manageable chunks for you, then make it a module in a subdirectory of your distribution. If your helper is tightly coupled to your code, but its purpose would be clearer with a separate name, then keep it in your distribution with a more logical directory structure. If your helper implements distinct functionality on its own that your module needs, but that others might find themselves needing, too, then release it as a separate module and just include it as a prerequisite in your main module.
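
For the last case, a prerequisite is typically declared in the distribution's Makefile.PL through the PREREQ_PM argument of ExtUtils::MakeMaker. The fragment below is only a sketch: the distribution name My::Sim and the version number are placeholders, not details of any released module.

```perl
# Makefile.PL for a hypothetical My::Sim distribution
use strict;
use warnings;
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME      => 'My::Sim',    # placeholder distribution name
    VERSION   => '0.01',       # placeholder version
    PREREQ_PM => {
        'Math::Random::OO' => 0,    # 0 means any version is acceptable
    },
);
```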

    -xdg

    Code posted by xdg on PerlMonks is public domain. It has no warranties, express or implied. Posted code may not have been tested. Use at your own risk.

Re: Separate files for CPAN dist
by brian_d_foy (Abbot) on Jan 07, 2005 at 06:28 UTC

    I've never been particularly keen on a lot of directories in the lib/ directory of my distributions. I know things like h2xs and other module tools want you to think that's a good idea, but I've found it much easier to just put everything in a flat directory.

    lib/Module.pm
    lib/Helper.pm

    I can easily grep the sources and do other multi-file sorts of things, but more importantly for me, I get to see all of the files in one place. Other people have their preferences. If you do everything right, no one is ever going to know: they'll install the module and get rid of the distribution and be none the wiser. :)

    I can install them wherever I like by telling MakeMaker where to put them. My Test::Data breaks out the functions into separate modules, although you get to them through Data.pm. You can use whatever you like for the keys and values of the PM argument to WriteMakefile().

    WriteMakefile(
        #...
        'PM' => {
            'lib/Data.pm'     => '$(INST_LIBDIR)/Data.pm',
            'lib/Scalar.pm'   => '$(INST_LIBDIR)/Data/Scalar.pm',
            'lib/Array.pm'    => '$(INST_LIBDIR)/Data/Array.pm',
            'lib/Hash.pm'     => '$(INST_LIBDIR)/Data/Hash.pm',
            'lib/Function.pm' => '$(INST_LIBDIR)/Data/Function.pm',
        },
        #...
    );

    Beyond that, it doesn't really matter.

    --
    brian d foy <bdfoy@cpan.org>
Re: Separate files for CPAN dist
by mkirank (Chaplain) on Jan 07, 2005 at 08:01 UTC
    This is of great importance when you are running your scripts under mod_perl.
    Example: you have the files

    Some/Module.pm
    SomeOther/Module.pm

    You have two Perl files, a.pl and b.pl, that use them (use Module.pm <= note: just the file name, not the directory).
    When you test this by running in single-server mode (httpd -X), you will notice that only the first one runs and the second gives an error.
    This is because the modules didn't declare a package name, so there is some sort of namespace pollution.
    Apparently the %INC hash already has a key called Module.pm and doesn't load the second module.
    This is explained in the Practical mod_perl book.
    Update
    Please check brian_d_foy's answer to this; it is explained clearly there. I had got this wrong.

      You don't use a filename with use(), and it's not just flat namespaces that have the problem.

      The problem is a combination of the inner workings of %INC and @INC. If @INC has a relative path in it, and it finds the requested module in the current working directory, it loads it. That could be a name like ./Module.pm or nested path like ./Foo/Bar/Baz/Module.pm. Later in the script (or system like mod_perl), if we try the same thing from a different directory where there is the same relative path but to a different module, %INC thinks it already loaded it. It's the path that matters, not the module file name. %INC just stores the path name. A relative path can represent multiple files, but an absolute path points to one file.
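
The effect can be reproduced in a single process, as in this hedged sketch. It creates two temporary directories that each hold a different Module.pm (made-up file contents), puts the relative entry '.' at the front of @INC, and requires the module from each directory in turn; the chdir calls stand in for different scripts running in one persistent interpreter.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Two directories, each holding a *different* Module.pm (hypothetical contents).
my %dir;
for my $name (qw(first second)) {
    $dir{$name} = tempdir(CLEANUP => 1);
    open my $fh, '>', "$dir{$name}/Module.pm" or die "Cannot write: $!";
    print {$fh} qq{package Module; sub origin { "$name" } 1;\n};
    close $fh or die "Cannot close: $!";
}

my $start = getcwd();
unshift @INC, '.';    # the relative @INC entry that causes the trouble

chdir $dir{first} or die "chdir: $!";
require Module;       # finds ./Module.pm in the first directory
my $after_first = Module::origin();

chdir $dir{second} or die "chdir: $!";
require Module;       # no-op: %INC already has an entry for Module.pm
my $after_second = Module::origin();

chdir $start or die "chdir: $!";    # step out so the temp dirs can be removed

print "after first require:  $after_first\n";     # first
print "after second require: $after_second\n";    # first, not second
print "\$INC{'Module.pm'} = $INC{'Module.pm'}\n";
```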

      This doesn't bite people with regular scripts because they don't have to worry about persistence. This bites people under mod_perl because a lot of scripts share %INC, and somebody may have already loaded a module. It only bites them if they use relative paths in @INC. If they don't do that, they should get what they expect. Modules like FindBin can help a script figure out where it is, although I prefer to simply install modules in known locations.
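
For a standalone script, FindBin can supply an absolute library path, along these lines. The lib/ directory beside the script is an assumption made for the example.

```perl
use strict;
use warnings;
use FindBin qw($Bin);    # $Bin holds the directory the script was run from
use lib "$Bin/lib";      # hypothetical lib/ directory next to the script

# The entry added to @INC is built from $Bin rather than from '.',
# so a later chdir does not change which files it refers to.
print "script directory: $Bin\n";
print "added to \@INC:    $Bin/lib\n";
```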

      Still, this doesn't have anything to do with the layout of a distribution or what its files should be named, at least no more than it does in any other situation.

      --
      brian d foy <bdfoy@cpan.org>
        Thanks for showing me the Light .. really appreciate the same :-)
Re: Separate files for CPAN dist
by jacques (Priest) on Jan 07, 2005 at 04:16 UTC
    Thank you both for your keen insight. I now know what I am going to do with it.