somekindafreak has asked for the wisdom of the Perl Monks concerning the following question:

This may be the easiest question of the month, but I can't find an answer anywhere. I have a really large module called ABC.pm. Since many people work on this module simultaneously and since the code in the module can be neatly organized into sections I have ABC.pm with
    require "ABC/stage1.pl";
    require "ABC/stage2.pl";
    require "ABC/stage3.pl";
and so on. However each library file has a header of package ABC, not package ABC::stage# since I bless an object to the ABC class and use it for all the stages.

This works fine, however it fails Perl::Critic Modules::RequireFilenameMatchesPackage and Modules::RequireBarewordIncludes checks.

Is there an acceptable method to divide the code of one large module across several files without strictly following the filename-strictly-matches-the-packagename standards? I don't mind using use or require or any filename extension. Or should I just not worry about these warnings? Thanks.

Replies are listed 'Best First'.
Re: Acceptable way to divide a long module into multiple files
by Corion (Patriarch) on Jan 11, 2011 at 21:06 UTC

    If a standard does not work for you, why don't you just disable it?
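For illustration, one way to switch the two policies off only where they fire is Perl::Critic's `## no critic` annotation. The sketch below is a hypothetical ABC/stage1.pl; the `stage1` method is a placeholder, and it assumes the filename policy reports its violation on the `package` line.

```perl
# Hypothetical contents of ABC/stage1.pl
package ABC;    ## no critic (Modules::RequireFilenameMatchesPackage)

use strict;
use warnings;

# In ABC.pm the matching include line could carry its own annotation:
#   require "ABC/stage1.pl";    ## no critic (Modules::RequireBarewordIncludes)

# Placeholder method standing in for the real stage-1 code
sub stage1 {
    my ($self) = @_;
    return 'stage1 done';
}

1;
```

A trailing annotation like this applies to its own line only, so the rest of the file is still checked.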

    Also, you could cobble together the class ABC by importing subroutines from ABC::stage1, ABC::stage2 and ABC::stage3 into it:

    package ABC;
    use ABC::stage1 ':all';
    use ABC::stage2 ':all';
    use ABC::stage3 ':all';
    1;
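For the `':all'` import to work, each stage module has to export its subroutines. A minimal single-file sketch, assuming the stage modules use Exporter with an `all` tag; the sub name `run_stage1` and the constructor are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical stage module; on disk this would be ABC/stage1.pm
package ABC::stage1;
use Exporter 'import';
our @EXPORT_OK   = qw(run_stage1);
our %EXPORT_TAGS = ( all => [@EXPORT_OK] );

sub run_stage1 {
    my ($self) = @_;
    return "stage1 ran for $self->{name}";
}

# The class itself; on disk this would be ABC.pm
package ABC;

# With separate files this would be: use ABC::stage1 ':all';
# Here the package is inline, so import() is called by hand at run time.
ABC::stage1->import(':all');

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

package main;

my $obj = ABC->new( name => 'demo' );
print $obj->run_stage1, "\n";    # the imported sub is callable as a method
```

Because the imported subs land in the ABC symbol table, objects blessed into ABC can call them as ordinary methods, while every file still matches its own package name.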

    ... or you could use a version control system that has good merging, like for example Git.

    I recommend the last option.

Re: Acceptable way to divide a long module into multiple files
by eyepopslikeamosquito (Archbishop) on Jan 12, 2011 at 07:01 UTC

    Looks like you've got yourself a Bloater code smell:

    Bloater smells represents something that has grown so large that it cannot be effectively handled.

    From Code Smells (codinghorror.com):

    Large classes, like long methods, are difficult to read, understand, and troubleshoot. Does the class contain too many responsibilities? Can the large class be restructured or broken into smaller classes?

Re: Acceptable way to divide a long module into multiple files
by ELISHEVA (Prior) on Jan 12, 2011 at 06:57 UTC
    Or should I just not worry about these warnings?

    That rule exists primarily to make sure that departures from the convention are well thought out, not because Perl will somehow malfunction if package and file name do not match. Very often mismatches are due to (a) typos in package or file names or (b) possibly a beginning programmer valuing "I feel comfortable with this name" over "I want to easily find the file that stores my class definition".

    Neither (a) nor (b) applies in this case. Your decision to split the module among several files is an intentional choice, not a typo or failure to understand the long term maintenance significance. But you still might want to think twice about splitting the module into multiple parts.

    The other reason for package=file name is that in general splitting code between multiple files is not a good idea, even if it looks that way during the green field development phase. Long term, the code tends to be harder to maintain. Follow-on programmers may not realize that there is more than one file storing the class definition unless the multi-file nature of the class definition is clearly documented.

    A second issue is that over time dependencies between methods of a class sometimes change. A split class definition needs to be carefully arranged so that each part relies only on the earlier parts in the sequence. If something in the future causes part 2 to need something in part 3, you will have to rearrange the division of code into files. This rearranging would be completely unnecessary were the code not divided among multiple files. Furthermore, rearranging that spans multiple files is always more error prone and more complex to revert than rearranging code in a single file.

    Should you decide that the pros of splitting the module definition outweigh the risks, it is important that your codebase handle these situations in a consistent and well documented manner. Preferably you should use a naming convention that makes it obvious that there are "continuation" files for a module. For example, you might reserve "name_N" for part N of a module "name".

    Personally, I prefer the header+parts approach. The header file is named after the class itself and simply loads the parts. It also contains a line reminding the maintainer that the class definition is stored in multiple files. When other projects use this module, they only know about the header file. The header file is responsible for loading all of the parts:

    Sample header file, "My/Mondo/Class.pm"

    use warnings;
    use strict;
    package My::Mondo::Class;

    # This class is defined in multiple files - see
    use My::Mondo::Class_1;
    use My::Mondo::Class_2;
    use My::Mondo::Class_3;

    The files storing the parts are defined as normal module files, with the ".pm" ending and store the actual class definition. Each of the My::Mondo::Class_N would be stored in a file named, "My/Mondo/Class_N.pm" and begin like this:

    use strict;
    use warnings;
    package My::Mondo::Class;

    # methods and variable initializations that depend only
    # on stuff declared in My::Mondo::Class or earlier parts
    # in the sequence, i.e. if this is My::Mondo::Class_2 then
    # it only references variables and methods in
    # My::Mondo::Class and My::Mondo::Class_1
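The header+parts layout can be tried end to end with a throwaway script. This is only a sketch: it writes a header and two part files into a temporary directory and loads them from there. The methods `new`, `stage1` and `stage2` are invented for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Hypothetical file contents; method names are placeholders
my %files = (
    'Class.pm' => <<'HEADER',
use strict;
use warnings;
package My::Mondo::Class;
# This class is defined in multiple files: Class_1.pm and Class_2.pm
use My::Mondo::Class_1;
use My::Mondo::Class_2;
sub new { return bless {}, shift }
1;
HEADER
    'Class_1.pm' => <<'PART1',
use strict;
use warnings;
package My::Mondo::Class;    # same package, different file
sub stage1 { return 'stage 1 done' }
1;
PART1
    'Class_2.pm' => <<'PART2',
use strict;
use warnings;
package My::Mondo::Class;
# relies only on the header and on part 1
sub stage2 { my ($self) = @_; return $self->stage1 . ', stage 2 done' }
1;
PART2
);

# Write the files into a temporary library directory
my $lib = tempdir( CLEANUP => 1 );
make_path("$lib/My/Mondo");
for my $name ( sort keys %files ) {
    open my $fh, '>', "$lib/My/Mondo/$name" or die "Cannot write $name: $!";
    print {$fh} $files{$name};
    close $fh or die "Cannot close $name: $!";
}

# Callers only ever load the header file
unshift @INC, $lib;
require My::Mondo::Class;

print My::Mondo::Class->new->stage2, "\n";
```

Each part ends in `1;` because it is loaded as an ordinary module, and the parts have no `import` method, which `use` quietly tolerates.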

    Also be aware that using variables across many files requires a nuanced understanding of the difference between lexical and global variables. See replies below. This may present training issues.

    Also be aware that at least some versions of Perl are touchy about package variables. If you split a module between two files, say "Foo_1.pm" and "Foo_2.pm", and you define a variable in "Foo_1.pm", then you have to use the fully qualified name in "Foo_2.pm", even if the current package is the same as "Foo_1.pm". This applies to both my and our variables. This issue does not apply to subroutines. They can be used without qualification in all files assigned to the "Foo" package. - (struck out due to tilly's correction below)

    # Foo.pm
    use strict;
    use warnings;
    package Foo;
    use Foo_1;
    use Foo_2;
    1;

    # Foo_1.pm
    use strict;
    use warnings;
    package Foo;
    my $HELLO;
    our $GOODBYE;
    sub hello { return "Hi!"; }
    1;

    # Foo_2.pm
    use strict;
    use warnings;
    package Foo;    #note: also package Foo

    #$HELLO='Bonjour'            #compiler complains - see below for why
    $Foo::HELLO='Bonjour';       #this is OK - see below for why

    #$GOODBYE='Au revoir'        #compiler complains - see below for why
    $Foo::GOODBYE='Au revoir';   #this is OK - see below for why

    # calling hello() without qualification is OK, so long
    # as the package is Foo (which it is)
    print hello() . ": Hello=$Foo::HELLO, Goodbye=$Foo::GOODBYE\n";

    (Note: is this a bug or intended behavior? Behavior noted in Perl 5.8.8.) - Update: definitely intended behavior - see below.

      Also be aware that at least some versions of Perl are touchy about package variables. If you split a module between two files, say "Foo_1.pm" and "Foo_2.pm", and you define a variable in "Foo_1.pm", then you have to use the fully qualified name in "Foo_2.pm", even if the current package is the same as "Foo_1.pm". This applies to both my and our variables. This issue does not apply to subroutines. They can be used without qualification in all files assigned to the "Foo" package.

      Sorry, but no.

      In your example try having hello() print out the value of $HELLO. You will find that Foo_2.pm did not alter it. Alternately try using our in Foo_2.pm and see that you can access the right $GOODBYE. And you really are accessing it without the fully qualified package name.

      Here are the relevant facts:

      • Perl has a set of package tables where variables can be stored. You can access these using the fully qualified package name (eg $Foo::HELLO), or you can access the ones in the current package if they have been declared with vars, imported with Exporter, or if you're not using strict.
      • Perl has an independent way to store lexical variables. These are declared with my, and do not exist in the package system. You cannot access them from outside the lexical scope where they are declared. (Well, unless you want to engage in some internals wizardry.)
      • Perl has a weird hybrid declared with our. This gives lexically scoped access to a package global.
      • strict.pm requires that variables be properly declared, or fully package scoped.
      So here is what happened. In Foo_1.pm you declared a lexical variable $HELLO with my. And you lexically scoped access to $Foo::GOODBYE with our. In Foo_2.pm you were out of the lexical scope of the declarations in Foo_1.pm, so you silenced any possible warning by using the fully qualified package name. This gave you access to a variable named $HELLO, but the wrong one. And you got access to the expected $GOODBYE.

      All of this is documented behavior.
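The same behavior can be reproduced in a single file, since a bare block is a lexical scope just as a file is. A rough sketch, with the two blocks standing in for Foo_1.pm and Foo_2.pm (the values are arbitrary):

```perl
use strict;
use warnings;

package Foo;

{    # stands in for Foo_1.pm
    my $HELLO    = 'elloHay';      # lexical: invisible outside this block
    our $GOODBYE = 'oodbyeGay';    # lexical alias for $Foo::GOODBYE

    sub hello { return "hi! $HELLO -> $GOODBYE" }
}

{    # stands in for Foo_2.pm
    our $GOODBYE;                  # re-declare for unqualified access
    $GOODBYE    = 'Au revoir';     # updates the one $Foo::GOODBYE
    $Foo::HELLO = 'Bonjour';       # a separate package variable, not the "my" one
}

package main;

print Foo::hello(), "\n";          # prints: hi! elloHay -> Au revoir
print "$Foo::HELLO\n";             # prints: Bonjour
```

The sub keeps seeing the lexical `$HELLO` it closed over, while `$GOODBYE` is the single package variable whichever scope touches it.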

        Thanks, tilly. I think what confused me was that I've been viewing the package as a lexical scope that could span files. Now I see that it is only a namespace and that the largest lexical scope in Perl is a file. Do I have that right?

        To illustrate what (I think) tilly is saying (because I find it somewhat abstract):

        #---------------------------------------------------------
        # Foo_1.pm
        use strict;
        use warnings;
        package Foo;

        my $HELLO='elloHay';
        our $GOODBYE ='oodbyeGay';
        sub hello { "hi! $HELLO -> $GOODBYE"; }
        1;

        #---------------------------------------------------------
        # Foo_2.pm
        use strict;
        use warnings;
        package Foo;

        # You are right Tilly - "our" doesn't reset $GOODBYE
        # and strict won't complain about an unqualified our
        # variable so long as we declare it
        our $GOODBYE;
        print "GOODBYE=$GOODBYE\n";
        # no compiler complaints from above line
        # outputs: GOODBYE=oodbyeGay

        $Foo::HELLO='Bonjour';       #this is OK
        $Foo::GOODBYE='Au revoir';   #this is OK
        print hello() . ": Hello=$Foo::HELLO, Goodbye=$Foo::GOODBYE\n";

        # in the hello() sub HELLO and GOODBYE are both unqualified.
        # the my variable (HELLO) retains its old value, but the our
        # variable (GOODBYE) reflects the new value because with
        # or without qualification it is global (in the package table)
        #outputs: hi! elloHay -> Au revoir: Hello=Bonjour, Goodbye=Au revoir
        #NOT      hi! elloHay -> oodbyeGay: Hello=Bonjour, Goodbye=Au revoir
        #NOR      hi! Bonjour -> Au revoir: Hello=Bonjour, Goodbye=Au revoir

        So, no bug.