 
PerlMonks  

Should you use a module to hold configuration items?

by romandas (Pilgrim)
on Sep 18, 2010 at 19:51 UTC ( [id://860654]=perlquestion )

romandas has asked for the wisdom of the Perl Monks concerning the following question:

Greetings my fellow monks,

I have recently been given the opportunity to review a software project that was written 3 or 4 years ago. The original author has since departed, but the project looks promising, and fortuitously (for me) it was written in Perl. I've been asked to look through it and document it so that we may be able to adapt it later.

The project is composed of a series of Perl scripts, using Bash scripts as wrappers to call them and pass data between them. The original author wrote several modules to handle various program tasks, all of which appear to be (to my supposed ability, such as it is) well-written though using Perl coding practices that seem older (e.g. using || die instead of or die after two-arg open statements).

What puzzles me is two modules that seem to handle configuration information, called ParseConfig and DataConfig, respectively. They do not read in a configuration file, but instead are primarily a series of package variables (with some subroutines thrown in as well) that are then used throughout all of the scripts, loaded via a use statement at the top, then referred to via their full package name (ParseConfig::somevariablename).

The two modules don't really feel like discrete entities -- almost as if they are catch-all modules instead of serving a well-defined function.

What I'm wondering is whether this is considered good Perl coding practice? I realize there are several ways to do things in Perl, but I'm trying to gauge whether this is something I would want to do or if there are better ways to do it -- and by it I mean share configuration data between several Perl scripts that do not run concurrently.

My review is not complete at this point, so take into account that I do not completely understand all the code yet, but I believe I've reviewed the majority of it.


Replies are listed 'Best First'.
Re: Should you use a module to hold configuration items?
by BrowserUk (Patriarch) on Sep 18, 2010 at 22:49 UTC
    What I'm wondering is whether this is considered good Perl coding practice?

    I think you're asking the wrong question. You should consider asking (yourself) the following:

    1. Does it work?

      Given this is an existing project, I'm gonna assume the answer is yes.

      So then you have to ask yourself: what are the risks involved with changing it?

    2. Is it difficult to maintain?

      Reading between the lines of your description, I'm gonna assume the answer is no.

      If you need to change the value of an existing configuration parameter, you go into the .pm file and edit it.

      If you need to add a new one, you edit the .pm file and add it.

      That doesn't sound particularly hard to me. In fact, it sounds identical to the process of maintaining a configuration file in some other format--say YAML.

      But--and here is the crux of my argument--when you've made changes, validation is as simple as:

      perl -c configModule.pm

      And if the editor forgets to validate, the error will be detected immediately at program startup, as a Perl syntax error, with the clarity of Perl's syntax error messages.

      Ask yourself:

      How good are the equivalent error messages for your prospective config file alternative?

      How good is the beta test cycle of your prospective config file format parser (say YAML), compared to that of the Perl parser itself?

    Many will condemn the existing mechanism as "crude". But crude is often a euphemism for 'simple'. And there is very little wrong with simple.
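
To make the discussion concrete, here is a minimal sketch of the kind of module being described: a package that holds nothing but package variables, read elsewhere by their fully qualified names. The variable names and values are invented for illustration; only the package name ParseConfig comes from the question. The package is inlined so the sketch runs as a single file.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a file such as ParseConfig.pm; the variable
# names and values below are invented for illustration.
{
    package ParseConfig;

    our $input_dir   = '/var/data/incoming';
    our $max_records = 5000;
    our @field_names = qw(id name timestamp);
}

# A script that loads the module reads the values by full package name.
printf "Reading up to %d records from %s\n",
    $ParseConfig::max_records, $ParseConfig::input_dir;
print "Fields: @ParseConfig::field_names\n";
```

When the package lives in its own .pm file, the perl -c check mentioned above compiles it without running it, which catches syntax errors introduced while editing values.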


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Changing a module is not always trivial. Where I work it involves creating a document requesting permission for a change, a process which normally takes a week or two. Of course there are emergency requests you can process without outside discussion, to handle things that break at 2 am, but the discussion of why it was needed simply follows the change rather than precedes it. Then you have to unlock the source project, check out the file, change it, check it in, and distribute the changes to the production file system.

      As Occam said: Entia non sunt multiplicanda praeter necessitatem.

        If you have configuration files, that drive the actions and correctness of your applications, that are subject to less strict procedures than your source files, you have a gaping hole in your process.

        The important question is ... is changing a config file in a different format any easier?

        Jenda
        Enoch was right!
        Enjoy the last years of Rome.

Re: Should you use a module to hold configuration items?
by eyepopslikeamosquito (Archbishop) on Sep 19, 2010 at 01:47 UTC

    The project is composed of a series of Perl scripts, using Bash scripts as wrappers to call them and pass data between them.
    It's hard to comment on this specific use without actually seeing the bash scripts and understanding the rationale for their existence. In general though, I prefer to write it all in Perl and avoid writing bash scripts.

    What puzzles me is two modules that seem to handle configuration information, called ParseConfig and DataConfig, respectively. They do not read in a configuration file, but instead are primarily a series of package variables (with some subroutines thrown in as well) that are then used throughout all of the scripts, loaded via a use statement at the top, then referred to via their full package name (ParseConfig::somevariablename).
    I would describe using package variables like this as vulgar and amateurish. More orthodox is to use a hash, as the Perl core Config and Net::Config modules do. Making the hash readonly is desirable because changing state in globals tends to make systems hard to maintain. Allowing the user various ways to override default configuration settings is also common; for example, Net::Config uses a .libnetrc file in the user's home directory.

    Finally, in the interests of loose coupling and high cohesion, you should strive to make it clear which subset of these many configuration variables are actually used by each of your modules. A simple way to do that is to have a module take a hash of attributes in its constructor; the module itself uses those attributes only to get its work done and never peeks at the global package variables.
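
The two suggestions above (a read-only configuration hash, and modules that receive their settings through the constructor) could be sketched roughly as follows. This is only an illustration: My::Config, My::Logger and the keys are hypothetical, and lock_hash from the core Hash::Util module is used here as one way to get a read-only hash, not necessarily the one the reply had in mind.

```perl
use strict;
use warnings;

# Hypothetical central configuration; keys and values are invented.
{
    package My::Config;
    use Hash::Util qw(lock_hash);

    our %Config = (
        db_host  => 'localhost',
        db_port  => 5432,
        log_file => '/tmp/app.log',
    );
    lock_hash(%Config);    # later writes and unknown keys now die
}

# Hypothetical consumer that never looks at the global hash itself.
{
    package My::Logger;

    # The constructor receives only the attributes this class needs.
    sub new {
        my ($class, %attr) = @_;
        my $self = { log_file => $attr{log_file} };
        return bless $self, $class;
    }

    sub log_file { return $_[0]{log_file} }
}

my $logger = My::Logger->new(log_file => $My::Config::Config{log_file});
print "Logging to ", $logger->log_file, "\n";

# An accidental write to the locked hash is caught instead of succeeding.
my $ok = eval { $My::Config::Config{db_port} = 1; 1 };
print "Write rejected: $@" unless $ok;
```

Passing attributes explicitly makes it visible at the call site which settings each module depends on.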

Re: Should you use a module to hold configuration items?
by CountZero (Bishop) on Sep 18, 2010 at 20:03 UTC
    Everything depends of course.

    Personally I would probably have used an external configuration file (I am a fan of YAML) but I can see the attraction of having your "global" variables in a separate module under their own namespace.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Should you use a module to hold configuration items?
by Your Mother (Archbishop) on Sep 18, 2010 at 22:07 UTC
    whether this is considered good Perl coding practice?

    It's bad practice and generally shows a lack of maturity in the developer which is why it seems to be pretty common. Though, as CountZero said, it depends.

    The sniff test is something like-

    • Is the code all config? Something like a DBIx::Class::Result schema file is basically pure configuration which changes infrequently, if ever, so it should be code because moving it to config would gain nothing and would complicate its use.
    • Does the config make it hard to change, repurpose, deploy, or edit code? Then it should be in config files.

      I'm not sure it's necessarily bad practice. If it's user-level configuration, there should probably be a user-level interface to change it. If it's really more a set of constants gathered together similar to a C header file that may change but only rarely and at the hands of a programmer, then I think a module is a pretty good idea.

      Package variables or constants can be very handy. Since it's older it might not have been written with the benefits of our, or the author may have wanted to be extra clear on where those values reside. IMO there's not much reason to bring in YAML, JSON, XML, Windows-style INI files, or some other non-Perl language to parse if they are just used as common constants unlikely to change.

      If the end user is not a programmer and is being expected to change the modules, that's asking for some level of frustration. That frustration could spread to many people by the end of it, too. There should be a well-documented interface from the user interface to do user-level configuration. Whether that's a very simple .rc file, command-line options, a separate configuration program, a menu in the main program, or whatever, it shouldn't be in the code if a less technical end-user is meant to change it. That's not always the case, though.

Re: Should you use a module to hold configuration items?
by shawnhcorey (Friar) on Sep 18, 2010 at 22:21 UTC

    A lot would depend on who would be changing the configuration. If only programmers, then putting it in a common module would make things easier; they would already know the syntax. If users were changing the configuration, then having an interface to validate their input would be best. Turning them loose with nothing but a text editor in your files is nothing but a recipe for disaster. But then you could have your interface rewrite the configuration files with something like Data::Dumper. Just remember that not using Perl modules for configuration means everyone has to learn yet another syntax.
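
As a rough sketch of the Data::Dumper idea: an interface could validate the user's input, write the resulting structure out as Perl source, and the scripts could read it back with do. The settings and the temporary file are invented for illustration. Note that do executes whatever the file contains, so this only suits files that nothing else is allowed to edit.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# Hypothetical settings, as they might arrive from a validating interface.
my %settings = (
    report_dir => '/srv/reports',
    retries    => 3,
);

$Data::Dumper::Sortkeys = 1;    # stable key order between rewrites

# Write the structure out as Perl source ("$VAR1 = { ... };").
my ($fh, $filename) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} Dumper(\%settings);
close $fh or die "Cannot close $filename: $!";

# Read it back: the file's last statement yields the hash reference.
my $loaded = do $filename;
die "Cannot load $filename: " . ($@ || $!) unless ref $loaded eq 'HASH';

print "retries = $loaded->{retries}\n";
```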

Re: Should you use a module to hold configuration items?
by talexb (Chancellor) on Sep 19, 2010 at 01:34 UTC

    Best Practice? No. Unacceptable? No, as long as these modules only have configuration data, and are marked as configuration files in the packages that they come from.

    Purists would say configuration information should never live in a module; if you're writing something new from scratch, I'm a fan of YAML, but not everyone likes it. For something that already exists, configuration info in a module is fine.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Re: Should you use a module to hold configuration items?
by cdarke (Prior) on Sep 19, 2010 at 14:27 UTC
    If it ain't broke, don't fix it. The reason for those modules might have been that they were used by other scripts: perhaps they never got written, or they faded away, or you are not aware of them.

    But... using || die instead of or die after two-arg open statements might be broken, depending on parentheses: without them, || binds to the filename argument rather than to the result of open, so the die never fires.
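
The precedence point can be checked with a small experiment. The sketch below uses three-argument open with lexical filehandles rather than the two-arg form in the original code, and assumes the path shown does not exist; the sub names are invented.

```perl
use strict;
use warnings;

my $missing = '/no/such/dir/no-such-file.txt';    # assumed not to exist

# Each sub tries to open the missing file and reports whether die fired.

sub double_pipe_without_parens {
    # Parsed as open(my $fh, '<', ($missing || die ...)): the path is a
    # true value, so die is never reached and the failure goes unnoticed.
    return eval { open my $fh, '<', $missing || die "open failed\n"; 1 }
        ? 'silent' : 'died';
}

sub double_pipe_with_parens {
    # The parentheses end open's argument list, so || tests open's result.
    return eval { open(my $fh, '<', $missing) || die "open failed\n"; 1 }
        ? 'silent' : 'died';
}

sub low_precedence_or {
    # 'or' binds more loosely than the list operator, so it tests open's result.
    return eval { open my $fh, '<', $missing or die "open failed\n"; 1 }
        ? 'silent' : 'died';
}

print "|| without parentheses: ", double_pipe_without_parens(), "\n";
print "|| with parentheses:    ", double_pipe_with_parens(),    "\n";
print "or without parentheses: ", low_precedence_or(),          "\n";
```

Only the first form lets the failed open pass silently, so code written as open(...) || die is fine while open ... || die deserves a second look.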

    You did not mention use warnings; and use strict;. A lot of older code did not bother with these, so if you are looking for bugs that might be a place to start. It might also help rooting out pesky package variables.

    When tasked with learning a system like this in the past I started by writing documentation for everything. That way your work may benefit others, and you will have something to show Management for the time you spent.
Re: Should you use a module to hold configuration items?
by sundialsvc4 (Abbot) on Sep 19, 2010 at 16:21 UTC

    Some projects do use “executable code” to hold configuration data.   (Sometimes, the modules are programmatically generated.)   My wisdom is, “if it works now, don’t change it now.”   Pick your battles carefully, like a good triage nurse at an E.R.   You might not like the way it was done, but it does not appear to be screaming in pain...

    TMTOWTDI... and only two that count:   the one that works, and the other one that also works.

      You could, cautiously, say that such a design is “elegant,” if the design that you are stuck with is that “there are a bunch of separate scripts.”   (Whether or not you think that having “there are a bunch of separate scripts” qualifies as “elegant” is beside the point.)   All of the configuration options become available with one simple use statement.   No separate parsing is required:   the .pm file is the configuration file.   If the options, once set, almost never ever change, that works.

Re: Should you use a module to hold configuration items?
by dsheroh (Monsignor) on Sep 19, 2010 at 12:27 UTC
    I'm not real crazy about the use of discrete globals rather than a single config hash, but I have no problem with putting the configuration into a Perl module for the simple reason that it allows the programmer to push the task of finding the configuration off onto perl rather than having to look for it himself.
Re: Should you use a module to hold configuration items?
by JavaFan (Canon) on Sep 20, 2010 at 14:33 UTC
    What I'm wondering is whether this is considered good Perl coding practice?
    Note that you're asking two questions. One question is "do I store configuration data in config files, or as Perl code". The second is "do I make this data available via fully qualified variable names".

    Let's answer the second question first. It's not my first choice. I prefer having a single module provide the configuration settings that extend beyond the scope of a single file, but then I'd rather import variables (or constant subs) into my name space. Or sometimes a hash with all the settings (like Config.pm). For large projects, I may group the settings, allowing them to be imported by tags (Fcntl.pm does this as well).

    For the first question, it depends. For some settings, it makes more sense to store them outside of the code - for instance, settings that may vary from machine to machine (you may have a webserver farm, not all of them connecting to the same physical database - you may want to store connection settings in a config file, whose content varies from machine to machine). Or you have a user application, allowing each user their own settings - think of the preferences file of your browser. Other settings you really do not want to be configurable in a separate file. Think for instance positions in a bitfield. Or the value of π. Or settings that are calculated, or derived from others.
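
A minimal sketch of importing settings by tag, using the core Exporter module. My::Settings, its variables and its tags are hypothetical. The package is defined inside a BEGIN block, with an %INC entry, only so that the sketch runs as a single file; in a real project it would sit in My/Settings.pm.

```perl
use strict;
use warnings;

# Hypothetical settings module; names, values and tags are invented.
BEGIN {
    package My::Settings;
    use Exporter qw(import);

    our $DB_HOST = 'db.example.com';
    our $DB_PORT = 5432;
    use constant MAX_RETRIES => 3;    # a constant sub

    our @EXPORT_OK   = qw($DB_HOST $DB_PORT MAX_RETRIES);
    our %EXPORT_TAGS = (
        db     => [qw($DB_HOST $DB_PORT)],
        limits => [qw(MAX_RETRIES)],
    );

    $INC{'My/Settings.pm'} = __FILE__;    # mark the module as already loaded
}

# A script names the groups of settings it actually uses.
use My::Settings qw(:db :limits);

printf "Connecting to %s:%d (up to %d attempts)\n",
    $DB_HOST, $DB_PORT, MAX_RETRIES;
```

The import list doubles as documentation of which settings a given script depends on.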

Re: Should you use a module to hold configuration items?
by Anonymous Monk on Sep 21, 2010 at 16:55 UTC
    An item that nobody seems to have touched on is related to security. .pm files can contain code that will be run. The entries in YAML/.ini files won't be executed unless you decide to eval them.

    Who are the intended editors of your config file? Are they just changing values or are they authorized to change/create logic?

    Write rights to a .pm file convey more privilege than rights to YAML/.ini type files.

Re: Should you use a module to hold configuration items?
by techcode (Hermit) on Sep 21, 2010 at 12:04 UTC
    When I was starting out, I used plain global variables for those things. Which of course meant that once the code was moved to a real server (as opposed to running it on my own dev machine) I needed to change those. Obviously that's a rather lacking solution - so I turned to using config files.

    I'm not sure what module from CPAN I used (tried several), used it for a time, but actually got sick of having to watch out so I don't override the config files on the production server with the ones from my development machine. And this is especially boring since if you want to use SVN/GIT - you can't have the production being a working folder and just do "svn up" to update it. You have to export it somewhere else, then copy all but the configuration files.

    That also leaves a problem of what to do when you have to update the config with some new variables and such.

    So then I actually started doing exactly what you wrote - having a Config.pm file holding just the config values. Depending on the need it's either just a couple of exported globals (if it's just a one Perl file script I don't even have a module - I just set the values in the script itself), or a dummy function returning just a hash(ref) with all the config values.

    Of course the trick being to have two (or more) sets of configs and figuring out which one you want with the help of "use Sys::Hostname;" and its "hostname" function.

    That way I have all the configurations for all the locations where the code would run - all in one place - one .pm file. And I get to just "svn up" (or "svn export" if I can't afford .svn directories/files all around) on the production machine - and don't have to think about what files I shouldn't copy or such. And there is another + that in case I (dirtily but quickly) fix an error on the production server - I can just commit it back to the repository.

    I guess that could be a bad thing if you had a lot of servers - but even then you could make a module load a config file based on hostname - or throw out an error if it can't.
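
The hostname-based selection described above might look something like this. It is only a sketch: the host names, keys, values and the config_for helper are invented, and Sys::Hostname is the core module mentioned in the post.

```perl
use strict;
use warnings;
use Sys::Hostname qw(hostname);

# Hypothetical per-host settings; host names and values are invented.
my %CONFIG_FOR = (
    devserver  => { db_name => 'app_dev',  debug => 1 },
    prodserver => { db_name => 'app_prod', debug => 0 },
);

# Return the settings for the given host (default: this machine),
# or die if no settings are defined for it.
sub config_for {
    my ($host) = @_;
    $host = hostname() unless defined $host;
    my $config = $CONFIG_FOR{$host}
        or die "No configuration defined for host '$host'\n";
    return $config;
}

my $dev = config_for('devserver');
print "devserver uses database $dev->{db_name}\n";

# On a machine that is not listed, the lookup fails loudly.
my $here = eval { config_for() };
print defined $here ? "Found settings for this host\n" : "This host: $@";
```

Dying on an unknown host avoids silently running production code with development settings, or the reverse.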

    Oh, and you can also change the @INC path that way. Though I try to have everything relatively placed (so that "use lib './PerlLibs/';" would work no matter what the server) or to configure Perl for the "whole server", I've also run into occasions when this was needed:

    #!/usr/bin/perl
    use strict;
    use warnings;

    BEGIN {
        use Sys::Hostname;
        require lib;

        # A 'use lib' inside if/else takes effect at compile time regardless
        # of the condition, so call lib->import while the BEGIN block runs.
        my $hostname = hostname;
        if ($hostname eq 'devserver') {
            lib->import('/foo/bar');
        }
        else {
            lib->import('/bar/foo');
        }
    }

    # ... and Some/Module.pm is found in the path set in the BEGIN block
    use Some::Module;

    Have you tried freelancing/outsourcing? Check out Scriptlance - I work there since 2003. For more info about Scriptlance and freelancing in general check out my home node.
Re: Should you use a module to hold configuration items?
by jakeease (Friar) on Sep 21, 2010 at 02:42 UTC

    I have worked on a similar project. A nightly cron job gathers a lot of data that doesn't change frequently, from databases and corporate sites outside our server pool. The cron script rewrites a large module used by many other modules. It frees the database load to deal with more volatile data and less commonly accessed data.

    It's fast and complex, but modifiable. Not a best practice perhaps, but it does its job well.

Re: Should you use a module to hold configuration items?
by Anonymous Monk on Sep 20, 2010 at 23:32 UTC
    Sometimes it is desired to share a single configuration source amongst multiple programmatic users (some of which may not be Perl code). That's a potential disadvantage to keeping literals in a .pm. Can be nice for mod_perl though.
Re: Should you use a module to hold configuration items?
by sundialsvc4 (Abbot) on Oct 25, 2010 at 22:26 UTC

    One general comment I’d make, in any case, is that:   you should also consider how these configuration options will be queried and used, no matter how you store them.   I like to build a Settings.pm module (say...) which contains “the code to answer any question that this application might need to ask about|using its configuration settings.”   This module also encapsulates the process of determining what the configuration-settings are.   So, if the settings come from a database or an external file or what-have-you, this module will contain the logic to get them from that source.   Likewise, if the settings (or some of them) come from a Perl-module that is used, this module contains that use statement (and does not contain the settings statements themselves).

    My Settings.pm module is, as is usual for me, “very suspicious.”   It not only gets the settings (by whatever means), but also thoroughly examines them.   All of the clients of that module, not only know where to go for the answers that they need, but also have some reason to believe that those answers will be correct.   Most of these checks are encapsulated in an initialization-routine that is called by the application at startup time.
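
A bare-bones sketch of such a "suspicious" settings module follows. Everything in it is hypothetical: My::RawConfig stands in for whatever source the settings really come from, and My::AppSettings, its checks and its accessor names are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical raw source of settings. In a real project this might be a
# file, a database, or a Perl module loaded with 'use'.
{
    package My::RawConfig;
    # spool_dir is the current directory so the check below passes anywhere.
    our %Values = ( workers => 4, spool_dir => '.' );
}

{
    package My::AppSettings;

    my %settings;

    # Called once at startup: fetch the settings, then examine them.
    sub init {
        %settings = %My::RawConfig::Values;

        my $workers = $settings{workers};
        die "workers must be a positive integer\n"
            unless defined $workers && $workers =~ /\A[1-9][0-9]*\z/;

        my $dir = $settings{spool_dir};
        die "spool_dir is missing or not a directory\n"
            unless defined $dir && -d $dir;

        return 1;
    }

    # Clients ask questions instead of reading raw values.
    sub worker_count { return $settings{workers} }
    sub spool_path {
        my ($class, $name) = @_;
        return "$settings{spool_dir}/$name";
    }
}

My::AppSettings->init;
print "Workers: ", My::AppSettings->worker_count, "\n";
print "Spool file: ", My::AppSettings->spool_path('job1.dat'), "\n";
```

Because callers only see the accessor subs, the storage behind init can later move to a file or database without touching them.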

Node Type: perlquestion [id://860654]
Approved by bingos
Front-paged by moritz