I got started thinking about this by an Anonymous Monk's post, "using the Mailer / sendmail perl module...", and by jettero's reply. This seems to be a broader issue than just a couple of modules, as there are a lot of modules that overlap at least partially in what they do.

It seems there are plenty of criteria to balance. Speed, clarity, reliability, documentation, examples, tutorials, opportunities for outside help, compatibility/platform concerns, frequency of module updates, core/non-core, portability, similarity in calling style to the rest of a project, and conceptual integrity are surely an incomplete list.

In the specific example of mail sending modules, whether they speak SMTP directly or pipe through something like sendmail, there are several. I tend to use Net::SMTP or Net::SMTP::TLS for my outgoing mail, and I have no problems. Mail::Mailer might be a perfectly good module, but I, personally, have never seen much code using it nor much documentation on it outside what comes with it. Other modules seem to get many more mentions. I haven't done a comprehensive survey nor gathered many statistics, but I do have some basic comparisons using basic tools for the three modules mentioned in the inspiring thread.

                                                         Net::SMTP  Mail::Mailer  MIME::Lite
# results, Google's Code Search [1]                      7000       2000          3000
search of perlmonks.org using Google shows               567        199           1040
regular Google search for "<Namespace>/<Module>.pm" [2]  3060       1760          2230
Ratings [3]                                              5          2.5           4.5
Has a Reviews entry on Perlmonks?                        no         no            yes
core module? [4]                                         yes        no            no

[1] Yes, Google reports these overly round numbers for its code search feature.
[2] Because colons in phrases and Google are apparently not on speaking terms.
[3] The voting samples are entirely too small for this to mean too much.
[4] Checked against 5.8.8 and 5.9.5.

A big difference in mentions could make a difference in getting up to speed with a module quickly, but it doesn't seem like Mail::Mailer is exactly hurting for examples or discussion. The other options just have even more.

I've always been able to muddle through picking a module, weighing these things ad hoc in my own mind. I often download more than one module and try them briefly. Sometimes I still end up porting a project from one module to another after some growth or specification changes.

I'm still left wondering if there's a better or more accepted way to determine which module to use up front. Choice is good, but is there some set of criteria people weigh and compare that can give someone a bit more confidence in their choices? Experience is good, but is there a good way to pass experience with module choices on to others?

The reviews and ratings systems would appear to help, but they aren't as complete as they could be. They could also overlook a specific detail about a project that calls for a less popular module that supports a certain project's needs better. Does the best advice really come down to reading the full docs of each module in question and asking the community which is best for a given task?

Is there a resource out there that has a checklist comparison for the major modules in any given application area? If there is, we could use it as a model to do so for more areas. That way when modules focus on different parts of a task, a programmer could see more quickly which module might meet the specific needs of a project. Some generic data could be core/non-core, stable/non, OO/procedural interface, depends on other modules?, depends on other applications?, actively developed?, uses XS?, and maybe more. Then, features important to the application area could be noted.
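To make the checklist idea concrete, here is a minimal sketch, not a finished tool. It assumes Module::CoreList is available (it ships with recent perls) and uses it to fill in the core/non-core column mechanically. The ratings and review flags are copied from the table above; the sub names `core_status` and `checklist_rows` are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Module::CoreList;

# Figures copied from the comparison table earlier in this post.
my %checklist = (
    'Net::SMTP'    => { rating => 5,   pm_review => 'no'  },
    'Mail::Mailer' => { rating => 2.5, pm_review => 'no'  },
    'MIME::Lite'   => { rating => 4.5, pm_review => 'yes' },
);

# first_release() returns the first perl version that shipped the module,
# or undef if the module has never been part of the core distribution.
sub core_status {
    my ($module) = @_;
    my $since = Module::CoreList->first_release($module);
    return defined $since ? "yes (since $since)" : 'no';
}

# One formatted line per module, sorted by module name.
sub checklist_rows {
    my @rows;
    for my $module (sort keys %checklist) {
        my $entry = $checklist{$module};
        push @rows, sprintf "%-14s core: %-22s rating: %-4s PM review: %s",
            $module, core_status($module),
            $entry->{rating}, $entry->{pm_review};
    }
    return @rows;
}

print "$_\n" for checklist_rows();
```

Other generic columns (OO/procedural, uses XS, actively developed) would still have to be entered by hand or scraped from somewhere, which is the hard part.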

Are any of these ideas worthwhile, or should I just stick to muddling through ad-hoc comparisons?

Re: How does one choose among modules?
by dragonchild (Archbishop) on Oct 08, 2007 at 19:38 UTC
    Pick the one your best friend uses. If it doesn't break, you're good. If it does, try another one. Don't fret - the best is the enemy of the good.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?

      "Pick the one your best friend uses."

      That's some of the best advice on this site I think.

      -Paul

Re: How does one choose among modules?
by GrandFather (Saint) on Oct 08, 2007 at 19:56 UTC

    It often comes down to hammers - you use the tool you are familiar with and with which you can get the job done. Sometimes that means you need to change the shape of your nail a little. ;)

    If I'm looking for a module in a new domain it seems most often to be resolved in one of three ways:

    1. I search CPAN and there is one clearly most suitable module
    2. I search CPAN and there are a bundle of modules that may do the job so I install the most likely looking ones and try them
    3. I search CPAN and don't find anything that looks right so I ask in the Chat Box

    Once I've found my hammer, then I use it to bang all sorts of likely looking nails.

    If I don't find a module at all then sometimes I write one. .oO(I ought to polish up some of those modules a little and put them on CPAN.)


    Perl is environmentally friendly - it saves trees
Re: How does one choose among modules?
by graff (Chancellor) on Oct 09, 2007 at 05:19 UTC
    Counting mentions is okay, I guess, but there can be cases where something is just "famously bad", so that basing a decision on statistics alone would be unfortunate.

    For the programmer facing the need to solve a problem, I don't think there's much we can expect in terms of avoiding an ad-hoc approach, but when it comes to breaking into a territory where numerous CPAN modules have already gone before, there is a progression of steps that should lead (more or less directly) to a good choice:

    1. Compare documentation for readability: if a module's description of usage is hard to understand, you'll probably have trouble using it; if the explanations are clear and well organized with suitable attention to detail, there's a good chance the code was written the same way. **

    2. Compare documentation for relevance to your task: among the modules whose descriptions you find readable, you'll see different ideas about how to approach the underlying problems, different ways of organizing resources, even different assumptions about the goals and constraints of the caller's application. Sometimes you'll pick a module because it fits well with what you're doing, and sometimes you'll rethink what you're doing to fit a module that looks good to you.

    3. Compare documentation for ... (what can we call it?) "dedication" or "gravitas" or "grown-upness" or ... : check for things like depth of revision history, references to published standards or RFC's, and maybe just some subjective "gestalt" sense about whether the module author really "gets it".

    4. Once you have a ranked list based on reading the docs, the next hurdle is ease of installation. Usually this isn't really a discriminator -- the majority of modules install easily -- but if your first choice turns out to depend on half a dozen other modules that you didn't think would be relevant (and that you haven't thought to review yet), and especially if one or more of those modules happen to fail "make test", you might consider moving on to the next candidate.

    But let's face it -- a lot of us would just rather go to PerlMonks and post a SoPW node or mention a question in the CB, and there's nothing wrong with that, except maybe that we still end up (as often as not) with a lot of alternatives to choose from, and we're back to step 1.

    ** update: My apologies about that first point... it may have come across as being "Anglocentric" or prejudicial against excellent programmers who have better things to do than beat their heads against English grammar. I want to be clear that I'm talking about how specifications are outlined and how sample code is shown -- not about correct spelling, verb tenses, plurals or even sentence word order. If the docs are hard to follow because of limited English skills, it's worth looking at the code to see whether it's just a problem with English, and not a problem with logic.

    For better or worse, Perl is a language dominated by English, but putting aside its function names and reserved words, its flexible syntax should seem fairly "familiar" to speakers of any human language, given an adequate amount of formal education.

Re: How does one choose among modules?
by bart (Canon) on Oct 09, 2007 at 11:11 UTC
    You left one thing out: recommendations. Just barge into the Chatterbox and ask which module you should be using. Many people must already have been where you are now, so you can build on their experiences.

    And you know you can trust our judgments. ;-)

      To follow up on bart's suggestion -- go check out the IRC channel for the modules competing for your attention.

      I was interested in checking out lightweight web servers recently, and I found the lighttpd channel occupied by over a hundred users, while Cherokee had just half a dozen users. Having said that, the one user I chatted with in the Cherokee channel was friendly and quite knowledgeable.

      That probably shouldn't be a deciding factor, but it's good to know, in case you need somewhere to go for support.

      Alex / talexb / Toronto

      "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Re: How does one choose among modules?
by clinton (Priest) on Oct 09, 2007 at 13:16 UTC
    This is a problem that has irked me for a long time, and I have (somewhere in the back of my head) a plan to do something about it... when I get the time ;)

    I'm thinking about a meta-CPAN website, with modules classified by functionality, and with all of the data available about each module aggregated into one place (e.g. CPAN ratings, release frequency, kwalitee, DrHyde's dependency trees).

    So maybe the ability to compare a number of modules based on their lists of stated functionality

    The classic example is the myriad date/time modules - while the DateTime collection is considered the gold standard, there are a number of other modules that are more light-weight and suited for particular purposes. Or the XML modules. Different modules for different purposes.

    Maybe also an interface of "If I want to do X, which modules should I use, and why?"

    The problem (other than getting the site built in the first place), is that, in order for it to be useful, it needs a lot of user input. CPAN ratings are useful (if a bit non-specific), but there just aren't that many of them.

    Does anybody have any thoughts about this? Something you like? Dislike? What would you like to see?

    I may even get around to writing it sometime.

    Clint

      I've thought about this sort of thing too. The problem is the only definitive information about a module is its POD (and code). Unless you try it, or you're very good at reading code, the only way to select a module is by what its POD claims to do. Those claims may be incorrect, poorly communicated, or misunderstood. Additionally, it may be hard to find a group of comparable modules.

      Various people have tried to solve this problem, with the likes of CPAN Ratings, AnnoCPAN, reviews like those on this site, etc. The problem is, it's easy for someone to completely miss all those. They may just do a quick search in the CPAN shell, browse some PODs, and install one or two that look reasonable. Maybe they don't install properly on their system, maybe they don't work as advertised, maybe they don't quite meet the current needs. All these things can waste a lot of time - our most valuable development resource.

      I think ratings, reviews, categories, etc. are definitely moving in the right direction. But I think they need to be centralised in some way. The obvious answer is to build these things into CPAN. Then they're available no matter how you access it.

      OK, that may be a lot of work, and maybe even impractical, but to me, that's the "ideal" solution to this problem.

        Agreed - it should all be in CPAN. The only reasons I suggest building a new site are:
        • to test out the idea without bringing CPAN down
        • easier to write new code with full access rather than having limited access to the code running CPAN (and I have no idea what state that is in)
        • can experiment with ideas without annoying CPAN users with a changing interface

        If the site proved its worth, I would expect it to be integrated into CPAN, or at least for CPAN to provide a link to the module's page on this site.

        "The problem is the only definitive information about a module is its POD (and code). Unless you try it, or you're very good at reading code, the only way to select a module is by what its POD claims to do. Those claims may be incorrect, poorly communicated, or misunderstood."
        I wouldn't even attempt to extract this information manually. I'm thinking more of wiki-style access (though maybe less free-form). I'm brain-dumping here, but how about a tree of "features", eg:
        Date/Time
          |_ understand time zones
          |_ converts from RTF 1234 format
        So the process would be:
        • User finds module, wants to add Feature X
        • Clicks: Edit feature list
        • Do the available features contain Feature X?
          • Y: select Feature X
          • N: Add new feature
        • Adds a rating 1-5 for that feature
        Another user can:
        • Select 5 modules to compare
        • Site displays a table listing the union of all mentioned features of all the compared modules
        • Modules missing a ranking for a feature display [unknown] instead
        • User has the ability to rank the [unknown]s
        A feature-finder could be implemented from the same data, so you can add required or optional features from the feature tree, and the returned module list would be filtered (and ranked) to show those that support said features.
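The compare and feature-finder steps above could be sketched roughly like this. Everything here is hypothetical: the module names, the feature names, the ratings, and the sub names are invented for illustration, and a real site would keep the data in a database rather than a hash.

```perl
use strict;
use warnings;

# Hypothetical data: module => { feature => user rating from 1 to 5 }.
my %ratings = (
    'Hypothetical::DateA' => { 'understands time zones' => 5, 'parses ISO 8601' => 3 },
    'Hypothetical::DateB' => { 'understands time zones' => 2 },
    'Hypothetical::DateC' => { 'parses ISO 8601' => 4, 'date arithmetic' => 5 },
);

# Rating for one module/feature pair, or undef if nobody has ranked it.
sub rating_for {
    my ($module, $feature) = @_;
    my $features = $ratings{$module} || {};
    return $features->{$feature};
}

# Union of every feature mentioned for any of the compared modules.
sub feature_union {
    my (@modules) = @_;
    my %seen;
    for my $module (@modules) {
        $seen{$_} = 1 for keys %{ $ratings{$module} || {} };
    }
    return sort keys %seen;
}

# One row per feature; a missing ranking shows up as [unknown].
sub comparison_table {
    my (@modules) = @_;
    my @rows;
    for my $feature (feature_union(@modules)) {
        my @cells;
        for my $module (@modules) {
            my $r = rating_for($module, $feature);
            push @cells, defined $r ? $r : '[unknown]';
        }
        push @rows, [ $feature, @cells ];
    }
    return @rows;
}

# Feature-finder: modules rated for every required feature,
# ranked by the sum of those ratings, highest first.
sub find_modules {
    my (@required) = @_;
    my %score;
    MODULE: for my $module (keys %ratings) {
        my $total = 0;
        for my $feature (@required) {
            my $r = rating_for($module, $feature);
            next MODULE unless defined $r;
            $total += $r;
        }
        $score{$module} = $total;
    }
    return sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;
}

my @compare = sort keys %ratings;
print join(' | ', 'feature', @compare), "\n";
print join(' | ', @$_), "\n" for comparison_table(@compare);
print "time-zone aware: ", join(', ', find_modules('understands time zones')), "\n";
```

Summing ratings is only one possible ranking; averaging several users' ratings per feature would need an extra level in the data.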

        As I said, just a brain dump, but may have value. Any more ideas?

        Clint

Re: How does one choose among modules?
by frostman (Beadle) on Oct 09, 2007 at 19:02 UTC

    I've dealt with the same problem many times, and for what it's worth my main criteria are:

    1. Does it have a reasonable API?

      Can I subclass it? Can I clone it? Can I mock it with T:M:O? Will my eyes burn when I call its constructor?

    2. Is its documentation coherent?

      Incoherent or missing documentation often points to bad engineering habits, and you don't want to poison your codebase unnecessarily.

    3. Does it pass all its tests and install happily in my several different environments?

      You may only need to deploy it under Ubuntu Marbly Muskrat now, but what if you become an expert in this only to find that it doesn't work on your new client's XServe? (Some things are worth patching and so on, but it's still a big fat demerit.)

    4. Does it have a clean CPANTS profile?

      For the above reason, but with more variety.

    5. Is it actively maintained?

      This isn't universally important, but for a module that does something in a changing environment (e.g. parsing PDFs) or in an infinitely complex problem area (e.g. object-relational mapping) it means a lot if people are working to make the module better. Extra points for fast bug turnaround and for multiple people working together on it; the former means your patches will probably be used, the latter means the code has a better chance of being readable (should the need arise).

    6. Is it in core?

      Even if it's slightly worse, I like the convenience of a core module. But if it's a lot worse, forget it and make the first test in your suite do a bunch of use_ok()'s.

    7. Do people hate it?

      This is completely subjective, but still: ask your local guru or list whether anyone specifically advises against it, and why. There are some widely-used modules that have fundamental design flaws that will really hurt you if you need to grow something big on top of them.

    Google searches (especially on PerlMonks) factor in if there's something tricky about the module, in which case I'm mostly concerned with how people explain it and advocate for/against it. I've never factored in Google code searches before, but it's an interesting idea and I might do that the next time the problem comes up.
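For point 6, a "first test in your suite" along those lines might look like the sketch below. The module list is a placeholder: two core modules are used only so the example runs anywhere, and you would replace them with the non-core modules your project depends on. The file name is just the usual convention, not a requirement.

```perl
# e.g. t/00-load.t
use strict;
use warnings;
use Test::More;

# Placeholder list: swap in the modules your project really needs.
my @required = qw(Net::SMTP File::Spec);

plan tests => scalar @required;

# Each use_ok() fails loudly if the module is missing or will not compile,
# so a broken environment shows up before any other test runs.
use_ok($_) for @required;
```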

      As for actively maintained modules, you're right. If all the bugs are out and the module has the complete feature set intended by the authors, there's no reason to rate the module down for not being updated recently.

      The rest of your points I'll not mention specifically because I have no disagreement nor comments to expand on your reasoning.

      What would you think of a web page with a table listing ratings for all of these things for all of, say, the PDF tool sets? I'm not sure that's the ultimate solution that I'm looking for, but it's one of the ideas that comes to mind. Of course, one might want empirical measures present separately from subjective ones or given more weight, but those are implementation details.

        For things that are (nearly) objective I think it would be cool to have such a chart, and not just for "competing" modules. Say, everything to do with PDF:
        1. CPANTS pass/fail rate
        2. Most recent version
        3. Rate of new version releases
        4. Pod::Coverage result (or similar)
        5. Perl::Critic result (violations by severity?)
        6. ...and so on...

        All nicely sortable of course. But excluding anything you can't back up objectively, so no "five stars from me" type of stuff.

        In fact it might be really cool to do that for all of CPAN and let you sort/filter/etc on it.... but I'm not volunteering, too busy myself these days.
Re: How does one choose among modules?
by DrHyde (Prior) on Oct 09, 2007 at 10:04 UTC

    Surely the best way to choose is to read their manpages and see which one fits your needs best. Net::SMTP and MIME::Lite are about as different as you can get! Only once you've narrowed the choice down to more than one module which fits your needs well should you bother taking anything else into account.

      Actually, MIME::Lite uses, or can use anyway, Net::SMTP, so they're not that different after all. MIME::Lite just MIME encodes the message before sending, which could be done by other modules. MIME::Lite's job could be done by lots of different MIME modules in conjunction with a mail sending module. That's not really my question. I just used Net::SMTP, MIME::Lite, and Mail::Mailer as examples because those were the ones being discussed in the referenced thread.

      My question isn't which module to choose. It's about the process of choosing modules. Notice this was a Meditation and not a SoPW. That's on purpose. It's kind of a "meta" thing. It's a question about asking and answering questions about modules. Primarily, it's a question about how to answer more clearly and sooner in the module selection process.

      I'm not at all sure that reading the complete documentation for every module that could possibly fit is the best way to get started. That's the whole point of my question. Is there a better, less time-consuming way?

      Furthermore, is there some way that won't make CPAN so intimidating to new Perl programmers? Sure, most of us are used to the number of choices, but do you ever notice people being turned off Perl early because TIMTOWTDI runs a bit rampant? I've seen lots of frustration when people want to know how to do something -- anything, forget mail sending for a second -- and the response is more or less, "read the docs for these 5 modules and come back with questions about the one you choose".

      Indeed, I think people who have been programming with Perl for several years don't generally take the sheer number of overlapping modules on CPAN into account. Certainly not the same way newcomers do. If you've been programming with Perl for a while, you tend to know which modules might fit and which ones are complete wastes of disk space. You recognize at least some of the names of module authors. A quicker glance at the docs or even at a CPAN search results page does much more for you because of your experience. For tasks you've already done, you probably already have experience with one or more modules and have your favorites for many classes of tasks. We'll never eliminate the advantages of experience, but we shouldn't take for granted that everyone will have it, either.

      As for narrowing down the choices to the ones that fit someone's needs, that's exactly what I'm asking. I'm not asking for advice on which module to use. I'm asking how we, as a community, improve the experience of that process. Do we continue to just recommend modules based on lots of research by the people asking and interaction with them? Can we make a more accessible resource for people than "Ask the experts over there"?

        Yes, MIME::Lite can use Net::SMTP. It can also pipe to sendmail. The point is that using MIME::Lite makes the mechanics of SMTP irrelevant, whereas you must understand those mechanics to use Net::SMTP directly. That makes them very different. The difference is akin to that between using mutt to read and write your mail, and using telnet to ports 110 and 25 and just talking POP and SMTP yourself like a Real Programmer would.
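A rough side-by-side sketch of that difference follows. The sub names and the host, from, to, subject, and body parameters are hypothetical, nothing is sent unless you call one of the subs with a reachable mail server, and MIME::Lite is loaded lazily because it is not a core module.

```perl
use strict;
use warnings;
use Net::SMTP;

# With Net::SMTP you drive the SMTP conversation yourself:
# envelope sender, envelope recipient, then headers and body.
sub send_with_net_smtp {
    my (%arg) = @_;
    my $smtp = Net::SMTP->new($arg{host}, Timeout => 30)
        or die "Cannot connect to $arg{host}\n";
    $smtp->mail($arg{from}) or die "Sender rejected\n";
    $smtp->to($arg{to})     or die "Recipient rejected\n";
    $smtp->data();
    $smtp->datasend("From: $arg{from}\n");
    $smtp->datasend("To: $arg{to}\n");
    $smtp->datasend("Subject: $arg{subject}\n");
    $smtp->datasend("\n");                  # blank line ends the headers
    $smtp->datasend("$arg{body}\n");
    $smtp->dataend();
    $smtp->quit;
    return 1;
}

# With MIME::Lite you describe the message and let the module
# handle the encoding and the transport details.
sub send_with_mime_lite {
    my (%arg) = @_;
    require MIME::Lite;                     # non-core, loaded only when needed
    my $msg = MIME::Lite->new(
        From    => $arg{from},
        To      => $arg{to},
        Subject => $arg{subject},
        Data    => $arg{body},
    );
    return $msg->send('smtp', $arg{host});  # a plain send() pipes to sendmail
}
```

Either sub would be called the same way, for example `send_with_net_smtp(host => 'localhost', from => ..., to => ..., subject => ..., body => ...)`, which is what makes the two look interchangeable until something goes wrong at the protocol level.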

        To answer your meditation - if you want to figure out which of those modules to use you *must* have some understanding of what you're trying to do, and what the modules do. The best way to do that is to consult something more knowledgeable than yourself. That something is generally the modules' documentation.

        The answer you're probably not looking for to the question of "how on earth do I even find out which modules I should consider to do task FOO" is "ask other people".

Re: How does one choose among modules?
by UnstoppableDrew (Sexton) on Oct 09, 2007 at 13:27 UTC
    A lot of the tools I write in Perl are for developers to use, or that run autonomously on build servers. One of my big criteria is to use core modules whenever possible, so I don't have to worry about whether the next machine it gets run on will have the necessary module. Sure, often you can install it, but not always. If it's part of the standard distribution, I know I can count on it being there.