LesleyB has asked for the wisdom of the Perl Monks concerning the following question:

Hello

I hope some of you wiser monks might help with this.

I want to develop a contact page where the user submits the email address as part of the form. I wish to validate the email address. So I popped over to CPAN and searched using the phrase 'email address validation' and got six results, which is not many to work with, to be honest.

But, as with all multiple choices in unfamiliar territory, I am always faced with the problem of which is the most reliable.

I looked at CGI::Untaint::email, thinking to myself 'well I know some CGI...' to find this module is at v0.03 and hasn't been altered since 2001. However it is useful because I didn't know about CGI::Untaint and now I do.

Email::Valid seems more active but hasn't been updated since 2006. It also gets votes here in 'email address validation regexp', although 'Benchmarking email address validation methods' adds a few more modules to the pot that didn't appear in the search results on cpan.org.

I will probably use Email::Valid because the comments in the ratings section indicate it is easy to use, but I have been faced with the problem of choosing the 'right' module before and am curious to know how people select modules to use.

I currently use version numbering, activity, and ratings to help me choose. What about you?

Re: A Beginners Question on CPAN
by toolic (Bishop) on Jul 07, 2008 at 15:30 UTC
Re: A Beginners Question on CPAN
by kyle (Abbot) on Jul 07, 2008 at 16:14 UTC

      Kyle

      Thank you for posting that link. It is very useful and cites the problem perfectly.

      I searched for CPAN here and got the CPAN node and tried 'start CPAN' only to find mostly writeups on the mechanics of using CPAN rather than advice on the decision process of which module to use.

      I don't think I would have found that node for quite a while.

        I found it by going to Super Search and entering 'CPAN' for title text, then scrolling down to almost the bottom to click "Don't include Replies", and clicking "Search". That got me a list of root nodes with "CPAN" in the title. At that point, I just scanned down the list until I saw what I was looking for. I had the advantage of knowing it was there and the disadvantage of forgetting how long ago it was (I had to look back farther than I thought).

Re: A Beginners Question on CPAN
by moritz (Cardinal) on Jul 07, 2008 at 15:35 UTC
    I currently use version numbering, activity, and ratings to help me choose. What about you?

    Additionally: open tickets in RT, and the interface.

    It's always a good sign if I know how to use the most important functions of a module after reading the first five to ten lines of the synopsis.

    For example I looked into Data::Validate::Email, and the Synopsis is very convincing to me.

    Other good signs are comparisons to other modules in the POD ("there are 15 other modules for exactly this job. This is why I wrote the 16th")

    I can't give you advice on this particular decision, though.

Re: A Beginners Question on CPAN
by ww (Archbishop) on Jul 07, 2008 at 20:59 UTC

    Many good observations above!

    Now, one that has not yet been made in this thread:

    You can "validate" the form of an email address using a module; to actually "validate" the address itself, you have to check whether it exists.

    This is why you'll find some/many sites with a registration option that handle the process like this:

    User fills in and submits the form

    Site sends a "please confirm" mail to the address supplied (with or without some sort of confirmation string and a reply-to: address used specifically for the purpose).

    If the recipient responds (sometimes, within a specified length of time), site processes the reply and then (and only then) adds the new user to the list of valid emails/members/users.
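
The confirmation steps above can be sketched in Perl using only core modules. This is a minimal sketch under stated assumptions, not how any particular site does it: the `%pending` and `%confirmed` hashes stand in for a database, and `$SECRET` and the sub names `request_confirmation`, `confirm`, and `is_confirmed` are all hypothetical. Actually sending the mail is left out.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical server-side secret; a real site would load it from configuration.
my $SECRET = 'change-me';

# Hypothetical in-memory stores; a real site would use a database.
my %pending;      # token => { email => ..., expires => epoch seconds }
my %confirmed;    # email => 1

# The user submitted the form: record the address as pending and return
# the token to embed in the "please confirm" mail.
sub request_confirmation {
    my ($email, $now, $ttl) = @_;
    my $token = hmac_sha256_hex(join('|', lc $email, $now), $SECRET);
    $pending{$token} = { email => lc $email, expires => $now + $ttl };
    return $token;
}

# The recipient responded with a token: accept it only if it is known
# and has not expired. A token can be used at most once.
sub confirm {
    my ($token, $now) = @_;
    my $entry = delete $pending{$token} or return 0;
    return 0 if $now > $entry->{expires};
    $confirmed{ $entry->{email} } = 1;
    return 1;
}

# Only confirmed addresses count as valid members.
sub is_confirmed {
    my ($email) = @_;
    return exists $confirmed{ lc $email } ? 1 : 0;
}

# Usage: issue a token valid for 24 hours, then confirm it when the reply arrives.
my $token = request_confirmation('user@example.com', time, 24 * 60 * 60);
confirm($token, time) or warn "unknown or expired token\n";
```

Passing the current time in as an argument, rather than calling `time` inside the subs, keeps the expiry rule easy to test.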

Re: A Beginners Question on CPAN
by hangon (Deacon) on Jul 08, 2008 at 04:29 UTC

    For what it's worth, here's the process that I use for selecting modules:

    • Write a list of requirements roughly in order of importance.
    • Draw a line between the "must have" items and the ones I could live without.
    • Develop a list of candidate modules via methods discussed by others in this thread.
    • Read the Description and Synopsis for each candidate and briefly scan the rest of the POD. Modules that are poorly documented or clearly do not meet the "must have" requirements are removed from consideration.
      The Description section will often tell you all you need to know about the documentation. If it's a well-articulated description of the module's intended use and capability, the rest of the documentation is usually well written. If it's terse, unclear or verbose, you can generally expect more of the same. The synopsis should clearly show what the API looks like.
    • From here on the selection process varies, but I have a requirements list to guide me and a reasonably well documented selection of modules to choose from.
Re: A Beginners Question on CPAN
by duckyd (Hermit) on Jul 07, 2008 at 16:32 UTC
    I often do look at the date of last release as part of my criteria for choosing which module to use, but in this case it's probably not very useful.

    What constitutes a valid email address hasn't changed in a long time, so an old module that did the job well might be perfectly usable today.

Re: A Beginners Question on CPAN
by Lawliet (Curate) on Jul 07, 2008 at 15:25 UTC

    I go by popularity. I find that's the best way. (Even if the module came out a month ago, if it is very popular, it cannot be too bad).

    Oh, and please do respond with the level of satisfaction using the module you end up using.

    <(^.^-<) <(-^.^<) <(-^.^-)> (>^.^-)> (>-^.^)>
      I go by popularity. I find that's the best way. (Even if the module came out a month ago, if it is very popular, it cannot be too bad).

      Don't go judging something by popularity, or you'll end up preferring PHP over Perl ;-)

      I prefer one well informed and well phrased opinion any time over ten people who just tell me "it's great".

      I use Linux, although Windows is more popular. I use fvwm2 as my window manager, although others are far more popular. I'm helping the Perl 6 developers, although Python 3000 (or whatever its name is) gets much more positive press. I prefer pen & paper RPGs over MMORPG, although the latter are more popular.

      I don't care if something is popular unless I know why it's popular, and know that the same reasons might appeal to me.

        Oh lawdy lawd, you are right! I guess I should have said, well, what you said. :P

        Oh, and despite me being oh-so-judging, if I do not like the popular thing, I'll switch. So hah.

        And for the record, I prefer Perl, use Linux, and do not even like MMORPGs. ;)

        <(^.^-<) <(-^.^<) <(-^.^-)> (>^.^-)> (>-^.^)>
Re: A Beginners Question on CPAN
by Anonymous Monk on Jul 07, 2008 at 17:40 UTC
    I currently use version numbering, activity, and ratings to help me choose. What about you?

    The first thing I look for is good documentation. FWICT, poor docs often portend other inadequacies.

Re: A Beginners Question on CPAN
by trwww (Priest) on Jul 07, 2008 at 23:38 UTC

    I have been faced with the problem of choosing the 'right' module before and am curious to know how people select modules to use.

    1. Search CPAN
    2. narrow my selection down by reading the synopsis
    3. and then depending on how big the library is I:
      • read the module's source code
      • unpack the distro and add my own .t files to the test directory
    If the feature the module offers is critical then I do both items for #3.
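
As an illustration of adding your own .t file, here is a minimal sketch that checks a candidate module against addresses your own application cares about. It assumes the candidate is Email::Valid (the module the original poster was leaning towards) and uses only its `address` method with default checks; the file name `t/my-addresses.t` and the sample addresses are made up for the example.

```perl
# t/my-addresses.t (hypothetical file name)
use strict;
use warnings;
use Test::More;

# Skip cleanly when the candidate module is not installed.
eval { require Email::Valid; 1 }
    or plan skip_all => 'Email::Valid is not installed';

# Addresses my application must accept or reject (illustrative only).
my @accept = ('user@example.com', 'first.last@example.org');
my @reject = ('no-at-sign', 'user@');

# address() returns the address when it passes the checks, undef otherwise.
for my $addr (@accept) {
    ok( defined Email::Valid->address($addr), "accepts '$addr'" );
}
for my $addr (@reject) {
    ok( !defined Email::Valid->address($addr), "rejects '$addr'" );
}

done_testing();
```

Run it with `prove` from the unpacked distribution directory. The same list of addresses can be reused against each candidate module to compare them.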

    trwww

Re: A Beginners Question on CPAN
by cutlass2006 (Pilgrim) on Jul 08, 2008 at 06:44 UTC

    I find CPAN Ratings is becoming more useful, e.g.

    Data-FormValidator

    though when faced with no differentiation I tend to install and briefly try out all of the approaches ... this is pretty time consuming; it's a function of how important the feature is.

    My top tip is to have a peek around the t directory, run the tests, investigate the test code .... more often than not there is a correlation between the perceived quality of the tests and my choosing the module.

    Lastly, I am pragmatic with modules that have a large number of dependencies ... there are lots of good ones out there, but these need more investigation before I adopt them into my toolbox.
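
One rough way to start that investigation is to separate the dependencies that have shipped with perl itself from those that need a separate install. The sketch below uses the core Module::CoreList module; the sub name `split_core_deps` and the dependency list are hypothetical, and in practice the list would come from the distribution's META file or Makefile.PL.

```perl
use strict;
use warnings;
use Module::CoreList;

# Split module names into those that have shipped with some perl release
# and those that have not. Note: first_release() says a module was in core
# at some point, not that it is in the perl you are running.
sub split_core_deps {
    my @deps = @_;
    my (@core, @non_core);
    for my $module (@deps) {
        if (defined Module::CoreList->first_release($module)) {
            push @core, $module;
        }
        else {
            push @non_core, $module;
        }
    }
    return (\@core, \@non_core);
}

# Hypothetical dependency list for a candidate module.
my ($core, $non_core) = split_core_deps('Test::More', 'Scalar::Util', 'Mail::Address');
print "core:     @$core\n";
print "non-core: @$non_core\n";
```

The non-core list is the one that deserves the closer look at test coverage and maintenance.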

Re: A Beginners Question on CPAN
by DrHyde (Prior) on Jul 09, 2008 at 10:20 UTC

    You really can't judge a module by its version number or by how often it gets updated. What does it tell you that Data::Compare is now at version 1.19? Nothing. If you look at the release history on search.cpan.org you'll see that it went straight from 0.17 to 1.18. If you look at the changes in that release, you'll see that there were only a few very minor changes, far less significant than those between 0.05 and 0.06. You'll also see that it spent two years as version 0.13, two and a bit years at 0.02, and two years at 0.01.

    And if those two year gaps weren't enough to convince you that activity isn't necessarily a sign of a good module, look at Statistics::SerialCorrelation (not updated for 4.5 years) or Tie::Scalar::Decay (not updated for 7 years).

    The best things to look at IMO are:

    • other programmers' recommendations, so ask your local Perl Mongers
    • CPAN-testers results
    • portability
    • dependencies
    • the rather confusing advanced search on rt.cpan.org, which *may* let you see how quickly reported bugs get fixed. Although that won't include bugs reported directly to the author by email.

    Some of those - especially portability and dependencies - are somewhat subjective, of course. It's my opinion, for example, that a module that works on lots of different platforms is higher quality than one that does the same job but only on a few platforms - I think it shows either that the author has better coding practices, or that he cares more. And a module with lots of dependencies, while it does demonstrate admirable code re-use and avoids reinventing wheels badly, could also be harder to install and is subject to bugs that may be introduced later in its dependencies, especially if those dependencies have poor test coverage.