I'm thinking of writing a new module to do what Data::FormValidator and Data::Validator::Item and a few other modules have done.

The question I'm sure you're already asking is, "Why? It's already been done!"

My reply is, "Yes, it's already been done; in fact, it's been done at least 7 times."

"Then why are you doing it again?"

"Because I think I can do it better."

"Well, that's a self-righteous attitude. Why could you do it better?"

"Because I'm going to do it differently."

"Is differently better?"

"No, but this time I think it might be... see for yourself:"

Sample of code using new data validation package:
    use strict;     # of course
    use warnings;   # ditto
    use CGI;
    use Data::Validate::OO;   # EEK! We all know what that OO means!

    # I'm making an instance of it... obviously...
    my $checker = Data::Validate::OO->new(
        -failure => sub {
            my ($field, $data) = @_;
            print "Data $data in $field illegal!\n";
        },
    );

    $checker->newrule(
        -name    => 'homephone',
        -element => 'phone',   # Not required, defaults to name
        -tests   => {
            -custom => [
                # sorry if that's not quite right, writing it on the fly
                qr/^\d{3}[\-\s]?\d{3}[\-\s]?\d{4}$/,
            ],
        },
    );

    $checker->newrule(
        -name  => 'name',
        -tests => {
            -custom => [
                # first name, middle initial, last name
                qr/^\w+\s+\w\.?\s+\w+$/,
            ],
        },
    );

    $checker->newrule(
        -name     => 'mail',
        -required => 0,   # Defaults to 1. When 0, still complains if data is there but fails tests
        -tests    => {
            -def    => 'email',   # Use a built-in check for valid email
            -custom => [
                sub {
                    my $data = shift;
                    return if $data =~ m/\@hotmail\.com$/;   # reject hotmail addresses
                    return 1;
                },
            ],
        },
    );

    # OK, we have a simple check now, try using it.
    my $CGI    = CGI->new;
    my $status = $checker->test($CGI->Vars);   # Testing incoming form data (name => value pairs)!
    # or
    $status = $checker->test(name => 'John R. Doe', phone => '000-111-2345', mail => 'me@myhost.com');
    # This should result in nothing being printed, and true being returned

    $status = $checker->test(name => 'a', phone => '3553451634', mail => 'me@myhost.com');
    # Would complain about the name and return false

    $status = $checker->test(name => 'My R. Name', phone => '344-234-2525', mail => 'u@hotmail.com');
    # Would fail because even though an email is not required, it was not one
    # that could be accepted (not empty or valid)

    $status = $checker->test(name => 'Your A. Name', phone => '1323445432');
    # This would pass because the email is not defined

    # Now that you know the data is or is not valid, do something!

Note: The above was just spontaneously written, with no thought as to the accuracy of any of the regexes or other assorted mistakes. If I'm wrong... oops... go ahead and point it out, but give your opinion at the same time!

So there's my idea. It's very similar to what already exists, but I feel that in some way it's better, perhaps because of the easier methods of adding custom checks. Obviously there is some more syntax planned for the rule definitions, as well as using the 'name' to allow the addition, deletion, and editing of rules (needed for one set, but not for another trial).

Now the reason for this post: is this something the community would like, or would it just be wasting space on CPAN doing something that several other modules already do? Basically, should I spend the time to make this idea happen? (For that matter, is there already a package that does this exactly as I'm describing it?)



Edit: Cleaned up an *oops*

My code doesn't have bugs, it just develops random features.

Flame ~ Lead Programmer: GMS | GMS

Re: New Module Consideration?
by Juerd (Abbot) on Dec 31, 2002 at 23:29 UTC

    Use a built-in check for valid email

    I quote from perlfaq9:

    Without sending mail to the address and seeing whether there's a human on the other end to answer you, you cannot determine whether a mail address is valid. Even if you apply the mail header standard, you can have problems, because there are deliverable addresses that aren't RFC-822 (the mail header standard) compliant, and addresses that aren't deliverable which are compliant.

    Many are tempted to try to eliminate many frequently-invalid mail addresses with a simple regex, such as /^[\w.-]+\@(?:[\w-]+\.)+\w+$/. It's a very bad idea. However, this also throws out many valid ones, and says nothing about potential deliverability, so it is not suggested. Instead, see http://www.cpan.org/authors/Tom_Christiansen/scripts/ckaddr.gz, which actually checks against the full RFC spec (except for nested comments), looks for addresses you may not wish to accept mail to (say, Bill Clinton or your postmaster), and then makes sure that the hostname given can be looked up in the DNS MX records. It's not fast, but it works for what it tries to do.
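To illustrate the FAQ's point, here is a small sketch that runs the quoted regex against a few addresses. The sample addresses and the helper name `looks_ok` are made up for illustration. An address with a `+` in the local part is syntactically fine yet rejected, while a well-formed address at a host that may not exist is accepted:

```perl
use strict;
use warnings;

# The simple regex quoted in perlfaq9.
my $simple = qr/^[\w.-]+\@(?:[\w-]+\.)+\w+$/;

# Hypothetical helper: returns 1 if the address matches the simple regex, 0 otherwise.
sub looks_ok {
    my ($addr) = @_;
    return $addr =~ $simple ? 1 : 0;
}

# Sample addresses are illustrative only.
for my $addr ('me@myhost.com', 'user+tag@example.com', 'nobody@no-such-host.invalid') {
    printf "%-30s %s\n", $addr, looks_ok($addr) ? 'accepted' : 'rejected';
}
```

The regex says nothing about deliverability, so passing it is at best a hint that the input resembles an address.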

    The RFC compliance test is nice, but allows more than most people want it to. What kind of test does your built-in do?

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

        Do you, by any chance, know of a way to get Email::Valid to work on Win32? I have been unable to locate a ppm or use CPAN to install it myself (For that matter, I can't seem to get Net::DNS to install either.)



        Flame ~ Lead Programmer: GMS | GMS

        Also: RFC::RFC822::Address.

        Abigail

      Nested comments happen in practice so if that code doesn't handle them then it's not powerful enough for real-world use. Anyone know some code that actually handles addresses fully?

      Update: Or maybe that's Aristotle's Email::Valid. I am so tired this morning.


      Fun Fun Fun in the Fluffy Chair

        Umm, can you clarify what you meant there? What nested comments?



        Flame ~ Lead Programmer: GMS | GMS
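For context on the question above: in RFC 822 a comment is text in parentheses, and a comment may itself contain further comments. A plain regex without recursion cannot match balanced nesting, which is why some checkers skip it. A minimal sketch of stripping such comments by tracking parenthesis depth follows. The function name `strip_comments` and the sample address are assumptions for illustration, and quoted strings and backslash escapes are deliberately not handled:

```perl
use strict;
use warnings;

# Remove RFC 822 comments, including nested ones, by tracking paren depth.
# Simplification: quoted strings and backslash escapes are not handled.
sub strip_comments {
    my ($addr) = @_;
    my ($out, $depth) = ('', 0);
    for my $ch (split //, $addr) {
        if    ($ch eq '(')               { $depth++ }        # entering a comment
        elsif ($ch eq ')' && $depth > 0) { $depth-- }        # leaving a comment
        elsif ($depth == 0)              { $out .= $ch }     # keep text outside comments
    }
    $out =~ s/^\s+|\s+$//g;   # trim leftover whitespace
    return $out;
}

# The inner "(main office)" is a comment nested inside the outer one.
print strip_comments('user@example.com (work (main office))'), "\n";
```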

      I would be considering several options, most likely a simple compliance check, but since it has not yet been written, I can't say for sure. But since I'm asking if I should at all, I am, of course, open to suggestions as to how to do it.



      Flame ~ Lead Programmer: GMS | GMS

Re: New Module Consideration?
by djantzen (Priest) on Dec 31, 2002 at 23:38 UTC

    With the caveat that I've never used either of the modules you're thinking about reinventing, I do have some thoughts.

    Since you're basically adding new functionality, this is a case where you want to think about subclassing existing modules. Both of the two you mention are object-oriented, and so it seems you ought to be able to do this without much trouble (although Data::Validator::Item may pose slightly more of a challenge as it's designed as a factory class). Subclassing will have two major advantages: You'll inherit behavior that you don't want to replicate needlessly, and you won't further pollute the CPAN namespace. Compare Data::Validate::OO to, say, Data::FormValidator::Extensible.

      I think people should think about aggregation far more often than about inheritance. A has-a relationship is very likely to be more appropriate than an is-a one when the desired interface differs significantly from what the existing class already offers.

      Makeshifts last the longest.

        I agree, if the interface is going to be drastically different then you should not try to subclass. To do so would probably involve overriding the parent's methods with dummy placeholders to prevent the user from accessing inherited but unwanted behavior. In this case though, I'd be surprised if it were not possible to expand the interface to allow extensible argument checking via this newrule method he's proposing, without breaking compatibility with the parent module. But if I'm wrong I fall back on my original caveat :^p
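The has-a approach discussed above can be sketched roughly as follows. Both packages are hypothetical: `Existing::Validator` is a stand-in with a placeholder check, not the API of any real CPAN module, and `Rule::Checker` is an invented wrapper that holds a validator and delegates to it instead of inheriting from it:

```perl
use strict;
use warnings;

# Hypothetical stand-in for an existing validator class (not a real CPAN API).
package Existing::Validator;
sub new      { my ($class) = @_; return bless {}, $class }
sub is_email { my ($self, $v) = @_; return $v =~ /\@/ ? 1 : 0 }   # placeholder check only

# Hypothetical wrapper: holds a validator (has-a) rather than inheriting (is-a).
package Rule::Checker;
sub new {
    my ($class) = @_;
    return bless { engine => Existing::Validator->new, rules => {} }, $class;
}

# Register a named rule as a code ref that receives the wrapped engine and the value.
sub newrule {
    my ($self, $name, $test) = @_;
    $self->{rules}{$name} = $test;
    return $self;
}

# Run every rule against the matching field; true only if all of them pass.
sub test {
    my ($self, %data) = @_;
    for my $name (sort keys %{ $self->{rules} }) {
        return 0 unless $self->{rules}{$name}->($self->{engine}, $data{$name});
    }
    return 1;
}

package main;

my $checker = Rule::Checker->new;
$checker->newrule(mail => sub { my ($engine, $v) = @_; defined $v && $engine->is_email($v) });
print $checker->test(mail => 'me@myhost.com') ? "ok\n" : "failed\n";
```

Because the wrapper does not inherit, none of the wrapped class's methods leak into its interface, so there is nothing to override with dummy placeholders.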

Re: New Module Consideration?
by Anonymous Monk on Jan 01, 2003 at 12:03 UTC

    If it's feasible to submit additions to existing modules, do that. If not, or the author is uncooperative, don't hesitate to write your own. Worrying if it's totally accepted by the community or if Juerd and the like will flame you just stifles innovation.

Re: New Module Consideration?
by jimc (Sexton) on Jan 02, 2003 at 23:32 UTC
    IMHO, the feature most missing from the *Validate* modules is one that allows simple, clear expression of $x+$y == 42 type constraints, so this is the feature that would most justify a new module. Most (Data::Validate anyway) seem geared for required/optional tests.

    You should also look at Params::Validate.

    I've yet to use any of the *validate* modules, but this one is at the top of my short list. It has, to me, an intuitive interface that's geared for validating args to a function. It handles both named and positional params, though named is (always) clearer.

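    use Params::Validate qw(validate SCALAR ARRAYREF);

    sub snafu {
        validate(@_, {
            foo  => { type => SCALAR },     # foo must be a scalar
            arry => { type => ARRAYREF },   # arry must be an array reference
        });
        # func body here
    }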
    
    
      "$x+$y == 42"? I'm interested in what you're saying, but you lost me, what would you be attempting to validate?



      Flame ~ Lead Programmer: GMS | GMS

        As an example, if you've got a form that says "rank the following nine items from most important (1) to least important (9)", they must sum to 45.

        Completely unrelated: what I really think the world needs is a form validation module that will both validate the form after it's submitted, for security, and generate JS to validate the form before submission, to be nice to the user.
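The ranking example can be sketched as a cross-field check, assuming nine rank fields with values 1 to 9. The function name `ranks_valid` is hypothetical. Note that the sum test alone is necessary but not sufficient, since nine 5s also sum to 45, so this sketch also checks that each rank is used once:

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical cross-field rule: nine rank fields must use each of 1..9 exactly once.
sub ranks_valid {
    my (@ranks) = @_;
    return 0 unless @ranks == 9;                                     # exactly nine answers
    return 0 if grep { !defined($_) || $_ !~ /\A[1-9]\z/ } @ranks;   # each is a digit 1..9
    return 0 unless sum(@ranks) == 45;                               # the sum constraint
    my %seen;
    $seen{$_}++ for @ranks;
    return keys(%seen) == 9 ? 1 : 0;                                 # no rank used twice
}

print ranks_valid(3, 1, 2, 9, 8, 7, 4, 5, 6) ? "valid\n" : "invalid\n";
```

A rule like this needs to see several fields at once, which is what per-field required/optional tests cannot express.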


        Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

        '"$x+$y == 42"? I'm interested in what you're saying, but you lost me, what would you be attempting to validate?'

        The meaning of life, the universe and everything, perhaps? ;)

        Or do you think that's a little too ambitious for Perl?

        __________
        "Every program has at least one bug and can be shortened by at least one instruction -- from which, by induction, one can deduce that every program can be reduced to one instruction which doesn't work." -- (Author Unknown)