Unfortunately, dws's approach isn't always possible.

The system I work on handles significant volumes of addresses, using dedicated (commercial) software to identify and validate that information.
There is in-house processing to help smooth things out, but not every problem can be catered for.

The volumes involved make it impractical for humans to process the problem records (and in fact some of those problem records are a direct result of human input).

Even if the volumes were low enough for humans to be able to process exceptions, humans can't get it right all of the time.
This might be because of a lack of information, poorly laid out information or just human error.

As shemp describes, even the reference data used in such validation systems isn't perfect, and this sets an upper limit on what you can reasonably expect to achieve.

I'd say that you can never expect to get things 100% right, and it might end up being cheaper and/or easier to accept a certain error rate.
Of course, your client may not accept this, but that's a whole other problem :-)

Cheers.

BazB


If the information in this post is inaccurate, or just plain wrong, don't just downvote - please post explaining what's wrong.
That way everyone learns.


In reply to Re: Re: regex negative lookahead behaviour by BazB
in thread regex negative lookahead behaviour by shemp
