in reply to Re: Merge/Purge address data
in thread Merge/Purge address data

Heh,

No stranger to third normal form am I. I think you miss my point.

Information coming off a web form where a strict "log in with a password to use the site" policy is NOT in effect will contain tons of blank records and near duplicates. But making users do the extra work of selecting a user name and password raises a barrier, which means we get a lot less data.

The solution I have in place:

  1. gathers data from a web form into one set of tables
  2. cleans up that data
  3. merges cleaned data into another table for other work

I'm looking to automate steps 2 and 3.
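For what it's worth, here is a minimal sketch of how steps 2 and 3 could be automated. It is only an illustration in Python with an in-memory SQLite database, not the setup described above: the table names `raw_signups` and `contacts`, the two columns, and the rule that a record needs both a name and an address are all assumptions made for the example.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with hypothetical staging and target tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_signups (name TEXT, address TEXT)")
    # The UNIQUE constraint lets the database reject exact duplicates for us.
    conn.execute(
        "CREATE TABLE contacts (name TEXT, address TEXT, UNIQUE (name, address))"
    )
    return conn


def tidy(value):
    """Collapse runs of whitespace and treat NULL as an empty string."""
    return " ".join((value or "").split())


def clean_and_merge(conn):
    """Step 2: clean the staged rows. Step 3: merge them into contacts."""
    rows = conn.execute("SELECT name, address FROM raw_signups").fetchall()
    for name, address in rows:
        name, address = tidy(name), tidy(address)
        if not name or not address:
            continue  # assumption: blank or half-filled records are dropped
        conn.execute(
            "INSERT OR IGNORE INTO contacts (name, address) VALUES (?, ?)",
            (name, address),
        )
    conn.commit()
```

This only catches exact duplicates once whitespace is tidied up. Near duplicates still need some kind of standardization before the merge.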

Re: Re: Re: Merge/Purge address data
by johndageek (Hermit) on Nov 11, 2003 at 18:00 UTC
    Depending on what your site offers, you may want to let users pick a "code" that they can enter to fill in their name and address automatically.

    This of course assumes the user has some reason to fill in address information with some attempt at accuracy.

    Just some thoughts on how dageek might try it: try to create some standardized wording. Split on white space, then replace MD with DR in name fields (you will build up a list of others). In the address fields, replace NORTH with N., SOUTH and SO with S., and so on.

    Now compare the words and numbers for an exact match.

    Store "standardized" name and address in main database for future compares.
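
    A rough sketch of the standardize, compare, and store idea, written in Python for illustration. The replacement tables hold only the examples mentioned above and would need to grow. Uppercasing everything, ignoring trailing periods and commas during lookup, and the helper names `standardize` and `add_if_new` are assumptions of this sketch, not a tested recipe.

```python
# Replacement lists taken from the examples above; extend them as you go.
NAME_WORDS = {"MD": "DR"}
ADDRESS_WORDS = {"NORTH": "N.", "SOUTH": "S.", "SO": "S."}


def standardize(text, replacements):
    """Uppercase, split on white space, and swap known words for standard forms."""
    words = []
    for word in text.upper().split():
        key = word.strip(".,")  # so "So." and "So" look up the same entry
        words.append(replacements.get(key, word))
    return " ".join(words)


def add_if_new(seen, name, address):
    """Store the standardized pair in `seen`; return False if it was already there."""
    key = (standardize(name, NAME_WORDS), standardize(address, ADDRESS_WORDS))
    if key in seen:
        return False  # exact match on the standardized words and numbers
    seen.add(key)
    return True
```

    Here `seen` is just a Python set standing in for the standardized columns in the main database. Word order still matters, so "Dr John Smith" and "John Smith MD" would not match without further rules.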

    You have a good piece of work ahead of you no matter what method you choose - Good Luck!

    dageek