I agree it's horrible... and unfortunately, I can be sure of very little regarding the formatting. I see a number of records that put a comma between city and state (which isn't really a big problem), and some that abbreviate state names as things like "MASS" and "WASH" (oh, joy).
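For the odd state abbreviations, one option is a small lookup table applied after dropping the optional comma. This is only a rough sketch: `normalize_state` and `%state_fix` are made-up names, it assumes the city/state portion has already been pulled out into its own string, and the table would need an entry for every variant that actually turns up in the data.

```perl
use strict;
use warnings;

# Hypothetical table of non-standard abbreviations mapped to the
# two-letter postal codes. Extend it as new variants are found.
my %state_fix = (
    MASS => 'MA',
    WASH => 'WA',
);

# Takes something like "Boston, MASS" and returns ("Boston", "MA").
# Assumes the state is the last token; the comma is optional.
sub normalize_state {
    my ($city_state) = @_;
    $city_state =~ s/^\s+|\s+$//g;    # trim leading/trailing whitespace
    my ($city, $state) = $city_state =~ /^(.*?)\s*,?\s*([A-Za-z.]+)$/
        or return;                    # empty list if nothing usable
    $state =~ tr/.//d;                # "Mass." -> "Mass"
    $state = uc $state;
    $state = $state_fix{$state} if exists $state_fix{$state};
    return ($city, $state);
}
```

Anything not in the table passes through upper-cased, so unknown variants stay visible and can be added later.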

Thanks for the good wishes. . .

Re^3: Data Salad Address Problem
by socketdave (Curate) on Jul 28, 2005 at 15:02 UTC
    You're basically going to have to enumerate the different possibilities and allow for each one individually. I was able to get the ZIP codes accurately from your sample data:

    unless ( ($zip) = ($field5 =~ /(\d{5}-\d{4})/) ) {
        unless ( ($zip) = ($field5 =~ /(\d{5})/) ) {
            unless ( ($zip) = ($field4 =~ /(\d{5}-\d{4})/) ) {
                ($zip) = ($field4 =~ /(\d{5})/);
            }
        }
    }


    but that's already pretty nasty...
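The same fallback order can also be written as a loop over fields and patterns, which is a bit easier to extend when another variant shows up. A sketch only: `find_zip` is a hypothetical helper name, and the strings in the usage line are made-up values standing in for `$field5` and `$field4`.

```perl
use strict;
use warnings;

# Patterns in priority order: ZIP+4 first, then a plain 5-digit ZIP.
my @zip_patterns = ( qr/(\d{5}-\d{4})/, qr/(\d{5})/ );

# Try each field in the order given and, within a field, each pattern
# in priority order. Returns the first capture, or undef if none match.
sub find_zip {
    my @fields = @_;
    for my $field (@fields) {
        next unless defined $field;
        for my $re (@zip_patterns) {
            return $1 if $field =~ $re;
        }
    }
    return;
}

# Made-up sample values; in the real script this would be
# find_zip($field5, $field4).
my $zip = find_zip('MA 02139-4307', 'Cambridge');
```

Passing the fields in the order `$field5`, `$field4` keeps the same precedence as the nested version.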
      Actually, I see that the third record from the bottom has a 5-digit ZIP code with no dash and no four-digit extension... It could be that we need to make the second part optional... Yeah, oh joy...
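Making the second part optional could look something like this. It's only a sketch: `extract_zip` is a made-up name, and the `\b` anchors are an addition so that a longer run of digits isn't mistaken for a ZIP.

```perl
use strict;
use warnings;

# Five digits, optionally followed by a dash and four more digits.
# Returns the ZIP (with the +4 part when present), or undef if none is found.
sub extract_zip {
    my ($text) = @_;
    return $text =~ /\b(\d{5}(?:-\d{4})?)\b/ ? $1 : undef;
}
```

One caveat: the regex takes the leftmost match, so a 5-digit street number earlier in the same string would win over the real ZIP. Applying it only to the trailing fields should avoid most of that.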

      --------------------------------
      An idea is not responsible for the people who believe in it...