Hi guys

I need to parse addresses. Unfortunately, each address is a plain text string without any separators. Here are some examples:

From each string I need to extract the address, city and postal code. So for the given strings it would be:

Getting the postal code is not so hard:

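# Assumes the postal code is the last two whitespace-separated tokens.
# (A quantified group like (\w+){2} would keep only its final repetition.)
my ($postal) = $dirty_address =~ m/(\w+\s+\w+)\s*$/;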

My main problem is separating the city name from the street name, because the city name may be several words. Can you suggest something? Maybe there is a module that does this?

At the moment my only idea is to scan backwards from the postal code for the first occurrence of a street prefix (St, Rd, Drv). Once I find it, the part between the prefix and the postal code is the city name. But I don't know all the prefixes :-(
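
The idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the list of street prefixes is hypothetical and incomplete, `split_address` is a made-up helper name, the sample address is invented, and the postal code is assumed to be the last two whitespace-separated tokens.

```perl
use strict;
use warnings;

# Hypothetical, incomplete list of street prefixes; extend as needed.
my @street_types = qw(St Rd Drv Ave Blvd Ln);
my $types_re     = join '|', map { quotemeta } @street_types;

# split_address() is an illustrative helper, not part of any module.
# Returns (street, city, postal) or an empty list if nothing matches.
sub split_address {
    my ($dirty_address) = @_;

    # Peel the postal code (assumed: last two tokens) off the end.
    my ($rest, $postal) = $dirty_address =~ m/^(.*?)\s+(\w+\s+\w+)\s*$/
        or return;

    # The greedy .* settles on the LAST street prefix in the string,
    # i.e. the first one met when scanning back from the postal code.
    my ($street, $city) = $rest =~ m/^(.*\b(?:$types_re)\b\.?)\s+(.+)$/i
        or return;

    return ($street, $city, $postal);
}

# Invented sample input, for illustration only.
my ($street, $city, $postal) = split_address('123 Main St Thunder Bay P7B 5E1');
print "street=$street; city=$city; postal=$postal\n";
```

One known weakness of this approach: a city whose name itself starts with a prefix word (for example a name beginning with "St") would be split in the wrong place, so the result still needs checking against a list of known cities.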

P.S. These are Yellowpages search results.


In reply to Parsing addresses by Gangabass
