I really haven't played with geocoding beyond the manual tweaking stage. The only two things that have really jumped out at me are numeric street names (which a mapping hash might help with) and dropping the street extension (i.e., sometimes people say "Haste Ave", but it's really "Haste Street"; if you drop the "Ave" and just look for "Haste", it might figure out what you want).
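Here's a rough sketch of those two rules, written in Python only to keep it short (a Perl hash and a regex would do the same job). The `ORDINAL_WORDS` and `STREET_SUFFIXES` tables and the `street_variants` name are made up for illustration and are far from complete; whether your geocoder wants "5th" or "Fifth" is also an assumption you'd have to check against your own data.

```python
# Hypothetical, deliberately incomplete lookup tables; extend them from your own data.
ORDINAL_WORDS = {"1st": "First", "2nd": "Second", "3rd": "Third",
                 "4th": "Fourth", "5th": "Fifth"}
STREET_SUFFIXES = {"ave", "avenue", "st", "street", "rd", "road",
                   "blvd", "dr", "ln", "way"}


def street_variants(street):
    """Return the street as given, followed by looser spellings to retry."""
    variants = [street]

    # Rule 1: spell out numeric street names via the mapping table.
    words = street.split()
    spelled = [ORDINAL_WORDS.get(w.lower(), w) for w in words]
    if spelled != words:
        variants.append(" ".join(spelled))

    # Rule 2: for each spelling so far, also try it without the trailing
    # street-type word ("Haste Ave" -> "Haste").
    for variant in list(variants):
        parts = variant.split()
        if len(parts) > 1 and parts[-1].lower().rstrip(".") in STREET_SUFFIXES:
            variants.append(" ".join(parts[:-1]))

    return variants
```

The idea would be to try each variant in order and stop at the first one the geocoder can locate, so the original spelling always gets the first shot.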

I imagine that if I were really doing a lot of geocoding in bulk, I would run some tests to feed a huge number of addresses in and generate two logs: addresses that can't be parsed, and addresses that can be parsed but not located. Then I would manually review the lists (separately) and try to find patterns, pick a few examples, and see if an easy rule fixes those examples. If so, try it on all addresses that match the pattern, and add that rule to your code base.
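The two-log idea might look something like this. It's a sketch, not tested against any real geocoder: `parse` and `locate` are hypothetical callables you'd wrap around whatever library you use, and I'm assuming each returns `None` on failure.

```python
def triage(addresses, parse, locate):
    """Split addresses into (located, unparsed, unlocated) lists.

    parse(address) and locate(parsed) are caller-supplied stand-ins for the
    real geocoding calls; both are assumed to return None when they fail.
    """
    located, unparsed, unlocated = [], [], []
    for address in addresses:
        parsed = parse(address)
        if parsed is None:
            unparsed.append(address)      # log 1: could not be parsed
        elif locate(parsed) is None:
            unlocated.append(address)     # log 2: parsed, but not located
        else:
            located.append(address)
    return located, unparsed, unlocated
```

Write `unparsed` and `unlocated` to separate files and review them separately. When you come up with a candidate rule, rerun `triage` on just the failures with the rule applied and see how many move into `located`.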

Note also the update to my original reply ... take a good hard look at that method. It says it will apply the individual pieces of the address one at a time, and skip any that would result in no matches, which may be helpful in figuring out which test causes your problem addresses to fail. (You might have to add some logging to it, or pull out the logic into your own method, but it might give you a good starting point.)
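If you do pull the logic out into your own method, the general shape I have in mind is below. To be clear, this is not the Geo::Coder::US code, just my reading of "apply each piece, skip the ones that leave nothing" with logging bolted on; the `filter_stepwise` name and the field names in the example are invented.

```python
import logging

log = logging.getLogger("geocode.filter")


def filter_stepwise(candidates, constraints):
    """Narrow candidates one (field, value) constraint at a time.

    A constraint that would leave no candidates is skipped and recorded,
    so you can see which piece of the address caused the mismatch.
    Returns (remaining_candidates, skipped_constraints).
    """
    skipped = []
    for field, value in constraints:
        narrowed = [c for c in candidates if c.get(field) == value]
        if narrowed:
            candidates = narrowed
        else:
            log.info("skipping %s=%r: no candidates would remain", field, value)
            skipped.append((field, value))
    return candidates, skipped
```

For example, with candidate records for Haste St in Berkeley and Oakland and the constraints `[("name", "Haste"), ("type", "Ave"), ("city", "Berkeley")]`, the `type` constraint gets skipped and the Berkeley record is what's left. The `skipped` list is the interesting part for debugging: it tells you which piece of each problem address was the one that didn't fit.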

Lastly: if you've got a really large list of addresses that it can't find, contact missing {at} geocoder.us. It says right on their site that they are interested in hearing about legitimate addresses that can't be found; maybe they can spot the pattern and provide a fix for Geo::Coder::US.


In reply to Re^3: Looking for a cheap, Perl-friendly GeoCoding service by hossman
in thread Looking for a cheap, Perl-friendly GeoCoding service by sgifford
