Fellow Monks,
I am faced with the challenge of extracting clients' names and addresses from a bunch of Word documents.
I concluded that processing raw text would be easier than parsing a Word-formatted document, so I use Win32::OLE to open each document and save it as text only. Now I come to the part where I extract the address data, and before I start I would like to ask for some advice.
Has anyone done something similar before? The obvious choice would be a regex, but the format of a name can vary considerably (consider MR and Mrs D.M Smith, Mrs & Mr D Smith-Brown, etc.), and an address can vary even more. Searching CPAN, I found modules such as Geo::PostalAddress and Lingua::EN::AddressParse that do something similar, but they do not 'extract' an address from a raw text document. So before I re-invent the wheel: has this been done before?
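To make the regex route concrete, here is a rough sketch of one way it might look. It is an illustration under assumptions, not a tested solution. It assumes each address sits on consecutive lines, ends with a line containing a postcode, and starts with a line opening with a title. The postcode pattern is a simplified shape rather than the full official rules, and `extract_addresses`, `$POSTCODE`, `$TITLE` and the six-line cap are all hypothetical names and values chosen for the example.

```perl
use 5.014;    # needed for the /r substitution modifier
use strict;
use warnings;

# Simplified UK postcode shape (assumption: not the full official rule set).
my $POSTCODE = qr/\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b/;

# Titles that may open a name line; extend as the data requires.
my $TITLE = qr/^\s*(?:Mr|Mrs|Ms|Miss|Dr)\b\.?/i;

# Return a list of array refs, each holding the name line followed by
# the address lines, ending with the line that contains the postcode.
sub extract_addresses {
    my ($text, $max_lines) = @_;
    $max_lines //= 6;    # hypothetical cap on how far to look upwards
    my @lines = split /\r?\n/, $text;
    my @found;

    for my $end (0 .. $#lines) {
        next unless $lines[$end] =~ $POSTCODE;

        # Walk upwards from the postcode line looking for a name line.
        my $lowest = $end - $max_lines;
        $lowest = 0 if $lowest < 0;
        for (my $start = $end - 1; $start >= $lowest; $start--) {
            last if $lines[$start] !~ /\S/;    # a blank line ends the block
            if ($lines[$start] =~ $TITLE) {
                my @block = map { s/^\s+|\s+$//gr } @lines[$start .. $end];
                push @found, \@block;
                last;
            }
        }
    }
    return @found;
}
```

Calling `extract_addresses($text)` on the saved text of one document returns zero or more blocks. Anchoring on the postcode first and then searching upwards for the name avoids having to describe the street and town lines in a regex at all, though it will miss addresses with no postcode or no title.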
If you have faced a similar problem, could you advise on how to resolve it?
In reply to Extracting a (UK) Address by ropey