in reply to Finding dates in unstructured text
Care to be more specific on what looks like a date?
20110112 ?
jan twelve ?
12 Janvier ?
XII 1 MMXI ?
OK, the last couple were a bit silly, but they illustrate the point. If you have a clear idea of what a date looks like, then a series of regular expressions is probably the way to go.
If not, then I would start by training a Bayesian classifier, e.g. Algorithm::NaiveBayes, to find the bits of text that look like dates, and then use those hits as examples to write regular expressions from.
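To make the regular-expression route concrete, here is a minimal sketch in core Perl. It assumes you have already decided that only three shapes count as dates: a compact YYYYMMDD, "day month" and "month day". The names @patterns and find_dates are made up for illustration, and the month pattern only knows English abbreviations (it happens to catch "Janvier" because it starts with "jan", and it will not catch spelled-out days like "jan twelve").

```perl
use strict;
use warnings;

# Assumption: months are spotted by their English three-letter prefix.
my $month = qr/(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*/i;
my $day   = qr/(?:0?[1-9]|[12]\d|3[01])/;

# Hypothetical pattern set: one labelled regex per date shape we accept.
my @patterns = (
    [ compact   => qr/\b(?:19|20)\d{2}(?:0[1-9]|1[0-2])(?:0[1-9]|[12]\d|3[01])\b/ ],
    [ day_month => qr/\b$day\s+$month\b/ ],
    [ month_day => qr/\b$month\s+$day\b/ ],
);

# Return a list of [label, matched text] pairs, grouped by pattern order.
sub find_dates {
    my ($text) = @_;
    my @hits;
    for my $p (@patterns) {
        my ($label, $re) = @$p;
        while ($text =~ /($re)/g) {
            push @hits, [ $label, $1 ];
        }
    }
    return @hits;
}

print join(': ', @$_), "\n"
    for find_dates('Shipped 20110112, invoiced 12 Jan 2011.');
```

Each new shape you decide to accept becomes one more entry in @patterns, which keeps the "what looks like a date?" decision in one place. Note that this only finds candidate strings; it does not check that the date is valid (e.g. 20110231 would pass).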
Re^2: Finding dates in unstructured text
by educated_foo (Vicar) on Jan 12, 2011 at 23:58 UTC