Re: Natural language text processing
by dws (Chancellor) on Jul 08, 2003 at 22:37 UTC
|
Does anyone know of software or methodology for comprehending natural language?
The old (and, alas, defunct) Perl Journal had several articles on natural language processing. Fortunately, these have been reprinted in Computer Science & Perl Programming: Best of TPJ, which contains many other fine articles.
You might also get some mileage out of Lingua::LinkParser.
| [reply] |
Re: Natural language text processing
by blokhead (Monsignor) on Jul 09, 2003 at 01:57 UTC
|
I'd start by examining how existing CPAN modules parse natural language. The modules in the Lingua::* namespace might provide some useful starting points.
Also, off the top of my head: Time::Human kinda does what you're talking about, but in the other direction. Date::Manip accepts a wide range of "natural language" input for dates, so that might also be a great start for you.
Update: From the Date::Manip POD:
use Date::Manip;   # exports ParseDate by default
$date = ParseDate("today");
$date = ParseDate("1st thursday in June 1992");
$date = ParseDate("05/10/93");
$date = ParseDate("12:30 Dec 12th 1880");
$date = ParseDate("8:00pm december tenth");
The range of input accepted by this module might eliminate a lot of work for you!
blokhead | [reply] [d/l] |
Re: Natural language text processing
by TomDLux (Vicar) on Jul 09, 2003 at 02:35 UTC
|
This is somewhat difficult because people in Britain use different phrases than people in the US, who differ from Canadians, who disagree with Australians, etc.
More exasperatingly, Bostonians use different expressions than Denverites, and Tennessee rural folk don't use Valley Speak---it gags them with a spoon.
Worst of all, every two to five years a new micro-generation comes along with a desperate craving to use expressions their parents won't understand.
In other words, you're doomed.
--
TTTATCGGTCGTTATATAGATGTTTGCA
| [reply] |
Re: Natural language text processing
by Corion (Patriarch) on Jul 09, 2003 at 07:17 UTC
|
Natural language processing can be manageable if you confine yourself to a particular problem domain. For example, one of my pet projects that eventually will see light is a module to parse "natural" descriptions for dates for a cron-style scheduler. The idea is that most of the phrases my module will have to recognize will either have been copied from the synopsis or will match some simple patterns:
- The third wednesday of the month
- The first monday after the first tuesday
- Every friday
For my case, there is no deep understanding necessary, as the only ordinal words that can occur are first, second, third, fourth, and fifth, the atomic phrases are simple, and the only composite phrases are atomic phrases joined by "before" and "after". There are some implicit assumptions: "of the month" is implicitly added if no month is given, and the next date in the future is selected (that is, a date lies either in the current month or in the month after that if no absolute date has been specified).
This is by no means a module that "understands" the text given, but with my external knowledge about the supposed content, it can extract and convert the data given.
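For illustration, here is a minimal sketch of the restricted grammar described above, using only core modules. It is not the module itself: the names parse_atomic, parse_phrase and nth_weekday are hypothetical, the shape of the returned hashes is an assumption, and the sketch neither resolves "before"/"after" to concrete dates nor implements the "next date in the future" rule.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Vocabulary of the restricted domain (assumed; all names here are hypothetical).
my %ORDINAL = (first => 1, second => 2, third => 3, fourth => 4, fifth => 5);
my %WEEKDAY = (sunday => 0, monday => 1, tuesday => 2, wednesday => 3,
               thursday => 4, friday => 5, saturday => 6);
my $ord_re = join '|', keys %ORDINAL;
my $day_re = join '|', keys %WEEKDAY;

# Parse one atomic phrase into a hash ref; return undef if it is not recognized.
sub parse_atomic {
    my ($text) = @_;
    return unless defined $text;
    if ($text =~ /^\s*every\s+($day_re)\s*$/i) {
        return { type => 'every', wday => $WEEKDAY{lc $1} };
    }
    # "of the month" is optional, mirroring the implicit assumption.
    if ($text =~ /^\s*(?:the\s+)?($ord_re)\s+($day_re)(?:\s+of\s+the\s+month)?\s*$/i) {
        return { type => 'nth', n => $ORDINAL{lc $1}, wday => $WEEKDAY{lc $2} };
    }
    return;
}

# Parse atomic phrases joined by "before" or "after" into a small tree.
sub parse_phrase {
    my ($text) = @_;
    my @parts = split /\s+(before|after)\s+/i, $text;
    my $node = parse_atomic(shift @parts) or return;
    while (@parts) {
        my $op  = lc shift @parts;
        my $rhs = parse_atomic(shift @parts) or return;
        $node = { type => $op, left => $node, right => $rhs };
    }
    return $node;
}

# Day of month of the Nth given weekday ($month is 1..12), or undef if none.
sub nth_weekday {
    my ($year, $month, $n, $wday) = @_;
    my $first      = timegm(0, 0, 0, 1, $month - 1, $year);
    my $first_wday = (gmtime($first))[6];
    my $offset     = ($wday - $first_wday) % 7 + 7 * ($n - 1);
    my @t          = gmtime($first + $offset * 86400);
    return $t[4] == $month - 1 ? $t[3] : undef;   # undef if we left the month
}

my $rule = parse_phrase('The third wednesday of the month');
printf "day %d\n", nth_weekday(2003, 7, $rule->{n}, $rule->{wday});   # prints "day 16"
```

Anything outside the vocabulary simply fails to parse (parse_phrase returns undef), which is the point of restricting the domain: the code recognizes a handful of patterns rather than "understanding" the text.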
perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web
| [reply] [d/l] |
|
|
* The third wednesday of the month
* The first monday after the first tuesday
* Every friday
Just in case you need a data structure to store this, you can use DateTime::Event::ICal from the datetime project.
| [reply] [d/l] |
|
|
Does it do Southern?
"A week come Sunday"? "Sunday week", "Thursday last", etc.
How about random stuff
"Three moons ago"
"Six fortnights"
This is hopeless.
| [reply] |
|
|
Re: Natural language text processing
by artist (Parson) on Jul 09, 2003 at 03:34 UTC
|
Along with the listed modules, try using Parse::RecDescent for the various date-time text formats that you may come across. You can keep adding more formats as you see newer variations.
artist | [reply] |
Re: Natural language text processing
by allolex (Curate) on Jul 12, 2003 at 05:32 UTC
|
I know it's one of those things that computers can't do really well, but if the subject is limited it can't be too hard.
Very funny :) For a start, have a look at the Natural Language Processing FAQ. If you'd like an introduction, have a look at James Allen's book.
--
Allolex
| [reply] |