Unless you want your command entry to quickly become the main focus of your game, you might consider using a simplified grammar and lexicon that shoots for about 90% interpretation accuracy at about 70% precision. Just make sure you know what your verbs (V), nouns (N), and adjectives (A) are.

As far as syntax is concerned, English gives you an advantage for commands: they start with a verb and take zero or more arguments that have something to do with the verb's action. (I imagine you are *doing* stuff in your game, so I wouldn't bother accounting for stative verbs.)

Your command boils down to:

create(mouse)
where mouse(small, white)

So you have the combinations V(N) and N(A,A)... and there you have your objects and their properties. You might also want some stemming, so the Lingua:: modules on CPAN (Lingua::Stem::En) should be of some help.
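
Here is a rough sketch of that V(N) / N(A,A) breakdown, in Python just to keep it short. The lexicon, the stopword list, and the name `parse_command` are all made up for illustration, and it only handles a single noun argument with no stemming, so treat it as a starting point rather than a real parser.

```python
# Hypothetical toy lexicon: word -> category (V, N, or A). Assumed for illustration.
LEXICON = {
    "create": "V",
    "mouse": "N",
    "small": "A",
    "white": "A",
}
# Articles carry no information for this grammar, so they are dropped.
STOPWORDS = {"a", "an", "the"}


def parse_command(text):
    """Parse 'V A* N' into (verb, noun, adjectives).

    Returns noun=None when the verb has no argument.
    Raises ValueError for anything the toy grammar cannot interpret.
    """
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    if not words or LEXICON.get(words[0]) != "V":
        raise ValueError("a command must start with a known verb")

    verb, noun, adjectives = words[0], None, []
    for word in words[1:]:
        category = LEXICON.get(word)
        if category == "A" and noun is None:
            adjectives.append(word)   # adjectives precede the noun they modify
        elif category == "N" and noun is None:
            noun = word               # V(N): the verb's single argument
        else:
            raise ValueError(f"cannot interpret {word!r}")
    return verb, noun, adjectives


print(parse_command("create a small white mouse"))
```

The returned tuple maps straight onto the notation above: the verb with its noun is V(N), and the noun with its adjective list is N(A,A).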

Each verb needs its arguments defined. Your example of "create" takes one argument type, which is whatever you are creating. You will also have to check that the thing you are creating is creatable, i.e. your lexicon will have to know which actions apply to which nouns.

You might also need to account for movement. Luckily, movement is well-studied and very well-formalized: you move FROM (LOCATION) VIA (LOCATION) TO (LOCATION). Here, you can use keywords such as "from", "via", and "to" to map your path.
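
To make that a bit more concrete, here is a minimal sketch of both ideas in Python. Everything in it is an assumption for illustration: the `ACTIONS_FOR_NOUN` table, the keyword list, and the function names are mine, and your game's lexicon will certainly look different.

```python
# Hypothetical lexicon entries: which verbs may apply to which nouns.
ACTIONS_FOR_NOUN = {
    "mouse": {"create", "move"},
    "mountain": {"climb"},   # assumed not creatable in this toy world
}

# Keywords that open each slot of a FROM / VIA / TO path (assumed list).
PATH_KEYWORDS = {"from": "FROM", "via": "VIA", "through": "VIA", "to": "TO"}
ARTICLES = {"a", "an", "the"}


def can_apply(verb, noun):
    """True if the lexicon says this verb is allowed on this noun."""
    return verb in ACTIONS_FOR_NOUN.get(noun, set())


def parse_movement(text):
    """Map the words after each path keyword to its slot.

    Words before the first keyword (such as the verb) are ignored.
    """
    path, slot = {}, None
    for word in text.lower().split():
        if word in PATH_KEYWORDS:
            slot = PATH_KEYWORDS[word]
            path[slot] = []
        elif slot is not None and word not in ARTICLES:
            path[slot].append(word)
    return {name: " ".join(words) for name, words in path.items()}


print(can_apply("create", "mouse"))
print(parse_movement("go from the cellar via the stairs to the kitchen"))
```

Keeping the "which actions apply" table on the noun side is only one option; you could just as well hang a list of acceptable argument types off each verb.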

I've got to run now, but I'll be more than happy to respond in more detail later. (I came back.)

You might want to look at http://citeseer.nj.nec.com/ and search for "parsing english". You'll find lots of academic articles, and somewhere in there you're bound to find some introductory material. If that fails, look for a copy of Natural Language Understanding by James Allen of the University of Rochester.

Update: Fixed some sloppy grammar, added more detail.

--
Damon Allen Davison
http://www.allolex.net


In reply to Re: Parsing english by allolex
in thread Parsing english by wolis
