I've been working on a school project that combines linguistics and programming: I'm trying to write functions (or perhaps subroutines now) that return poetic devices. So far I've been implementing the project in PHP. The user has to give the program very little input, so the browser GUI is not at all important.

What's most important is the efficiency of database queries. The program uses IPhOD, a phonotactic database that gives information about syllables and sounds, which I use to return lists of rhymes, and I plan to use WordNet and VerbNet extensively. I've already edited WordNet's table of morphological exceptions, adding the type of inflection to each entry, so that my morphology function works. I've also read a lot about OpenCyc, but I've had no success using it. As I imagine it, OpenCyc could return fundamental information about WordNet synsets that the program could use to build intelligent metaphors, and from what I've read that is possible.
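
As a rough illustration of the rhyme lookup described above, here is a minimal sketch in Python (neither the PHP in use nor Perl). Everything in it is an assumption made for illustration: the tiny pronunciation table, the ARPAbet-style phoneme strings with a trailing 1 for primary stress, and the function names are hypothetical and do not reflect IPhOD's actual schema. The idea is to compute a "rhyme key" once per word so that each lookup is a single indexed fetch rather than a scan, which is the kind of thing that matters when query efficiency is the main concern.

```python
# Toy pronunciation table: word -> list of ARPAbet-style phonemes, where a
# trailing "1" marks primary stress. Hypothetical data, NOT IPhOD's schema.
PRONUNCIATIONS = {
    "light": ["L", "AY1", "T"],
    "night": ["N", "AY1", "T"],
    "delight": ["D", "IH0", "L", "AY1", "T"],
    "letter": ["L", "EH1", "T", "ER0"],
    "better": ["B", "EH1", "T", "ER0"],
    "late": ["L", "EY1", "T"],
}


def rhyme_key(phonemes):
    """Return the phonemes from the last primary-stressed vowel onward."""
    for i in range(len(phonemes) - 1, -1, -1):
        if phonemes[i].endswith("1"):
            return tuple(phonemes[i:])
    return tuple(phonemes)  # no stress mark: fall back to the whole word


def build_rhyme_index(pronunciations):
    """Group words by rhyme key so a lookup is one dictionary access."""
    index = {}
    for word, phonemes in pronunciations.items():
        index.setdefault(rhyme_key(phonemes), []).append(word)
    return index


def rhymes_for(word, pronunciations, index):
    """Return the other words sharing this word's rhyme key, sorted."""
    if word not in pronunciations:
        return []
    key = rhyme_key(pronunciations[word])
    return sorted(w for w in index.get(key, []) if w != word)


RHYME_INDEX = build_rhyme_index(PRONUNCIATIONS)
```

In a database the same idea would probably be a precomputed, indexed rhyme-key column, so that finding rhymes becomes a single equality query; whether that suits the real IPhOD tables is something only the actual schema can settle.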

If anyone has experience in linguistics, it's likely that they've heard of systemic grammar. My plan at first was to generate poems on which an adaptation of the Turing test could be run, but now I'm very worried that the project is too big (hence the new goal: functions as poetic devices). I wanted to incorporate systemic functional grammar into my program to generate phrase structure, the backbone of the lines.

The reason I'm giving this background information is that I'm seriously thinking about reimplementing the program in Perl. I'm very new to Perl, but the code is visually very similar to PHP, and so the switch would be easy. What benefits might I gain from such a reimplementation? What modules might offer help? And if anyone has any experience in incorporating OpenCyc into Perl that would be more than wonderful, or rather if anyone has experience with programming any of this linguistics stuff, I'd really appreciate the info, because my adviser has no clue what I'm doing.

Thanks for any help you can give.

-Justin

Re: Poetry in the Machine
by GrandFather (Saint) on Oct 07, 2007 at 20:30 UTC

    Have a dig around CPAN. There are modules in the Lingua namespace that may help (although I suspect you are a step or two beyond what they are aimed at). There are, however, a lot of modules aimed at facilitating WordNet access, and a small number for VerbNet.


    Perl is environmentally friendly - it saves trees
Re: Poetry in the Machine
by Gavin (Archbishop) on Oct 07, 2007 at 20:36 UTC
Re: Poetry in the Machine
by roboticus (Chancellor) on Oct 08, 2007 at 04:17 UTC
    framboise:

    Why are you tempted to rewrite it in perl? Don't get me wrong--I *love* perl. But I hate to rewrite anything when I have it already written.

    I don't remember much PHP (it's been about 7 years, so anything I remembered would be out of date anyway), so I don't know what benefits you'd gain. But all languages have a few potholes here and there. So when you rewrite it, unless you're going to rearchitect bits of it as you go, you'll find out that you'll have to dodge a few perl potholes, and your code would be littered with the remains of translated PHP potholes. That (and the time investment) would probably offset any gains perl might give you.

    Just askin'.....

    ...roboticus

      Most of the work that I've put into the project has been research and databasing, so the PHP I've written isn't so extensive that it would be incredibly hard to rewrite, and most of it will need revision as I try to incorporate it into a larger system.

      Perl is more dynamic than the PHP that's locked up in my server, and from what I've read it can probably be integrated more easily into other systems. This might become more of an issue later if I work with systemic grammars. Perl modules will allow me to cut some corners, and I'm all for that. I already see that there are some modules that do almost exactly what my stuff does. Lastly, although it didn't seem like much of a factor at first, the Perl community is more centralized and geared towards this type of work. I found very little documentation on NLP+PHP, because that's just not what PHP is about.
Re: Poetry in the Machine
by mr_mischief (Monsignor) on Oct 09, 2007 at 12:49 UTC
    Not only are there some strong NLP people in the Perl community, but the language was designed by a man educated as a linguist. The initial role of Perl was text processing. It has grown well beyond just that, but it still handles text better, faster, and more comfortably for the programmer than many other languages.

    I'm not personally very well versed in the linguistics modules on CPAN, but I've heard very good things about some of them. I've looked briefly at several because they fascinate me (I intended to be a linguist myself at one point), and from what I can tell there is much more of value for NLP on CPAN than there is for PHP. In the interests of full disclosure, though, Python and Java also seem to be well represented in the NLP arena.

    As long as you're looking at alternatives, I'd suggest looking at http://opennlp.sourceforge.net/projects.html, where a number of Open-Source NLP packages are described.