rkg has asked for the wisdom of the Perl Monks concerning the following question:

How do I check if a word is a plural? Lingua::EN::Inflect will pluralize a singular noun; can it (or something else?) be used to check plurality?

Thanks

rkg

Re: is a word plural?
by allolex (Curate) on Jan 22, 2004 at 22:56 UTC

    This problem is seriously non-trivial because to do this, you need to have a lexicon that lists all the forms and says whether they're singular or plural. There are a lot of irregular plural forms in English, plus there are a lot of ambiguous singular forms that a simple "is there an 's' at the end?" algorithm would need to account for (like "bus").

    Your best bet is to process the text you need the plural forms for via a POS (Part of Speech) Tagger like the Tree Tagger at the Department of Linguistic Processing at the University of Stuttgart. It's free as in beer, but you can't play with the code (which is C, so it runs very fast).

    There are other taggers out there, including one packaged with a bunch of Perl tools called Xlex (not the remedy for constipation). It's written in C++ and you can play with it online. They will send you the whole Xlex system if you ask them nicely, but you'll have to wait for them to respond to your e-mail.

    --
    Allolex

Re: is a word plural?
by hardburn (Abbot) on Jan 22, 2004 at 20:44 UTC

    Doesn't look like it, and this seems to be a highly non-trivial problem, too, even if you keep yourself to English (perhaps especially English, since English has some really bizarre rules). So the plural form of "car" is "cars", right? How about "virus"? Is that "virii" or "viruses"? (Major flame war item there.) Some programmer communities pluralize "regex" as "regexen", not "regexes". So this isn't just language-specific, but culture-specific, too.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    : () { :|:& };:

    Note: All code is untested, unless otherwise stated

Re: is a word plural?
by theguvnor (Chaplain) on Jan 22, 2004 at 20:59 UTC

    I would point you to Tom Christiansen's classic post about the difficulty of pluralising English.

    [Jon]

Re: is a word plural?
by DrHyde (Prior) on Jan 23, 2004 at 09:49 UTC
    Is "sheep" singular or plural? To figure that out, you need to see it used in context, and understand that context. Compare:
    • My sheep is woolly
    • My sheep are woolly
    If I have many people called "Edward", you can pluralise it by saying:
    • There are many Edwards here
    So can we then conclude that "James" is plural?

    Then consider the -en plurals - children, men, women, Vaxen and so on. So "vixen" should be plural, right?

    Basically, it can't be done without just encoding the dictionary.
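
    To make the failure concrete, here is a minimal sketch of a suffix-only check, written in Python purely for illustration. The function name looks_plural_naive is made up for this example and the rule (ends in -s or -en) is deliberately simplistic; it is not what any real module does.

```python
def looks_plural_naive(word):
    """Guess plurality from the ending alone: -s or -en (deliberately naive)."""
    w = word.lower()
    return w.endswith("s") or w.endswith("en")


# Words taken from the examples in this thread.
samples = ["cars", "children", "James", "vixen", "bus", "sheep"]
guesses = {word: looks_plural_naive(word) for word in samples}

# "cars" and "children" come out right, but "James", "vixen" and "bus"
# are wrongly flagged as plural, and "sheep" is never flagged even when
# it is used as a plural.
for word, guess in guesses.items():
    print(f"{word}: {'plural' if guess else 'not plural'}")
```

    Any fix for "vixen" or "James" ends up being a list of exceptions, which is the dictionary again.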

Re: is a word plural?
by artist (Parson) on Jan 23, 2004 at 04:47 UTC
    Why do you need to do that, and how big or unusual are your words? You can find the plural of every word in your dictionary with the help of that module and match your word against them.
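
    A rough sketch of that reverse lookup, in Python for illustration only. Everything here is an assumption: pluralize() is a toy stand-in for a real pluralizer such as the module mentioned in the question, and build_plural_index(), classify() and the tiny word list are hypothetical names and data invented for the example.

```python
def pluralize(noun):
    """Hypothetical toy stand-in for a real pluralizer; handles very few cases."""
    irregular = {"child": "children", "man": "men", "sheep": "sheep"}
    if noun in irregular:
        return irregular[noun]
    if noun.endswith(("s", "x", "ch", "sh")):
        return noun + "es"
    return noun + "s"


def build_plural_index(singular_nouns):
    """Map each generated plural form back to the singular(s) that produce it."""
    index = {}
    for noun in singular_nouns:
        index.setdefault(pluralize(noun), []).append(noun)
    return index


def classify(word, singular_nouns, plural_index):
    """Report what the word list can tell us about a word, without context."""
    is_singular = word in singular_nouns
    is_plural = word in plural_index
    if is_singular and is_plural:
        return "ambiguous"
    if is_plural:
        return "plural"
    if is_singular:
        return "singular"
    return "unknown"


# Assumed dictionary of singular nouns; a real one would be far larger.
nouns = {"car", "bus", "child", "sheep", "vixen"}
index = build_plural_index(nouns)

for word in ["cars", "bus", "children", "sheep", "vixen", "Edwards"]:
    print(word, "->", classify(word, nouns, index))
```

    The lookup is only as good as the word list and the pluralizer behind it. Words like "sheep" still come back as ambiguous, and anything outside the list (names, slang such as "regexen") comes back as unknown.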
Re: is a word plural?
by extremely (Priest) on Jan 23, 2004 at 18:29 UTC
    Franks franks Frank's franks. Chucks chucks Chuck's chucks. Bill's Bills bills bills.

    Those are all sentences. I'd hate to try and parse out the plural words with a generic piece of code.

    --
    $you = new YOU;
    honk() if $you->love(perl)