in reply to Re: Common Words, Perl Keywords
in thread Common Words, Perl Keywords

I'd like to question you about your methods and sources. First of all, where did you get these words from? How big was the text source? What were your criteria for defining what "word" means?

I got them from an about.com site for teachers of English.

I have no idea of the list's provenance or the size of the sample.



($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss') =~y~b-v~a-z~s; print

Re: Re: Re: Common Words, Perl Keywords
by allolex (Curate) on Nov 26, 2003 at 00:12 UTC

    The source is attributed to "Jerry Jones" (big help). Anyone interested in looking at the list can see it here. It's supposed to be North American English, if anyone cares.

    --
    Allolex

Re: Re: Re: Common Words, Perl Keywords
by halley (Prior) on Dec 01, 2003 at 15:33 UTC
    The Moby Lexicon project, now concluded, offers several different slices of the dictionary. It identified the most common words in a couple of different samples and ranked them by prevalence. This kind of data is very useful for certain kinds of search analysis: a match that hits a less common word can be ranked higher than a match on mundane words. I was doing some work on protocol compression and canonical word numbering as well. The Moby Lexicon can be found with Google, and it has other goodies such as parts of speech, hyphenation, common personal names by gender, and a few studies of other languages.
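
    A minimal sketch of that ranking idea, for anyone who wants to play with it. The %rank table, its numbers, the fallback rank for unknown words, and the score() helper are all made up for illustration; a real table would be loaded from a frequency list such as the ones mentioned in this thread.

```perl
use strict;
use warnings;

# Hypothetical frequency ranks (1 = most common). These numbers are
# invented for the example, not taken from any real word list.
my %rank = (the => 1, of => 2, and => 3, perl => 5000, lexicon => 20000);

# Score a text against a query: every query word found in the text adds
# log(rank) points, so a hit on a rare word outweighs a hit on a common one.
sub score {
    my ($text, @query) = @_;

    # Collect the distinct lowercased words of the text.
    my %seen;
    $seen{lc $_} = 1 for $text =~ /\w+/g;

    my $score = 0;
    for my $word (@query) {
        my $w = lc $word;
        next unless $seen{$w};
        # Assumption: words missing from the table are treated as rare.
        my $r = exists $rank{$w} ? $rank{$w} : 100_000;
        $score += log($r);
    }
    return $score;
}

my @docs = (
    'The end of the road and the sea',
    'The history of the Moby Lexicon',
);
my @query = qw(the lexicon);

# List the documents best match first.
for my $doc (sort { score($b, @query) <=> score($a, @query) } @docs) {
    printf "%6.2f  %s\n", score($doc, @query), $doc;
}
```

    Taking the logarithm of the rank is just one convenient weighting: the most common word contributes nothing, and the weight grows slowly as words get rarer. Any function that increases with rarity would serve the same purpose.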

    --
    [ e d @ h a l l e y . c c ]