Re: Computer-readable thesaurus
by clemburg (Curate) on Oct 12, 2001 at 22:20 UTC
|
|
Additional Information:
Dan Brian, the author of the above-mentioned TPJ article, has also
founded the Linguana Project to produce an open source natural
language processing system based on WordNet, LinkParser, and other
applications that have not yet been developed.
You will find a paper about Linguana at Dan Brian's project page.
Hanamaki
Re: Computer-readable thesaurus
by MZSanford (Curate) on Oct 12, 2001 at 17:49 UTC
|
Rather than a dictionary/thesaurus, might I suggest Lingua::EN::Infinitive ... or just the Lingua namespace in general.
The requirements change because they don't know what they want, or how much they own you.
|
Lingua::Stem is probably the module you want to try.
Hanamaki
Re: Computer-readable thesaurus
by perrin (Chancellor) on Oct 12, 2001 at 18:11 UTC
|
Jon Bjornstad's program to help his disabled friend read includes an interactive dictionary, which I think he got from Project Gutenberg. Take a look.
Re: Computer-readable thesaurus
by cheshirecat (Sexton) on Oct 13, 2001 at 01:16 UTC
|
Moby Hyphenator
185,000 entries fully hyphenated
mhyph.tar.Z [980kB]
Moby Language
Word lists in five of the world's great languages
mlang.tar.Z [2.3MB]
Moby Part-of-Speech
230,000 entries fully described by part(s) of speech, listed in priority order
mpos.tar.Z [1.2MB]
Moby Pronunciator
175,000 entries fully International Phonetic Alphabet coded
mpron.tar.Z [3.1MB]
Moby Shakespeare
The complete unabridged works of Shakespeare
mshak.tar.Z [2.3MB]
Moby Thesaurus
30,000 root words, 2.5 million synonyms and related words
mthes.tar.Z [12MB]
Moby Words
610,000+ words and phrases. The largest word list in the world
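For anyone who wants to query the thesaurus from the command line, here is a minimal sketch. It assumes the unpacked thesaurus is a plain text file with one entry per line, the root word first and its synonyms after it, all separated by commas; check the actual file before relying on that. The function name `synonyms` and the file path you pass in are my own placeholders, not anything shipped with the Moby archives.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Moby distribution).
# Usage: synonyms WORD FILE
# Assumes each line of FILE looks like: root,synonym1,synonym2,...
synonyms() {
    word=$1
    file=$2
    # Compare the first comma-separated field to WORD exactly,
    # then print every remaining field on its own line.
    # Exit status is 0 if the root word was found, 1 otherwise.
    awk -F, -v w="$word" '
        $1 == w { for (i = 2; i <= NF; i++) print $i; found = 1 }
        END     { exit !found }
    ' "$file"
}
```

Something like `synonyms happy /path/to/thesaurus.txt` would then list one synonym per line. If the file turns out to use different line endings or separators, the `-F,` field separator is the part to adjust.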
The Cheshire Cat (...is back)
Re: Computer-readable thesaurus
by mischief (Hermit) on Oct 13, 2001 at 19:30 UTC
|
You might want to take a look at dict.org and the files on their ftp site. They have several databases available along with client and server software you can use for reference.
(tye)Re: Computer-readable thesaurus
by tye (Sage) on Oct 12, 2001 at 19:03 UTC
|
Re: Computer-readable thesaurus
by pjf (Curate) on Oct 12, 2001 at 18:51 UTC
|
Most *nix systems come with a dictionary of words, commonly in /usr/dict/words or /usr/share/dict/words.
The common spelling utility, ispell, also comes with its own dictionaries, although the format isn't quite as simple as that of /usr/dict/words. If you have ispell installed, then you might want to glance at /usr/lib/ispell or /usr/local/lib/ispell to see if you can spot them. (Look for .hash files).
These obviously don't contain word definitions or roots, just the words themselves. However, you can often infer the root word using English rules. The ispell source code would probably be a useful start here too, since it does exactly that.
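As a rough illustration of inferring a root from a plain word list, here is a minimal sketch. It is not ispell's affix handling, just naive suffix stripping: remove a common ending and accept the result only if it appears in the word list. The function name `guess_root` and the suffix list are my own assumptions, and it will miss cases like "running" (doubled consonant) or "making" (dropped "e").

```shell
#!/bin/sh
# Hypothetical helper, not taken from ispell.
# Usage: guess_root WORD [WORDLIST]
# WORDLIST defaults to /usr/share/dict/words (one word per line).
guess_root() {
    word=$1
    list=${2:-/usr/share/dict/words}
    # Try each suffix in order; the first stripped form that is
    # itself a whole line in the word list is reported as the root.
    for suffix in ing ed es s; do
        case $word in
            *"$suffix")
                candidate=${word%"$suffix"}
                if [ -n "$candidate" ] && grep -qxF -- "$candidate" "$list"; then
                    printf '%s\n' "$candidate"
                    return 0
                fi
                ;;
        esac
    done
    # No rule applied: print the word unchanged and signal "no root found".
    printf '%s\n' "$word"
    return 1
}
```

The `grep -qxF` call does a quiet, whole-line, fixed-string match, so the candidate is never treated as a regular expression. Real English morphology needs many more rules than this, which is why the ispell affix files are worth reading.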
Cheers,
Paul
Re: Computer-readable thesaurus
by Fletch (Bishop) on Oct 12, 2001 at 19:56 UTC
|
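# _gensearch is a helper that is not shown in this post.
# Note: in zsh, $0 inside a function expands to the function's name.
webster () {
    _gensearch $0 "http://www.m-w.com/cgi-bin/dictionary?va=" "$*"
}
thesaurus () {
    _gensearch $0 "http://www.m-w.com/cgi-bin/thesaurus?va=" "$*"
}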
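The functions above call a `_gensearch` helper that the post does not show. Purely as a guess at its shape, here is a hypothetical stand-in that takes a name, a base URL, and a query, and prints the resulting URL. It deliberately does not fetch anything; what the real helper did with the URL (open a browser, call a text-mode client, and so on) is unknown. The argument order is inferred from the calls above, and the space-to-plus encoding is my own simplification.

```shell
#!/bin/sh
# Hypothetical stand-in for the _gensearch helper; the original is not shown.
# Usage: _gensearch NAME BASE_URL QUERY
_gensearch() {
    name=$1
    base=$2
    query=$3
    # Refuse an empty query and print a usage line under the caller's name.
    if [ -z "$query" ]; then
        printf 'usage: %s word ...\n' "$name" >&2
        return 1
    fi
    # Minimal encoding only: spaces become "+".
    # This is NOT full percent-encoding of special characters.
    encoded=$(printf '%s' "$query" | tr ' ' '+')
    printf '%s%s\n' "$base" "$encoded"
}
```

With this stand-in, `_gensearch webster "http://www.m-w.com/cgi-bin/dictionary?va=" "hello world"` prints the search URL, which you can then hand to whatever browser you normally use.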
|
I'm pretty sure Merriam-Webster online would not
like people bypassing their ad revenue in this fashion.
I know I pay for the bandwidth on my site, and spend some time blocking agents that steal my content like that.
My advice is to contact publishers before 'using' their work.
Tiago
Update: I'm sorry if this was read as flame bait; that was not my intention. I'm not sorry to have brought up copyright and terms of service. This is not a technical issue, but a moral and sometimes legal one that developers should be aware of when making agents for the web.
For example, using most Finance::Quote:: modules is against the terms of service of the sites that provide this data.
|
I'm sure they'd also not like for people to use lynx,
which doesn't display ads. I'm sure they'd like for
people not to use junkbuster or other ad-blocking proxies.
I'm sure they'd like for people to mail them large
envelopes full of cash.
But they've put up their content on a publicly accessible
web site. They're perfectly welcome (as are you) to implement
whatever technological means they like to restrict access (of course
most of those won't stop a truly determined person with
the right know-how, but that's another issue :).
But I see little reason to ask for permission to provide
a URL which any webmonkey worth his bananas could deduce
in under a minute with just a browser's `View Source'
functionality. That URL does not magically give you any
more access to their content than the form on their front
page, just more convenient access.
But this is getting off topic from the original question
at hand.