
salutations, and thank you for the responses. based on them, we tried several approaches. we tried creating a Microsoft Access database and accessing it via Win32::OLE, but that was not convenient, because we would have to convert the txt file to mdb every time the dictionary changed. we also tried other ways, such as putting a line of the dict file into the hash only when it matches the user input, but that would not be very convenient for multiple-word inputs, which we plan to implement. so we concentrated on the hash approach, which seemed to be the best one, because looking up the value for a given key is very fast, even when the file has a very large number of lines. the only problem was generating the hash from the dict file, line by line, on every run. then we found the "Storable" module (is this name correct?). with it, the hash only needs to be generated once: it is stored in a file by the store() function and then retrieved from that file with retrieve():

#!c:\perl\bin\perl
use Storable; #use this for calling the hash storing functions.

my %dict;  # the hash

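# <DATA> reads the lines that follow a __DATA__ marker at the end of the script,
# so the script needs such a section with one dictionary entry per line.
while (my $line = <DATA>) {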
    chomp $line;
    $dict{$line}++;  # instead of ++ you could also assign some value...
}
store(\%dict, "hash.txt"); #store the hash in the file.

# retrieve the hash from the file "hash.txt". thus, it only needs to be generated once.
%dict = %{retrieve("hash.txt")};
my @inputs = qw(
foo
fooed
fooen
prefoo
postfoo
);

for my $input (@inputs) {
    print "found '$input' in lexicon\n" if exists $dict{$input};
}
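The script above still rebuilds and stores the hash on every run before retrieving it. A minimal sketch of the generate-once idea could look like the following, under some assumptions: the dictionary is a plain text file with one entry per line, and the helper names `build_dict` and `load_dict`, as well as the `.sto` cache suffix, are hypothetical and chosen only for illustration. The cache is rebuilt only when it is missing or older than the text file, so an edited dictionary is picked up without a manual conversion step.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(store retrieve);

# Build the lookup hash from a word list with one entry per line.
# (Hypothetical helper, not part of Storable.)
sub build_dict {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %dict;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';    # skip empty lines
        $dict{$line} = 1;
    }
    close $fh;
    return \%dict;
}

# Return a reference to the dictionary hash.
# The stored copy is used when it exists and is at least as new as the
# text file; otherwise the hash is rebuilt and stored again.
# (Hypothetical helper; the mtime comparison is an assumption of this sketch.)
sub load_dict {
    my ($txt, $cache) = @_;
    if (-e $cache && (!-e $txt || (stat $cache)[9] >= (stat $txt)[9])) {
        return retrieve($cache);
    }
    my $dict = build_dict($txt);
    store($dict, $cache);
    return $dict;
}

# Assumed usage: perl lookup.pl dict.txt foo prefoo
if (@ARGV) {
    my ($txt, @inputs) = @ARGV;
    my $dict = load_dict($txt, "$txt.sto");
    for my $input (@inputs) {
        print "found '$input' in lexicon\n" if exists $dict->{$input};
    }
}
```

Because file modification times usually have a resolution of one second, an edit made in the same second the cache was written would not be noticed; deleting the cache file forces a rebuild in that case.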

by using the very fast hash, and retrieving it from a previously generated file, the script runs much faster. so, again, thank you for the responses, they were very helpful for finding a good approach to our problem. note: we are neither professional linguists nor professional programmers (we can't work yet, anyway, because of our age), we do this just because we like it (both the programming part and the linguistic part). by the way, we are from Brazil, and we are twins (which explains the "we"s). salutations.

In reply to Re: reading dictionary file -> morphological analyser by Anonymous Monk
in thread reading dictionary file -> morphological analyser by pc2
