I recently wrote a small search engine for a site.
It consists of two scripts: one that reads the
HTML pages to build an index (some db-files),
and a CGI script that queries this index.
When the user enters a word, the CGI script looks it up in a hash
table to get the corresponding ID.
Then it looks up that ID in another hash to find
the files containing that word.
Something like this:
$id = $Words_db{$word};
foreach $i (keys %Index_db) {
    if ($i == $id) {
        @fileId = split( /:/, $Index_db{$i});
        foreach $fId (@fileId) {
            # ...
        }
    }
}
It works just fine, thanks to hash tables.
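Since the ID is itself a key of %Index_db, the loop should be equivalent to a single direct lookup. Here is a minimal sketch of that, assuming the IDs are plain integers and each %Index_db value is a colon-separated list of file IDs. The sample data and the files_for_word helper are hypothetical stand-ins for the real db-files:

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the tied db-file hashes.
my %Words_db = ( man => 1, maniac => 2 );
my %Index_db = ( 1 => '10:11', 2 => '11:12' );

# Return the file IDs listed for a word, or an empty list if the word is unknown.
sub files_for_word {
    my ($word) = @_;
    my $id = $Words_db{$word};
    return () unless defined $id && exists $Index_db{$id};
    return split /:/, $Index_db{$id};
}

my @file_ids = files_for_word('man');
print "@file_ids\n";    # prints "10 11"
```

This costs two hash lookups per query instead of a pass over every key of %Index_db.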
Now I'd like to let users search with only part of a word
(e.g. "man" would match both "man" and "maniac").
For that, I'd have to modify the code. Something like:
my $piece;
foreach (keys %Words_db) {
    if ( ... ) {
        # $piece is a substring of $_
        ...
    } else {
        # $piece does not occur in $_
        ...
    }
}
I haven't tried that, because it seems too inefficient: every query would scan all the keys of %Words_db.
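For concreteness, here is a minimal sketch of the scan I have in mind, using Perl's built-in index() for the substring test. The sample data and the files_for_piece helper are hypothetical stand-ins, and I am assuming each %Index_db value is a colon-separated list of numeric file IDs:

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the tied db-file hashes.
my %Words_db = ( man => 1, maniac => 2, woman => 3, cat => 4 );
my %Index_db = ( 1 => '10:11', 2 => '11:12', 3 => '13', 4 => '14' );

# Return the sorted, de-duplicated file IDs of every indexed word
# that contains $piece anywhere inside it.
sub files_for_piece {
    my ($piece) = @_;
    my %seen;
    for my $word ( keys %Words_db ) {
        # index() returns -1 when $piece does not occur in $word
        next if index( $word, $piece ) < 0;
        my $list = $Index_db{ $Words_db{$word} };
        next unless defined $list;
        $seen{$_} = 1 for split /:/, $list;
    }
    return sort { $a <=> $b } keys %seen;
}

my @file_ids = files_for_piece('man');
print "@file_ids\n";    # prints "10 11 12 13"
```

Note that a plain substring test also matches "woman". Matching only words that start with the piece would need index( $word, $piece ) == 0 instead. Either way the cost grows linearly with the number of indexed words, which is what worries me.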
I'd be glad to hear your suggestions. Thank you.
Larsen