Once your list of dictionary words has been read from its file and cleaned up (whitespace, newlines, etc.), the next step is to realize that very fast lookup of this sort can be had from a hash:
    c:\@Work\Perl>perl -wMstrict -le
    "use Data::Dump;
     ;;
     my @cleanwords = qw(hut Hat foo HIC);
     my @allwords = qw(hit Het HAT HiC hAc hoc);
     ;;
     my %dict = map { $_ => 1 } map canonicalize($_), @allwords ;
     dd \%dict;
     ;;
     for my $word (@cleanwords) {
       my $common = canonicalize($word);
       printf qq{word '$word' %sin dictionary \n},
         exists $dict{$common} ? '' : 'NOT ';
       }
     ;;
     sub canonicalize { return lc $_[0]; }
     "
    { hac => 1, hat => 1, het => 1, hic => 1, hit => 1, hoc => 1 }
    word 'hut' NOT in dictionary
    word 'Hat' in dictionary
    word 'foo' NOT in dictionary
    word 'HIC' in dictionary

This approach works very simply and quickly for dictionary sizes up to a few tens of millions of words, but how many words does German have, anyway?
Update: Added a few more words to both word lists in example code.
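For anyone who would rather have this as a script than a one-liner, here is a minimal sketch of the same hash lookup that also does the read-and-clean-up step. The helper names build_dict and not_in_dict are made up for illustration, and the in-memory filehandle is only a stand-in for a real dictionary file with one word per line, which is an assumption about the file layout.

```perl
use strict;
use warnings;

# Normalize a word so lookups ignore case (same idea as the one-liner).
sub canonicalize { return lc $_[0]; }

# Build a lookup hash from a filehandle holding one word per line.
# (Hypothetical helper; the one-word-per-line layout is an assumption.)
sub build_dict {
    my ($fh) = @_;
    my %dict;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;      # trim surrounding whitespace and newline
        next unless length $line;     # skip blank lines
        $dict{ canonicalize($line) } = 1;
    }
    return \%dict;
}

# Return the words that are NOT in the dictionary (hypothetical helper).
sub not_in_dict {
    my ($dict, @words) = @_;
    return grep { !exists $dict->{ canonicalize($_) } } @words;
}

# Made-up data: an in-memory filehandle stands in for the real file.
my $data = "hit\nHet\n  HAT  \n\nHiC\nhAc\nhoc\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $dict = build_dict($fh);
close $fh;

my @missing = not_in_dict($dict, qw(hut Hat foo HIC));
print "not in dictionary: @missing\n";    # prints: not in dictionary: hut foo
```

Each exists test is a single hash lookup, so checking a word costs about the same no matter how large the dictionary grows; only building the hash scales with the size of the word list.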
Give a man a fish: <%-(-(-(-<
In reply to Re: Search element of array in another array by AnomalousMonk, in thread Search element of array in another array by better