in reply to Search element of array in another array

Once you have read your list of dictionary words from its file and cleaned it up (whitespace, newlines, etc.), the next step is to realize that a hash gives you very fast lookup of this sort:

c:\@Work\Perl>perl -wMstrict -le
"use Data::Dump;
 ;;
 my @cleanwords = qw(hut Hat foo HIC);
 my @allwords = qw(hit Het HAT HiC hAc hoc);
 ;;
 my %dict = map { $_ => 1 } map canonicalize($_), @allwords ;
 dd \%dict;
 ;;
 for my $word (@cleanwords) {
   my $common = canonicalize($word);
   printf qq{word '$word' %sin dictionary \n},
     exists $dict{$common} ? '' : 'NOT ';
   }
 ;;
 sub canonicalize { return lc $_[0]; }
 "
{ hac => 1, hat => 1, het => 1, hic => 1, hit => 1, hoc => 1 }
word 'hut' NOT in dictionary
word 'Hat' in dictionary
word 'foo' NOT in dictionary
word 'HIC' in dictionary

This approach works simply and quickly for dictionaries of up to a few tens of millions of words. But how many words does German have, anyway?
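
If the dictionary lives in a file, the same idea might look roughly like the script below. This is only a sketch under some assumptions: one word per line in the file, an in-memory filehandle standing in for the real file so the example runs as-is, and a load_dictionary() helper that is a made-up name for illustration. canonicalize() is extended here to trim surrounding whitespace as well as lowercase.

```perl
use strict;
use warnings;

# Normalize a word so lookups ignore case and surrounding whitespace
# (the whitespace trimming is an addition for this sketch).
sub canonicalize {
    my ($word) = @_;
    $word =~ s/^\s+|\s+$//g;
    return lc $word;
}

# Hypothetical helper: build the lookup hash from an open filehandle,
# assuming one word per line.
sub load_dictionary {
    my ($fh) = @_;
    my %dict;
    while (my $line = <$fh>) {
        my $word = canonicalize($line);
        next unless length $word;    # skip blank lines
        $dict{$word} = 1;
    }
    return \%dict;
}

# An in-memory filehandle stands in for the real dictionary file.
my $raw = "hit\nHet \n\n  HAT\nHiC\nhAc\nhoc\n";
open my $fh, '<', \$raw or die "Cannot open in-memory file: $!";
my $dict = load_dictionary($fh);
close $fh;

# Each check is a single hash lookup, however large the dictionary is.
for my $word (qw(hut Hat foo HIC)) {
    printf "word '%s' %sin dictionary\n",
        $word, exists $dict->{ canonicalize($word) } ? '' : 'NOT ';
}
```

To use a real file, replace the in-memory open with an open on the file's path. The hash is built once, so the cost of reading the file is paid only at startup.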

Update: Added a few more words to both word lists in example code.


Give a man a fish:  <%-(-(-(-<