Once your list of dictionary words has been read from its file and cleaned up (stray whitespace, newlines, etc. removed), the next step is to realize that a hash gives you very fast lookup of this kind:

c:\@Work\Perl>perl -wMstrict -le
"use Data::Dump;
 ;;
 my @cleanwords = qw(hut Hat foo HIC);
 my @allwords = qw(hit Het HAT HiC hAc hoc);
 ;;
 my %dict = map { $_ => 1 } map canonicalize($_), @allwords;
 dd \%dict;
 ;;
 for my $word (@cleanwords) {
     my $common = canonicalize($word);
     printf qq{word '$word' %sin dictionary \n},
         exists $dict{$common} ? '' : 'NOT ';
     }
 ;;
 sub canonicalize { return lc $_[0]; }
"
{ hac => 1, hat => 1, het => 1, hic => 1, hit => 1, hoc => 1 }
word 'hut' NOT in dictionary
word 'Hat' in dictionary
word 'foo' NOT in dictionary
word 'HIC' in dictionary

This approach is simple and works quickly for dictionaries of up to a few tens of millions of words, but how many words does German have, anyway?
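
For anyone who would rather have this as a script than a one-liner, here is a minimal sketch of the same hash lookup. It is only an illustration under some assumptions: build_dict() and in_dict() are names I made up for this example, and the in-memory filehandle stands in for your real dictionary file, whose format I am guessing at (one word per line, possibly with stray whitespace).

```perl
use strict;
use warnings;

# Fold a word to the form used as the hash key (same idea as in the one-liner).
sub canonicalize {
    my ($word) = @_;
    return lc $word;
}

# Hypothetical helper: build a lookup hash from a list of raw words.
sub build_dict {
    my %dict;
    $dict{ canonicalize($_) } = 1 for @_;
    return \%dict;
}

# Hypothetical helper: true if the word, once canonicalized, is a key in the hash.
sub in_dict {
    my ($dict, $word) = @_;
    return exists $dict->{ canonicalize($word) } ? 1 : 0;
}

# Assumption: one word per line. An in-memory filehandle stands in for the
# real dictionary file so that the example is self-contained.
my $raw = "hit\n  Het \nHAT\n\nHiC\n";
open my $fh, '<', \$raw or die "Cannot open in-memory file: $!";

my @allwords;
while (my $line = <$fh>) {
    $line =~ s/^\s+|\s+$//g;       # trim leading and trailing whitespace
    next unless length $line;      # skip blank lines
    push @allwords, $line;
}
close $fh;

my $dict = build_dict(@allwords);

for my $word (qw(hut Hat foo HIC)) {
    printf "word '%s' %sin dictionary\n",
        $word, in_dict($dict, $word) ? '' : 'NOT ';
}
```

The hash is built once, and each lookup after that is a single exists test, so the cost per word does not depend on how big the dictionary is. If plain lc turns out not to be the right folding for your data, canonicalize() is the only place that has to change.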

Update: Added a few more words to both word lists in example code.


Give a man a fish:  <%-(-(-(-<


In reply to Re: Search element of array in another array by AnomalousMonk
in thread Search element of array in another array by better
