I expect you'll be interested in these: UMLS::Similarity, UMLS::Interface, and a general search of the CPAN in the UMLS::* space. The UMLS modules are not trivial to set up, but the instructions are good. Unfortunately, there are no free medical stemming dictionaries that I'm aware of, and one would really help with the kind of comparison task you're doing. If you know or learn of any, please do share. There are other approximate matching packages. Some, like the metaphone stuff, are probably too fuzzy for you, and others, like String::Approx, are probably too slow.

You might want to index all your data first and use something like Lucy, so that stemming, tokenizing, and the various normalizations are done once at index time instead of once per search/match. Failing that, you could possibly resort to specialized data containers like Judy.
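
To make the "index once, match many times" idea concrete, here is a minimal sketch in Python. It is not Lucy's API, and every name in it (toy_stem, normalize, build_index, search) is made up for illustration. The stemmer is a deliberately crude stand-in that only strips a trailing "s"; a real setup would use a proper stemmer, and ideally a medical one if such a dictionary turns up.

```python
import re
from collections import defaultdict


def toy_stem(token):
    """Hypothetical stand-in for a real stemmer: strip one trailing 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def normalize(text):
    """Lowercase, split on non-alphanumerics, and stem each token."""
    return [toy_stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]


def build_index(records):
    """Normalize every record once; map each token to the record ids holding it."""
    index = defaultdict(set)
    for rec_id, text in enumerate(records):
        for token in normalize(text):
            index[token].add(rec_id)
    return index


def search(index, query):
    """Rank record ids by how many distinct query tokens they share."""
    hits = defaultdict(int)
    for token in set(normalize(query)):
        for rec_id in index.get(token, ()):
            hits[rec_id] += 1
    # Most shared tokens first; ties broken by record id.
    return sorted(hits, key=lambda rec_id: (-hits[rec_id], rec_id))


# Made-up sample data, only to exercise the functions.
records = [
    "Chronic kidney diseases",
    "Acute kidney injury",
    "Disease of the liver",
]
index = build_index(records)
print(search(index, "kidney disease"))
```

The point of the design is that the expensive normalization of the data happens only in build_index. Each later query normalizes just its own few tokens and does set lookups, rather than re-comparing against every record.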


In reply to Re: Improving speed match arrays with fuzzy logic by Your Mother
in thread Improving speed match arrays with fuzzy logic by Takamoto
