I'm trying out a fuzzy pattern-search algorithm that Tachyon originally posted in reply to a bioinformatics question. It currently outputs the line number, the number of misses and the match position, and I've got it to print the sentence in which the occurrence appears, but I'd also like to get the matched word and would be grateful for some help. I'm experimenting with applying some bioinformatics techniques to text analysis (following a conversation with an acquaintance) and will be looking at stop words, inflections and corpora in due course.
use strict;
use warnings;

my $word     = "scrooge";
my @find     = map ([split //], $word);   # each pattern as an array of characters
my $find_len = length($word);
my $fuzzy    = 2;                         # maximum number of mismatches allowed

while (my $search = <DATA>) {
    chomp $search;
    $search = [split //, $search];        # current line as an array of characters
    for my $i ( 0..@$search-$find_len ) { # slide a window over the line
        FIND: for my $find ( @find ) {
            my $misses = 0;
            for my $j ( 0..$find_len-1 ) {
                $misses++ if $search->[$i+$j] ne $find->[$j];
                next FIND if $misses > $fuzzy;
            }
            print "Line $. Match ($misses) at $i, @$search\n";
        }
    }
}
__DATA__
STAVE I: MARLEY'S GHOST
MARLEY was dead: to begin with. There is no doubt whatever about that. The register of his burial was signed by the clergyman, the clerk, the undertaker, and the chief mourner. Scrouge signed it: and Scrooge's name was good upon 'Change, for anything he chose to put his hand to. Old Marley was as dead as a door-nail.
Mind! I don't mean to say that I know, of my own knowledge, what there is particularly dead about a door-nail. I might have been inclined, myself, to regard a coffin-nail as the deadest piece of ironmongery in the trade. But the wisdom of our ancestors is in the simile; and my unhallowed hands shall not disturb it, or the Country's done for. You will therefore permit me to repeat, emphatically, that Marley was as dead as a door-nail
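One possible way to get at the word, offered as a rough sketch rather than a definitive answer: since the script already knows the character offset of each match, the word could be recovered by walking outwards from that offset until whitespace is hit. The helper name `word_at` and the variable `$line` are hypothetical and not part of the original script, and the sketch assumes a word is simply a run of non-whitespace characters, so punctuation such as `'s` stays attached.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whitespace-delimited word covering
# character offset $pos in $line, or '' if $pos is out of range or
# sits on whitespace.
sub word_at {
    my ($line, $pos) = @_;
    return '' if $pos < 0 || $pos >= length $line;
    return '' if substr($line, $pos, 1) =~ /\s/;

    # Walk left to the first character of the word.
    my $start = $pos;
    $start-- while $start > 0 && substr($line, $start - 1, 1) !~ /\s/;

    # Walk right to the last character of the word.
    my $end = $pos;
    $end++ while $end < length($line) - 1 && substr($line, $end + 1, 1) !~ /\s/;

    return substr($line, $start, $end - $start + 1);
}

my $line = "Scrouge signed it: and Scrooge's name was good";
print word_at($line, 0),  "\n";   # prints "Scrouge"
print word_at($line, 23), "\n";   # prints "Scrooge's"
```

In the loop above this would mean keeping a copy of the line as a string before it is split into characters, then printing `word_at($line, $i)` alongside the miss count. A fuzzy match whose window straddles a space would only report the word at its starting offset, so that case may need separate handling.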

In reply to Trying to find a word in a fuzzy search algorithm by Quicksilver
