"... and so on"

Since the specification is fuzzy, let's make a fuzzy matching regex :)
Then match it against the whole corpus as a single string, instead of doing 500,000 individual matches.

    #!/usr/bin/perl

    # https://perlmonks.org/?node_id=1228728

    use strict;
    use warnings;

    # corpus is now a string instead of an array FIXME for real filename
    my $corpus = do { local (@ARGV, $/) = '/usr/share/dict/words'; <> };

    # fake random input strings FIXME for real strings in @tomatch
    my @tomatch = map { join '', map { ('a'..'z')[rand 26] } 1 .. 4 } 1 .. 1e2;

    for my $string (@tomatch)
      {
      my @patterns; # match <2 changes
      push @patterns, "$`.?$'" while $string =~ /\S/g; # changed or dropped char
      push @patterns, "$`.$'" while $string =~ /|/g; # added char
      $string =~ /^(.+)es$/ && push @patterns, $1; # singular
      my $fuzzyregex = do { local $" = '|'; qr/^(@patterns)$/m };
      $corpus =~ $fuzzyregex && printf "%35s : %s\n", $string, $1; # FIXME output
      }
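
To see what the generated regex looks like, here is a minimal sketch of the same pattern-building step for one fixed string. The helper name fuzzy_patterns, the use of substr and quotemeta in place of $` and $', and the three-word stand-in corpus are all my assumptions for illustration, not part of the code above. It also assumes the input has no whitespace, since it treats every position the same way (the original skips whitespace via /\S/).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): return the list of
# alternatives for one input string, built with substr instead of $` and $'.
sub fuzzy_patterns {
    my ($string) = @_;
    my @patterns;
    my $len = length $string;

    # One character changed or dropped: put '.?' in place of position $i.
    for my $i (0 .. $len - 1) {
        push @patterns,
            quotemeta(substr($string, 0, $i)) . '.?' . quotemeta(substr($string, $i + 1));
    }

    # One character added: put '.' into each gap, including both ends.
    for my $i (0 .. $len) {
        push @patterns,
            quotemeta(substr($string, 0, $i)) . '.' . quotemeta(substr($string, $i));
    }

    # Naive singular: the string without a trailing "es".
    push @patterns, quotemeta($1) if $string =~ /^(.+)es$/;

    return @patterns;
}

my @patterns = fuzzy_patterns('cat');
print join('|', @patterns), "\n";
# prints: .?at|c.?t|ca.?|.cat|c.at|ca.t|cat.

# Anchor the alternation per line with /m, as in the original.
my $alternation = join '|', @patterns;
my $fuzzyregex  = qr/^($alternation)$/m;

my $corpus = "bat\ncart\ndog\n";    # assumed stand-in for the real word list
my @hits   = $corpus =~ /$fuzzyregex/g;
print "@hits\n";
# prints: bat cart
```

The quotemeta calls only matter once the real strings can contain regex metacharacters. The random a..z strings above never do, so the original gets away without them.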

Besides, I couldn't pass up an opportunity to write Perl to write a regex :)


In reply to Re: Improving speed match arrays with fuzzy logic by tybalt89
in thread Improving speed match arrays with fuzzy logic by Takamoto
