This is the base class. It provides some hopefully useful default behaviour and defines the interface that every Fuzzy::Matcher implementation must follow.

package Fuzzy::Matcher;
use strict;
use warnings;
use Carp qw(croak confess);
use vars qw/$VERSION/;
$VERSION = 0.01;

# This is a base class for fuzzy matchers to inherit.
# It's where stuff that will be common to all matchers
# is located. It also defines the interface that all
# matchers will have to follow.

# Constructor CLASS->new($fuzz,$strlen,$words);
#
# Takes the amount of fuzz to use for matching
# and the length of the strings to be matched.
#
# Should not be overridden.
#
sub new {
    my $class  = shift;
    my $fuzz   = shift;
    my $strlen = shift;
    my $self   = bless {
        fuzz   => $fuzz || 0,
        strlen => $strlen,
    }, $class;
    croak "Failed build!" unless $self;
    $self->_init(@_);
    return $self;
}

#
# $obj->_init()
#
# This is a hook for subclasses to override without
# having to override the default object creation
# process. It is called in void context before the
# object is returned to the user, with any args
# remaining after the default ($fuzz,$strlen).
#
# By default it is a no-op.
#
sub _init { }

#
# $obj->fuzz_store($string)
#
# Store a string into the object for fuzzy matching
# later.
#
# Default behaviour is to build a hash of stored strings
# for dupe checking and a corresponding array of strings.
# The array is named str_array and the hash is named
# str_hash.
#
sub fuzz_store {
    my ($self, $str) = @_;
    push @{ $self->{str_array} }, $str
        unless $self->{str_hash}{$str}++;
}

#
# $obj->prepare($string)
#
# If necessary a subclass may define this sub so
# that any actions that need to occur after
# adding the words but before the search starts
# can be performed.
#
# By default it deletes the str_hash entry from the object to
# preserve memory.
#
sub prepare {
    my ($self, $str) = @_;
    delete $self->{str_hash};
}

#
# $obj->fuzz_search($string)
#
# Search a string for results and return
# a reference to a list of matches. The list will be
# of triples so that the first match returns:
# ($match_ofs,$chars_diff,$string_matched)=@$ret;
#
# Must be overridden.
#
sub fuzz_search {
    confess((caller(0))[3], "() method must be overridden in " . ref($_[0]));
}

1;
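To make the interface concrete, here is a minimal sketch of what a subclass might look like. Everything in it is an assumption for illustration only: the name Fuzzy::Matcher::Naive, the brute-force character-by-character comparison, and the choice to treat the third constructor argument as an array ref of words are not taken from the thread. The condensed copy of the base class is there only so the snippet runs on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Condensed stand-in for the base class above, included only so this
# sketch is self-contained. With the real module available, load that
# instead and drop this package.
package Fuzzy::Matcher;
sub new {
    my ($class, $fuzz, $strlen, @rest) = @_;
    my $self = bless { fuzz => $fuzz || 0, strlen => $strlen }, $class;
    $self->_init(@rest);
    return $self;
}
sub _init { }
sub fuzz_store {
    my ($self, $str) = @_;
    push @{ $self->{str_array} }, $str unless $self->{str_hash}{$str}++;
}
sub prepare { delete $_[0]{str_hash} }

# Hypothetical subclass: brute-force matcher that counts differing
# characters at every offset (no insertions or deletions).
package Fuzzy::Matcher::Naive;
our @ISA = ('Fuzzy::Matcher');

# _init receives whatever followed ($fuzz, $strlen) in the call to new().
# Assumption: that is an array ref of words to store.
sub _init {
    my ($self, $words) = @_;
    $self->fuzz_store($_) for @{ $words || [] };
}

# Returns a reference to a flat list of triples:
# ($match_ofs, $chars_diff, $string_matched, ...)
sub fuzz_search {
    my ($self, $text) = @_;
    my ($fuzz, $len) = @{$self}{qw(fuzz strlen)};
    my @ret;
    for my $ofs (0 .. length($text) - $len) {
        my $window = substr($text, $ofs, $len);
        for my $word (@{ $self->{str_array} || [] }) {
            next unless length($word) == $len;
            # grep in scalar context counts the positions that differ
            my $diff = grep {
                substr($window, $_, 1) ne substr($word, $_, 1)
            } 0 .. $len - 1;
            push @ret, $ofs, $diff, $word if $diff <= $fuzz;
        }
    }
    return \@ret;
}

package main;

# fuzz of 1, strings of length 4; the duplicate "perl" is stored once
my $m = Fuzzy::Matcher::Naive->new(1, 4, [qw(perl monk perl)]);
$m->prepare;    # frees str_hash before searching

my $ret = $m->fuzz_search("perk up, munk");
for (my $i = 0; $i < @$ret; $i += 3) {
    my ($ofs, $diff, $word) = @{$ret}[$i .. $i + 2];
    print "$word at offset $ofs, $diff char(s) differ\n";
}
```

The subclass only fills in _init and fuzz_search; construction, dupe checking and the str_hash cleanup all come from the base class, which is the point of keeping new() out of the subclasses' hands.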
---
demerphq


In reply to Re^2: Algorithm Showdown: Fuzzy Matching (Matcher.pm) by demerphq
in thread Algorithm Showdown: Fuzzy Matching by demerphq
