Angel has asked for the wisdom of the Perl Monks concerning the following question:

I have a list of about 1000 group names, and I am running adistr on them as the docs for fuzzy string matching show. How do I get only the top twenty results returned, ordered from the closest match?

    sub get_fuzzy_match_groupnames( $ ) {
        my @result_array;
        my $self = shift;
        my $dbh = $self->{'dbh'};
        my @possibleMatchName; #holds the groupname

        #name not found do approximate string lookup
        my $sqlQuery = "SELECT GroupID, GroupName FROM GROUPINFORMATION WHERE UserID = 0 ";
        my $query = $dbh->prepare( $sqlQuery );
        $query->execute() || die $dbh->errstr;
        while( @result_array = $query->fetchrow_array ) {
            push( @possibleMatchName, $result_array[1] );
        }

        my %d;
        @d{@possibleMatchName} = map { abs } adistr( $_[0], @possibleMatchName );
        my @d = sort { $d{$a} <=> $d{$b} } @possibleMatchName;

        if( length( @d ) > 20 ) {
            return @d[0..20];
        } else {
            return @d;
        }
    }

Replies are listed 'Best First'.
Re: Getiing A Smaller Fuzzy
by poj (Abbot) on Dec 31, 2002 at 16:35 UTC
    I think this might be the problem:
    if( length( @d ) > 20 )
    Try this instead:
    if (scalar @d > 20 )

    poj
      Just for completeness, you can do without scalar
      if (@d > 20)
      will achieve the same thing.

      -- vek --
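
To make the point in the two replies above concrete: length() puts the array in scalar context and then measures the string length of the resulting count, so for about 1000 names it yields 4, never more than 20, and the posted sub always falls through to the else branch and returns everything. A minimal sketch, assuming a stand-in array of 1000 numbers in place of the real group names:

```perl
use strict;
use warnings;

# Stand-in for the sorted list of roughly 1000 group names (made-up data).
my @d = (1 .. 1000);

my $digits;
{
    # Warnings are switched off for this one line on purpose, since
    # newer perls complain when length() is applied to an array.
    no warnings;
    $digits = length(@d);    # string length of "1000", not the element count
}

my $count = @d;              # array in scalar context: the element count
my $same  = scalar @d;       # explicit form, same value

print "length(\@d) = $digits\n";
print "scalar \@d  = $count\n";
print "more than 20 names\n" if @d > 20;

# Separate detail: the slice 0..20 covers 21 indices, so a
# top-twenty slice would be 0..19.
my @slice = @d[0 .. 20];
print "0..20 selects ", scalar(@slice), " elements\n";
```

Either `scalar @d > 20` or plain `@d > 20` gives the intended comparison, because a numeric comparison already supplies scalar context.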
Re: Getiing A Smaller Fuzzy
by waswas-fng (Curate) on Dec 31, 2002 at 16:33 UTC
    Have you looked into the word distance, Soundex, and Text::Metaphone modules on CPAN? These may give you a better measuring stick for finding sound-alike and look-alike matches for your search -- then just put them into a hash and sort out the best 20 hits.

    -Waswas
Re: Getiing A Smaller Fuzzy
by Jaap (Curate) on Dec 31, 2002 at 15:09 UTC
    It seems to me like your code already returns the first 20 results. What is the problem?

      I think he wants the 20 closest results, which may or may not be the ones actually being returned by the current code.
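
One way to read "the 20 closest results" is: sort the names by distance and keep at most the first twenty. The sketch below assumes the distances are already in a hash keyed by name, as with %d in the original post. The helper name closest_matches and the sample distances are made up for illustration; in the real sub the values would come from map { abs } adistr(...).

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $limit names, closest first.
# $dist is a hash reference mapping each name to a non-negative distance.
sub closest_matches {
    my ($dist, $limit) = @_;

    # Sort by distance; break ties alphabetically so the order is stable.
    my @sorted = sort { $dist->{$a} <=> $dist->{$b} or $a cmp $b } keys %$dist;

    # Drop everything past the first $limit entries, if there are that many.
    splice(@sorted, $limit) if @sorted > $limit;

    return @sorted;
}

# Invented sample distances, standing in for abs(adistr(...)) values.
my %dist = (
    admins => 0.0,
    admin  => 0.1,
    users  => 0.8,
    guests => 0.9,
);

my @top = closest_matches(\%dist, 2);
print "@top\n";    # prints "admins admin"
```

Calling closest_matches(\%dist, 20) on a list shorter than twenty simply returns every name in sorted order, so no separate else branch is needed.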