Hi All

I have a part of an application where, given (partial) user input, a best guess should be made at what the user is searching for.

In this example it's in terms of locations: the user wants to look for a given subset of locations in, say, Paris, given that they have entered Paris and the optional search terms 'Place De La Gare' and 'Rennes'. From the Paris part the application can simply find all Paris entries; let's say that gives the subset of data shown in the script below.

A simple script to *guess* from the subset is as follows

#!/usr/bin/perl -w
use strict;
use Data::Dumper;

my @data = (
    'Place De La Gare - Angers',
    'Place De La Gare - Nevers',
    'Place Mohammed V - Oujda',
    'Place De La Gare - Rennes',
    'Place de la Gare - Quimper',
    'Place Thiers - Nancy',
    'Place De La Gare - Grenoble',
    'Place Du Chateau - Galerie Marchande Du Rer',
    'Place De La Gare - Angers',
    'Place De La Gare 1 - Bannes Grenoble',
    'Place De La Gare - Nevers',
    'Place De La Gare - Rennes',
    'Place De La Gare bannes',
    'Place de la Gare',
    'Place de la Gare - Bergerac',
    'Place de la Gare - Moutiers',
    'Place de la Gare - Libourne'
);

my @guesses = ('Place de la Gare', 'Rennes');

my @list = @data;
foreach my $guess (@guesses) {
    # \Q...\E makes any regex metacharacters in the user's input match literally
    my @guessed = grep { /\Q$guess\E/i } @list;
    # Now guess in reduced list
    @list = @guessed;
}

print "Guess this is what you wanted ?\n", join("\n", @list), "\n";

Is there a better way, given that this is a reduced subset of the real data (there could be hundreds of possible choices) and that multiple words may be used to narrow the guessing?
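To make the question concrete, here is one possible direction, offered as a minimal sketch rather than a definitive answer: compile each term once and test every entry against all of them in a single pass. The helper name match_all_terms is made up for illustration, and the sketch assumes List::Util 1.33 or later for all().

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(all);

# Hypothetical helper: keep only the entries that contain every search term.
sub match_all_terms {
    my ($terms, $entries) = @_;

    # Compile each term once; \Q...\E treats the user's input as literal text.
    my @patterns = map { qr/\Q$_\E/i } @$terms;

    return grep {
        my $entry = $_;
        all { $entry =~ $_ } @patterns;
    } @$entries;
}

my @data = (
    'Place De La Gare - Angers',
    'Place De La Gare - Rennes',
    'Place Thiers - Nancy',
    'Place de la Gare - Quimper',
);

my @matches = match_all_terms(['Place de la Gare', 'Rennes'], \@data);
print "$_\n" for @matches;
```

This gives the same result as narrowing the list term by term, but each term is compiled only once and there are no intermediate lists.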

The next step would be to start weighting the guesses. As it stands, the script removes anything that does not match every term. Are there any *clean* ways of doing the weighting?
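To show the kind of weighting I mean, here is a rough sketch under my own assumptions: each entry scores one point per term it contains, entries that match nothing are dropped, and the rest come back best-first. The helper name rank_by_terms and the one-point-per-term scheme are only illustrative, not a settled design.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: score each entry by how many terms it contains,
# drop entries that match no term, and return the rest best-first.
sub rank_by_terms {
    my ($terms, $entries) = @_;
    my @patterns = map { qr/\Q$_\E/i } @$terms;

    my @scored;
    for my $i (0 .. $#$entries) {
        # grep in scalar context counts the patterns that matched
        my $score = grep { $entries->[$i] =~ $_ } @patterns;
        push @scored, [ $score, $i ] if $score > 0;
    }

    # Highest score first; ties keep their original order.
    return map  { +{ entry => $entries->[ $_->[1] ], score => $_->[0] } }
           sort { $b->[0] <=> $a->[0] or $a->[1] <=> $b->[1] } @scored;
}

my @data = (
    'Place Thiers - Nancy',
    'Place De La Gare - Angers',
    'Place De La Gare - Rennes',
    'Place de la Gare',
);

for my $hit (rank_by_terms(['Place de la Gare', 'Rennes'], \@data)) {
    printf "%d  %s\n", $hit->{score}, $hit->{entry};
}
```

With this approach a partial match is ranked lower instead of being thrown away, and different terms could be given different weights by replacing the simple count with a weighted sum.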


In reply to Guessing/Ordering Partial Data by ropey
