 
PerlMonks  


Most of my work involves interface-building, and I've found that once the logic and presentation are reasonably together, the biggest factor in the success or failure of the system is the tone and clarity with which it addresses the user.

So I've become obsessed with writing UI scripts that use proper colloquial English. People react so much better to a page that tells them what's going on in a conversational way that I almost don't mind bloating the scripts with sentence-construction code and obfuscating the templates with grammatical conditionals.

Here's a very simple example, dug out of the middle of a script I'm updating at the moment. The end result is a sentence in the form:

We found 27 campaign updates and case studies relevant to children and young people, death penalty and the Americas.

except sometimes it's only one type of document, or no restriction at all, or only one keyword, only two results, and so on. There are dozens of permutations, and the proper way of describing the situation is different each time in small but vital ways. This excerpt is as close as I've come without spelling everything out:

    use strict;
    use warnings;

    my @records = qw(1 2 3 4 5 6);
    @input::id   = qw(id1 id2 id3 id4);
    @input::type = qw(document person);

    print summarise(\@input::id, \@input::type, scalar(@records)), "\n";
    exit;

    sub summarise {
        my ($ids, $types, $matches) = @_;

        my $sentence = 'We found ';
        $sentence .= $matches || 'no';

        if (@$types) {
            foreach my $i (0 .. $#$types) {
                # separator before each type: space, comma, or conjunction
                if ($i && $i == $#$types) {
                    $sentence .= ($matches > 1) ? ' and ' : ' or ';
                } elsif ($i) {
                    $sentence .= ', ';
                } else {
                    $sentence .= ' ';
                }
                # document types are in the database
                # with singular and plural forms of their title
                # but I've skipped that part here
                $sentence .= qq|<a href="link">$$types[$i]</a>|;
            }
        } else {
            # leading space so the count and the noun don't run together
            $sentence .= ' item';
            # "no items", "1 item", "2 items"
            $sentence .= 's' if ($matches != 1);
        }

        $sentence .= ' relevant to ';
        $sentence .= 'both '   if (@$ids == 2);
        $sentence .= 'all of ' if (@$ids > 2);

        foreach my $i (0 .. $#$ids) {
            if ($i && $i == $#$ids) {
                $sentence .= ' and ';
            } elsif ($i) {
                $sentence .= ', ';
            }
            # keyword titles also looked up from the database really.
            $sentence .= qq|<a href="link">$$ids[$i]</a>|;
        }

        return $sentence;
    }
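Both loops in that excerpt do the same job: join a list with commas and a final conjunction. A minimal sketch of how that part could be factored out, assuming a hypothetical helper called join_list (my own name for illustration, not from any module):

```perl
use strict;
use warnings;

# Hypothetical helper: join a list as "a", "a and b" or "a, b and c".
# The conjunction defaults to 'and'; pass 'or' for the single-match case.
sub join_list {
    my ($items, $conj) = @_;
    $conj = 'and' unless defined $conj;

    my @items = @$items;               # copy, so the caller's array is untouched
    return ''        if !@items;
    return $items[0] if @items == 1;

    my $last = pop @items;
    return join(', ', @items) . " $conj $last";
}

print join_list([qw(id1 id2 id3 id4)]), "\n";        # id1, id2, id3 and id4
print join_list([qw(document person)], 'or'), "\n";  # document or person
```

With something like this, the links could be built first with map and then handed to the helper, so the comma-and-conjunction logic lives in one place.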

If anyone is interested enough to make this more elegant - or just play golf with it - I'd be much obliged.

But my main question: is there a module or project that'll do some of this work for me? CPAN yields a lot of stemming and other mechanisms designed to make words more friendly to computers, but not much designed to make them more friendly to people.

If there isn't any such module, I'd like to start building one. I imagine something extensible and rule-based, with a relatively small number of abstract construction mechanisms for common sentence forms, and a vocabulary of prepositions, articles and so on. Ideally it would be swappable into languages other than English, one day. Any views about feasibility or functionality?
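To make the idea a bit more concrete, here is a rough sketch of what a per-language rule table might look like. Everything in it (%RULES, count_phrase, quantifier, the 'en' key) is a made-up name for illustration, not an existing CPAN interface, and the English rules are deliberately naive:

```perl
use strict;
use warnings;

# Hypothetical per-language rule table; every name here is illustrative.
my %RULES = (
    en => {
        none       => 'no',
        irregular  => { person => 'people' },
        plural     => sub { $_[0] . 's' },    # naive default rule
        quantifier => sub { $_[0] == 2 ? 'both ' : $_[0] > 2 ? 'all of ' : '' },
    },
);

# Count plus noun: "no items", "1 item", "6 items", "2 people"
sub count_phrase {
    my ($n, $noun, $lang) = @_;
    my $r = $RULES{ $lang || 'en' } or die "no rules for that language\n";

    my $word = $n == 1                         ? $noun
             : exists $r->{irregular}{$noun}   ? $r->{irregular}{$noun}
             :                                   $r->{plural}->($noun);

    return ($n || $r->{none}) . " $word";
}

# "both " for two things, "all of " for more, nothing otherwise
sub quantifier {
    my ($n, $lang) = @_;
    return $RULES{ $lang || 'en' }{quantifier}->($n);
}

print count_phrase(0, 'item'),   "\n";   # no items
print count_phrase(1, 'item'),   "\n";   # 1 item
print count_phrase(2, 'person'), "\n";   # 2 people
print quantifier(2), "keywords\n";       # both keywords
```

The point of keeping the rules in a hash is that another language would, in principle, only need its own entry alongside 'en'; whether real grammars fit into so small a structure is exactly the feasibility question.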

Thanks


In reply to natural language sentence construction by thpfft
