This post references Perl and web services mainly as background.

The project I am responsible for has been extended to give a group of medical students a newsgroup-style discussion forum, not unlike PerlMonks, where they can explore and discuss topics they have recently been studying.

As part of this, the web service needs to provide some form of real-time chat (probably the only component of the project that'll end up using Java, at least client-side), allowing the students to ask questions of a panel of experts (university lecturers).

These experts can be identified in one of two ways: either by being on the teaching staff (trivial to detect) or, in the case of an external expert, by their responses to questions.

Detecting experts from their responses requires two things. Firstly, the subject of the OP and its replies must be known, which I feel would be fairly simple to determine, either by having users select a category or by some automated scripting. Secondly, and more challengingly, the responses need to be analysed in some way so that the system can tell whether the replying expert is actually answering the questions posed to them. Quality control is a fairly large (and currently unresolved) issue here.

I've been investigating a number of ways to provide a form of expert detection. The simplest appears to be a set of weighted regular expressions that differentiate between questions and answers (SpamAssassin style). This has flaws, however, and, as mentioned before, it takes no account of the quality of replies.
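To make the weighted-regex idea concrete, here is a minimal sketch in Python (used purely for illustration; the approach should port readily to Perl). Every pattern, weight, and the threshold below is a made-up assumption rather than a tuned rule set, and the names `RULES`, `THRESHOLD`, `score`, and `looks_like_answer` are hypothetical. Each rule that matches contributes its weight once, and the total decides whether a post looks more like an answer than a question.

```python
import re

# Hypothetical rule set: (compiled pattern, weight).
# Negative weights suggest a question, positive weights suggest an answer.
# These patterns and weights are illustrative assumptions, not tuned values.
RULES = [
    # A line that ends with a question mark.
    (re.compile(r"\?\s*$", re.MULTILINE), -2.0),
    # A line that opens with a common interrogative word.
    (re.compile(r"^\s*(how|why|what|when|where|who|does|can|is)\b",
                re.IGNORECASE | re.MULTILINE), -1.5),
    # Explanatory connectives, more typical of answers.
    (re.compile(r"\b(because|therefore|this means)\b", re.IGNORECASE), 1.5),
    # Advice phrasing, more typical of answers.
    (re.compile(r"\b(you (should|can|need to)|try)\b", re.IGNORECASE), 1.0),
]

# Assumed cut-off: totals at or above this are treated as answers.
THRESHOLD = 1.0


def score(text):
    """Sum the weights of all rules that match anywhere in the text."""
    return sum(weight for pattern, weight in RULES if pattern.search(text))


def looks_like_answer(text):
    """Return True when the weighted score reaches the assumed threshold."""
    return score(text) >= THRESHOLD


if __name__ == "__main__":
    question = "Why does the heart rate rise during exercise?"
    answer = ("It rises because the muscles need more oxygen. "
              "You should review the cardiac output notes.")
    for post in (question, answer):
        print(score(post), looks_like_answer(post))
```

As with the regex approach itself, this only separates question-like text from answer-like text. It says nothing about whether an answer is correct, so the quality control problem remains open.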

Does anyone have any experience with this form of text analysis? Any suggestions on methods for carrying this out would be appreciated, as would any thoughts on the quality control issue.

Thanks in advance for your help.


In reply to Expert Detection by Tanalis
