Eventually, this code will be released with full documentation. One of the first things the documentation will make clear is that matching text does not necessarily mean plagiarism. Instead, the person reviewing the text will have to compare the two documents (with the HTML linking I hope to provide) and determine for themselves whether plagiarism took place. My software will not be able to tell whether someone gave proper credit for a particular passage.

If the above were the only matching sentence in a 10,000-word document, I wouldn't call it plagiarism. If that and several other sentences grouped together in one paragraph have a decent match and there's no attribution, then that merits further study. Deciding whether plagiarism has occurred is not something software can do. It can merely flag likely candidates, and it will always produce both false positives and false negatives.
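To make the "several sentences grouped together in one paragraph" idea concrete, here is a minimal sketch in Python. It is not the code I'm planning to release. The function names, the 0.8 similarity threshold, and the three-sentence minimum are all assumptions picked for illustration, and the sentence splitter is deliberately naive.

```python
import re
from difflib import SequenceMatcher


def sentences(text):
    """Naively split text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def normalize(sentence):
    """Lowercase and drop punctuation so trivial edits don't hide a match."""
    return " ".join(re.findall(r"[a-z0-9']+", sentence.lower()))


def flag_paragraphs(suspect, source, sentence_threshold=0.8, min_matches=3):
    """Return (paragraph_index, match_count) for paragraphs worth a human look.

    A paragraph is flagged only when at least `min_matches` of its sentences
    closely resemble some sentence in `source`. Both thresholds are
    hypothetical defaults, not tuned values.
    """
    source_sentences = [normalize(s) for s in sentences(source)]
    flagged = []
    for index, paragraph in enumerate(re.split(r"\n\s*\n", suspect)):
        matches = 0
        for sentence in sentences(paragraph):
            norm = normalize(sentence)
            if not norm:
                continue  # nothing left after stripping punctuation
            if any(
                SequenceMatcher(None, norm, src).ratio() >= sentence_threshold
                for src in source_sentences
            ):
                matches += 1
        if matches >= min_matches:
            flagged.append((index, matches))
    return flagged
```

A single matching sentence never trips the flag here. It takes a cluster of them in the same paragraph, and even then the output is only a list of candidates for a person to review, since nothing in it knows about attribution.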

And I'm aware that professors already have software to do this. The free software I've seen is very limited. (One merely does a "longest substring" match.) I'd like to provide free tools for them.

Cheers,
Ovid

New address of my CGI Course.


In reply to Re^2: Brainstorming session: detecting plagiarism by Ovid
in thread Brainstorming session: detecting plagiarism by Ovid
