Text::ParagraphDiff is a new module for diffing paragraphed/plain text. (This makes it different from GNU diff, which finds differences in record-oriented data, normally one record per line.) Text::ParagraphDiff finds the differences between two documents word by word, puts those differences back into the context of the original document, and produces pleasant XHTML output.

How does it work?

Text::ParagraphDiff works like a normal diff, except that it compares words instead of lines. First, each input (old and new) is expanded so that each word is its own "record". Next, Algorithm::Diff is used to find the differences between the two expanded record sets. Finally, the original text and the differences are merged back together and formatted as XHTML.
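
To make the three steps concrete, here is a rough sketch in Python. It is not the module's code or its API: Python's standard difflib stands in for Algorithm::Diff, the names word_diff and to_markup are made up for illustration, and the "minus"/"plus" class names are only an assumption about how the markup might be labeled. It also flattens whitespace, so unlike the real module it does not preserve paragraph boundaries.

```python
import difflib
import html


def word_diff(old, new):
    """Return (tag, word) pairs, where tag is ' ' (kept), '-' (removed) or '+' (added)."""
    # Step 1: expand each input so that every word is its own "record".
    old_words = old.split()
    new_words = new.split()

    # Step 2: diff the two record sets (difflib here, not Algorithm::Diff).
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    pairs = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            pairs.extend((" ", w) for w in old_words[i1:i2])
        else:
            # 'replace', 'delete' and 'insert' all reduce to removals plus additions.
            pairs.extend(("-", w) for w in old_words[i1:i2])
            pairs.extend(("+", w) for w in new_words[j1:j2])
    return pairs


def to_markup(pairs):
    """Step 3: merge kept and changed words back into one stream of markup."""
    wrap = {
        " ": "{}",
        "-": '<span class="minus">{}</span>',  # hypothetical class name
        "+": '<span class="plus">{}</span>',   # hypothetical class name
    }
    return " ".join(wrap[tag].format(html.escape(word)) for tag, word in pairs)


if __name__ == "__main__":
    old_text = "the quick brown fox"
    new_text = "the slow brown fox jumps"
    print(to_markup(word_diff(old_text, new_text)))
```

Because unchanged words are emitted in their original order alongside the removed and added ones, the result reads as the whole document with the changes marked in place, rather than as a list of isolated hunks.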

What is it good for?

Text::ParagraphDiff is ideal for online author-publisher relationships. For instance, you might write an article for some e-zine. After the first revision, you send it to the e-zine's editor to be "edited" before publication. A few days later, you realize that a few sentences aren't quite clear enough and should be fixed. By this time, however, the editor has been busy making changes all over the place. Instead of angering the editor by sending in a whole new version of the article, you could send a nice diff showing exactly where your changes were made, and thus make everyone happy :)

The output of a Text::ParagraphDiff diff also comes beefed up with a few extra features. Two JavaScript-powered buttons are created that toggle the display of the differences. Click "toggle minus" to view the new document by hiding the red bits, or "toggle plus" to view the old document by hiding the green bits. Finally, the output is styled with a clean CSS layout. If you don't want any of this, that's OK too. The output is completely customizable; see the docs for more details.

Where can I get it?

You can always find the latest version on CPAN.
For a feel of what Text::ParagraphDiff can do, take a look at this preview (and the source text files, old.txt and new.txt).


In reply to Text::ParagraphDiff by jryan
