in reply to Rewrapping Net::NNTP output

Text::Autoformat works great for this very purpose. I use it to rewrap heavily quoted e-mail messages, and it can be customized extensively for your specific application.
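
For reference, a minimal sketch of what such a call might look like. The sample message and the 60-column margin are made up for illustration; `all`, `left`, and `right` are options of the module's `autoformat` function.

```perl
use strict;
use warnings;
use Text::Autoformat qw(autoformat);

# Made-up sample: one over-long quoted line followed by a second quoted line.
my $message = <<'END';
> I use it to rewrap e-mail messages that are heavily quoted, and it copes well with lines that have grown far too long.
> It can be customized for a specific application.
END

# all => 1 asks autoformat to process every paragraph, not only the first;
# left and right set the margins (columns 1 to 60 here, an arbitrary choice).
my $wrapped = autoformat($message, { all => 1, left => 1, right => 60 });
print $wrapped;
```

The quote markers should be carried over to the rewrapped lines, which is what makes the module a good fit for quoted mail and news posts.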

Re: Rewrapping Net::NNTP output
by hacker (Priest) on Feb 25, 2005 at 04:05 UTC
    I was aaaaaaalmost sold, until I tried it on a Usenet post that was already wrapped incorrectly. Observe this snippet from comp.text.pdf:
    > 3. Scripting Languages, such as Python or Perl
    > ===============================================
    >
    > ReportLab Toolkit (Python) and PDF::API2 (Perl).
    >
    > 4. Lower-Level Programming languages such as C
    > ===============================================
    >
    > Look at PDFlib lite (simple version of the commercial one, not for commercial
    > use!) and ClibPDF. You will need a C compiler and some experienced C programmers
    > though.

    This will be reflowed to the specified width, using Text::Autoformat, but the broken lines aren't cuddled back up to their previous lines before reflowing the text. It looks like this:

    > 3. Scripting Languages, such as
    > Python or Perl
    > ===============================================
    >
    > ReportLab Toolkit (Python) and
    > PDF::API2 (Perl).
    >
    > 4. Lower-Level Programming languages
    > such as C
    > ===============================================
    >
    > Look at PDFlib lite (simple version
    > of the commercial one, not for commercial
    > use!) and ClibPDF. You will need a C
    > compiler and some experienced C programmers
    > though.

    The right column is wrapped to the correct width, but the text is still broken up. I wish there were a way to avoid this kind of behavior.
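
    One way to approximate that "cuddling" is a pre-pass that runs before the reflow. What follows is only a sketch in plain Perl, not anything Text::Autoformat provides: the function name `rejoin_broken_lines` and the `min_long` and `max_tail` thresholds are hypothetical, and the values 70 and 20 are guesses that would need tuning. It glues a short quoted line onto the previous line when that previous line is long and carries the same quote prefix.

```perl
use strict;
use warnings;

# Hypothetical helper: rejoin short "orphan" lines left behind by an earlier wrap.
sub rejoin_broken_lines {
    my ($text, %opt) = @_;
    my $min_long = $opt{min_long} // 70;   # assumed: previous line counts as "long"
    my $max_tail = $opt{max_tail} // 20;   # assumed: current line counts as a "tail"

    my @out;            # each entry is [quote prefix, body]
    my $prev_len = 0;   # length of the previous line as it appeared in the input
    for my $line (split /\n/, $text) {
        # Split the line into its quote prefix ("> ", "> > ", ...) and the rest.
        my ($prefix, $body) = $line =~ /^((?:>[ \t]?)*)(.*)$/;
        my $is_tail = @out
            && $out[-1][0] eq $prefix      # same quoting depth
            && $out[-1][1] =~ /\S/         # previous line is not blank
            && $body =~ /\S/               # current line is not blank
            && $prev_len >= $min_long
            && length($body) <= $max_tail;
        if ($is_tail) {
            $out[-1][1] .= " $body";       # glue the tail onto the previous line
        } else {
            push @out, [$prefix, $body];
        }
        $prev_len = length $line;
    }
    return join '', map { "$_->[0]$_->[1]\n" } @out;
}

# The last paragraph of the snippet above: "though." is the orphaned tail.
my $post = join "\n",
    '> Look at PDFlib lite (simple version of the commercial one, not for commercial',
    '> use!) and ClibPDF. You will need a C compiler and some experienced C programmers',
    '> though.',
    '';
print rejoin_broken_lines($post);
```

    The result could then be handed to the reflowing module. A length-based rule like this will misjudge some posts, since a short line after a long one is not always a wrap accident.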

    I also tried Text::Reform and Text::Reflow with similar (negative) results.

    How are you handling cases like this in your code? You seem to be doing something similar to what I'm doing here also.

      Text::Autoformat isn't perfect. I'm not using Text::Autoformat in a completely automated situation. I'm able to manually intervene and do any final tweaks to the text. Regardless, it makes my life much easier.

      In fact, I'm not sure it can be perfect, because of the ambiguity in how many of us write. For example, given the following text:

      > this text should be considered a single paragraph
      > that was hard wrapped such that not every line has a quote character.
      We would want it to be formatted as:
      > this text should be considered a single
      > paragraph that was hard wrapped such
      > that not every line has a quote character.
      However, given this text:
      > (Sir Galahad approaches the Bridgekeeper) Stop! What is your name? Sir Galahad of Camelot.
      > What is your quest? I seek the Grail.
      > What is your favorite color? Blue. No yellow...
      we would want it to be formatted as:
      > (Sir Galahad approaches the Bridgekeeper)
      > Stop! What is your name? Sir Galahad of Camelot.
      > What is your quest? I seek the Grail.
      > What is your favorite color? Blue. No yellow...
      and not:
      > (Sir Galahad approaches the Bridgekeeper)
      > Stop! What is your name?
      > Sir Galahad of Camelot.
      > What is your quest?
      > I seek the Grail.
      > What is your favorite color?
      > Blue. No yellow...
      How is Text::Autoformat to tell the difference?

      geoff