Lately, I've been spending a lot of time getting very familiar with XML, specifically with RSS, RDF, and Atom feeds.

To that end, I've written a script that uses Net::NNTP to fetch news articles and builds an RSS feed from them. From there, I convert the feed to HTML, which Plucker then converts to a format suitable for display on a Palm handheld device. I keep both outputs because I need the RSS and HTML versions at the same time.

So far, so good.

My question is: how do I take the body of a news article I receive and "rewrap" the text so it fits within a known width? I know about Text::Wrap, but it would take more thought to get right with the quoted material (custom regexes?).

The body of an article generally has quoted material buried somewhere within it, with '>' at the beginning of the quoted lines. This becomes a problem when the lines are quoting quoted material, like this:

> This is a sentence that might contain some of the original
> person's quoted text. It's a first-level quoted message.

This is a reply to that quoted material

>> This is some text from the very first original post that
>> wraps onto another line.

> And someone here is replying to that original quoted
> text.

And this is the current poster's.

What I'd like to figure out is how to rewrap this text, keeping the same quoting structure, based on the width of the target device (which I will know before I convert it). For example, wrapping the text to a maximum width of 320 pixels, or 160 pixels, and so on.
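
One possible approach, as a rough sketch rather than a finished solution: strip the leading run of '>' markers from each line, join consecutive lines that share the same quote depth into one paragraph, and let Text::Wrap refill each paragraph with the quote prefix used as the indent. The names rewrap_quoted and columns_for_pixels are hypothetical helpers, and the 6-pixel average glyph width is purely an assumption, since Text::Wrap works in character columns and the real figure depends on the font the device uses.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper: estimate a column count from a pixel width.
# The 6-pixel default glyph width is an assumption; measure the real font.
sub columns_for_pixels {
    my ($pixels, $glyph_px) = @_;
    $glyph_px ||= 6;
    return int($pixels / $glyph_px);
}

# Hypothetical helper: rewrap $body so that no line exceeds $width
# characters, keeping each paragraph's '>' quote depth.
sub rewrap_quoted {
    my ($body, $width) = @_;
    my @blocks;    # each entry: [ quote depth, paragraph text ]

    for my $line (split /\n/, $body) {
        # Separate the leading '>' markers (">>" or "> >") from the text.
        my ($prefix, $text) = $line =~ /^((?:>\s*)*)(.*)$/;
        my $depth = ($prefix =~ tr/>//);    # count the '>' characters
        $text =~ s/\s+$//;
        $text =~ s/\s+/ /g;

        if ($text eq '') {
            push @blocks, [$depth, ''];     # blank line ends a paragraph
        }
        elsif (@blocks && $blocks[-1][1] ne '' && $blocks[-1][0] == $depth) {
            $blocks[-1][1] .= " $text";     # same depth: same paragraph
        }
        else {
            push @blocks, [$depth, $text];  # depth changed: new paragraph
        }
    }

    # Text::Wrap emits lines of at most $columns - 1 characters.
    local $Text::Wrap::columns  = $width + 1;
    local $Text::Wrap::unexpand = 0;        # never turn spaces into tabs

    my @out;
    for my $block (@blocks) {
        my ($depth, $text) = @$block;
        my $marks = '>' x $depth;
        my $lead  = $depth ? "$marks " : '';
        push @out, $text eq '' ? $marks : wrap($lead, $lead, $text);
    }
    return join("\n", @out) . "\n";
}

my $article = join "\n",
    '> This is a sentence that might contain some of the original',
    "> person's quoted text.",
    '',
    'This is a reply to that quoted material',
    '>> This is some text from the very first original post that',
    '>> wraps onto another line.',
    '';
print rewrap_quoted($article, columns_for_pixels(160));
```

This treats any change in quote depth as a paragraph boundary, so an unquoted line that happens to begin with '>' would be mistaken for quoted text. Signatures, tables, and other preformatted content would also be refilled and would need to be detected separately.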


In reply to Rewrapping Net::NNTP output by hacker
