While I won't be offering code, I will offer some advice.

I would set up some options that you could use to define what kinds of punctuation you'd allow and how each is handled. Something along the lines of:
1. foo's
2. what in the (foo) did you say?
3. I'm going to "foo" you.
4. What I think of foo: good, bad, ugly
5. Man/Woman, which is it?
6. paragraph 1:

-- what's up?

-- Not much

--well foo want's to get a hold of you

As you can see, there are many options for punctuation and for where you want to break; for example, do you want to traverse across paragraphs?
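To make the idea of options concrete, here is a minimal sketch in Python (the same regexes carry over to Perl). Everything in it is hypothetical: the names `TruncateOptions`, `split_on_slash`, `cross_paragraphs` and `first_words` are made up for illustration, and it covers only two of the cases listed above. It treats any run of non-whitespace as a word, so a leading "--" counts as one.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TruncateOptions:
    # Hypothetical option names, invented for this sketch.
    split_on_slash: bool = False    # count "Man/Woman" as two words
    cross_paragraphs: bool = True   # let the result span blank lines


def first_words(text: str, count: int,
                opts: Optional[TruncateOptions] = None) -> str:
    """Return the first `count` words of `text`, punctuation attached."""
    opts = opts or TruncateOptions()
    if count <= 0:
        return ""
    if not opts.cross_paragraphs:
        # Keep only the first paragraph (text up to the first blank line).
        text = re.split(r"\n\s*\n", text.strip(), maxsplit=1)[0]
    # A word is a run of non-space characters; optionally end it at "/".
    pattern = r"[^\s/]+/?" if opts.split_on_slash else r"\S+"
    matches = list(re.finditer(pattern, text))
    if len(matches) <= count:
        return text
    # Cut right after the last wanted word, keeping original spacing.
    return text[:matches[count - 1].end()]
```

With the defaults, `first_words("Man/Woman, which is it?", 2)` keeps "Man/Woman," as a single word; with `split_on_slash=True` the same call stops after "Woman,". Apostrophes, quotes and parentheses would need their own switches in the same style.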

One question to ask is what it's for. It almost seems like it'd be easier to just grab a certain number of characters from the text and then continue to the end of whatever word you're on.

For instance, in this sentence(s), just grabbing the first 25 characters puts you in the middle of the word 'sentence', and you would then go on to the end of that word, including punctuation etc. So one issue is whether getting the exact number of words is necessary, or whether it's enough to get roughly so much of the text without munging the end on punctuation. A little clarification on what it's needed for will help you out here.
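The grab-some-characters approach could look roughly like this. It is a sketch in Python rather than Perl, the function name `truncate_at_word` is made up, and it assumes that "end of the word" simply means the next whitespace character, so any punctuation stuck to the word comes along.

```python
def truncate_at_word(text: str, limit: int) -> str:
    """Take the first `limit` characters, then finish the current word."""
    if limit <= 0:
        return ""
    if limit >= len(text):
        return text
    end = limit
    # Walk forward to the next whitespace; attached punctuation is kept.
    while end < len(text) and not text[end].isspace():
        end += 1
    return text[:end]


sample = "For instance, in this sentence(s), just grabbing the first 25 characters"
print(truncate_at_word(sample, 25))
```

For the sample above, the 25-character cut lands after "sen" and the result is extended to "For instance, in this sentence(s),". If the cut already falls on a space, nothing extra is added.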

In reply to Re: Truncating real text by bcole23
in thread Truncating real text by cog
