in reply to Truncating real text

sub trunc { @_ == 2 or return; my ($max_words, $text) = @_; $text =~ /(\W*(?:\w+\W*){1,$max_words})/ and return $1; } my $str = "there once was a man from Nantucket who kept all his cash i +n a bucket"; print trunc(5, $str), "\n";
Update: this will grab any punctuation (non-word, non-whitespace) attached to words, and strip leading whitespace (though trailing whitespace stays attached):
/((?:[^\w\s]*\w+[^\w\s]*\s*){1,$max_words})/

Caution: Contents may have been coded under pressure.