in reply to Truncating Last Sentence
See the rindex function's docs. It returns the last location of the substring (here, ".") in the string ($str). We're telling it to start looking at the 1000th character (and work backwards).

    if (length($str) > 1000) {
        substr($str, 1+rindex($str, '.', 1000)) = "";
    }
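To make the behaviour easier to see, here is a minimal sketch of the same idea on a short string, with the limit lowered to 40. The helper name truncate_at_sentence and the sample text are made up for illustration; only rindex and lvalue substr come from the snippet above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): if $str is longer
# than $limit, cut it back to just after the last '.' that sits at or
# before offset $limit.
sub truncate_at_sentence {
    my ($str, $limit) = @_;
    return $str if length($str) <= $limit;

    # rindex returns -1 when no '.' is found, so 1 + rindex(...) is 0
    # and the whole string is emptied in that case.
    substr($str, 1 + rindex($str, '.', $limit)) = "";
    return $str;
}

my $text = "First sentence. Second sentence. Third one runs long";
print truncate_at_sentence($text, 40), "\n";
```

Note the no-match case: if there is no period in range, nothing survives. Whether that is what you want depends on your data, so you may prefer to check for -1 first.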
If you want to allow various punctuation, might I suggest my String::Index module? It's faster than the typical regex solution and a hybrid regex/substr solution.
String::Index gives you four functions that are crosses between Perl's index() function and C's strpbrk() function.

    #!/usr/bin/perl

    use Benchmark 'cmpthese';
    use String::Index 'crindex';

    my $str = "alphabet. alphabet! alphabet? " x 100;

    cmpthese(-5, {
        rcindex => sub {
            my $x = $str;
            substr($x, 1+crindex($str, ".!?", 1000)) = "";
        },
        regex => sub {
            my $x = $str;
            $x =~ s/^(.{1,999}[.!?]).*/$1/;
        },
        rxsubstr => sub {
            my $x = $str;
            $x =~ /^.{1,999}[.!?]/ and substr($x, $+[0]) = "";
        },
    });

    __END__
                 Rate    regex rxsubstr  rcindex
    regex     42520/s       --     -43%     -66%
    rxsubstr  75202/s      77%       --     -40%
    rcindex  125559/s     195%      67%       --
(I need to fix the docs or the module a tad. The function is 'crindex', but I have 'rcindex' somewhere.)
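If you can't install the module, the same effect can be approximated in core Perl. This is only a sketch of the idea and not how String::Index is implemented; it assumes crindex acts like rindex over a set of characters, as described above. The helper name last_of_any and the sample string are hypothetical.

```perl
use strict;
use warnings;
use List::Util 'max';

# Hypothetical stand-in for crindex: the last position at or before $pos
# of any single character in $chars, or -1 if none of them occurs.
sub last_of_any {
    my ($str, $chars, $pos) = @_;
    return max(-1, map { rindex($str, $_, $pos) } split //, $chars);
}

my $str   = "Is it done? Yes! Mostly. And then some trailing words";
my $limit = 30;

# Same truncation as before, but any of . ! ? may end the sentence.
substr($str, 1 + last_of_any($str, ".!?", $limit)) = ""
    if length($str) > $limit;

print "$str\n";
```

This makes one rindex pass per punctuation character, so I would not expect it to match the module's speed on long character sets.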