in reply to split $data, $unquoted_value;
I found Text::Sentence, which appears to work on your example:
    #!/usr/bin/perl -w
    use strict;
    use Text::Sentence qw/split_sentences/;
    use locale;
    use POSIX qw/locale_h/;

    setlocale(LC_CTYPE,'iso_8859_1');

    my $a = "This is some text. A period (\".\") usually terminates a statement. But not if it's quoted. Regardless of whether or not single quotes, '.', are used. ";
    my @sentences = split_sentences($a);
    for my $i (0..$#sentences) {
        print "sentence #$i: <$sentences[$i]>\n";
    }
When executed, it generates this:
    sentence #0: <This is some text.>
    sentence #1: <A period (".") usually terminates a statement.>
    sentence #2: <But not if it's quoted.>
    sentence #3: <Regardless of whether or not single quotes, '.', are used.>
Update: Changed "this is some text." to "This is some text." because the module apparently uses capitalization to identify sentence boundaries. So it might not work for you...
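To show why the capitalization matters, here is a rough sketch of that kind of heuristic. It is not Text::Sentence's actual code, and naive_split is a hypothetical helper made up for illustration: it breaks after ".", "!" or "?" only when the next non-space character is an ASCII uppercase letter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption, NOT part of Text::Sentence):
# split on whitespace that follows ., ! or ? and precedes an uppercase letter.
sub naive_split {
    my ($text) = @_;
    return split /(?<=[.!?])\s+(?=[A-Z])/, $text;
}

# Same wording twice, differing only in the case of each sentence's first letter.
my @lower = naive_split("this is some text. a period ends it. but not here.");
my @upper = naive_split("This is some text. A period ends it. But not here.");

printf "lowercase starts: %d sentence(s)\n", scalar @lower;
printf "uppercase starts: %d sentence(s)\n", scalar @upper;
```

With lowercase sentence starts the text comes back as a single chunk; with capitalized starts it splits into three. A heuristic like this will also break after abbreviations followed by a capitalized word, so check the module against your real data.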
Re^2: split $data, $unquoted_value;
by Ovid (Cardinal) on Sep 14, 2005 at 21:20 UTC