in reply to split $data, $unquoted_value;

I found Text::Sentence, which appears to work on your example:

#!/usr/bin/perl -w
use strict;
use Text::Sentence qw/split_sentences/;
use locale;
use POSIX qw/locale_h/;

setlocale(LC_CTYPE, 'iso_8859_1');

my $a = "This is some text. A period (\".\") usually terminates a statement. But not if it's quoted. Regardless of whether or not single quotes, '.', are used. ";
my @sentences = split_sentences($a);
for my $i (0..$#sentences) {
    print "sentence #$i: <$sentences[$i]>\n";
}

When executed, it generates this:

sentence #0: <This is some text.>
sentence #1: <A period (".") usually terminates a statement.>
sentence #2: <But not if it's quoted.>
sentence #3: <Regardless of whether or not single quotes, '.', are used.>

Update: Changed "this is some text." to "This is some text." because the module apparently uses capitalization to identify sentence boundaries. So it might not work for you...
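
To show why capitalization matters, here is a rough sketch of that kind of heuristic. This is only my guess at the general idea, not Text::Sentence's actual code, and naive_split is a made-up helper name. It breaks after ".", "!" or "?" only when whitespace and an uppercase ASCII letter follow:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption for illustration, NOT the module's
# real implementation): split after a terminator only if the next
# sentence starts with whitespace plus an uppercase ASCII letter.
sub naive_split {
    my ($text) = @_;
    return split /(?<=[.!?])\s+(?=[A-Z])/, $text;
}

my @upper = naive_split("This is some text. But not if it's quoted.");
my @lower = naive_split("this is some text. but not if it's quoted.");

print scalar(@upper), " sentence(s) with capitals\n";     # 2
print scalar(@lower), " sentence(s) without capitals\n";  # 1
```

With a rule like this, text whose "sentences" start in lowercase comes back as a single chunk, which is the failure I ran into before changing the example.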

Re^2: split $data, $unquoted_value;
by Ovid (Cardinal) on Sep 14, 2005 at 21:20 UTC

    Tempting, but as you suspected, it doesn't quite work. I actually need this to be able to better parse Prolog programs.

    Cheers,
    Ovid

    New address of my CGI Course.