in reply to Re^2: sorting text into sentences.
in thread sorting text into sentences.

This will do it for you.

Sample text file:
C:\Temp>type t.txt
This is a test sentence. And this is another one; Also so is this. And by the way, this is the fourth sentence.
Now the Perl code:
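#!/usr/bin/perl
use strict;
use warnings;

my $s;
my @arr;

# Read the file into one string, dropping line endings.
open(my $fh, '<', 't.txt') or die "Cannot open t.txt: $!";
while (<$fh>) {
    chomp $_;
    $s .= $_;
}
close($fh);

# In list context, m//g returns every match.
@arr = $s =~ m/[A-Z].+?[.;]/g;

foreach (@arr) {
    print $_, "\n";
}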
Now the output:
C:\Temp>t.pl
This is a test sentence.
And this is another one;
Also so is this.
And by the way, this is the fourth sentence.
As you can see, each element of @arr contains one sentence (as you have defined it).
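For a file that spans several lines, the same match can be wrapped in a small helper. This is only a sketch: the `sentences` sub and the two-line sample input are hypothetical, and joining the lines with a space is an assumption about how line breaks should be treated.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns every substring
# that starts with a capital letter and runs to the nearest '.' or ';'.
sub sentences {
    my ($text) = @_;
    # In list context, m//g returns all matches at once.
    return $text =~ m/[A-Z].+?[.;]/g;
}

# Assumed input: part of the sample text, split across two lines.
my @lines = (
    "This is a test sentence. And this is\n",
    "another one; Also so is this.\n",
);
chomp @lines;

# Join with a space so words at a line break do not run together.
my @arr = sentences(join ' ', @lines);
print "$_\n" for @arr;
```

The pattern knows nothing about abbreviations, so text such as "e.g." would end a match early.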

Hope this helps,

davidj

Re^4: sorting text into sentences.
by chiburashka (Initiate) on Sep 06, 2004 at 08:12 UTC
    Thanks a lot, but I already wrote some code:
    #!/usr/bin/perl -w
    use Strict;
    $dat = "a.txt";
    open(DAT, "$dat") || die "Can't open the file.\n";
    @a=<DAT>;
    close(DAT);
    my $temp3;
    foreach (@a) {chomp $_; $temp3 .= "$_ "}
    @a = split(/.,;/, $temp3);
    foreach (@a) {$_ .= "\n";}
    print @a;
    "If you know the right question to ask, you already know the answer."
      From your description, you want split(/[.;]/, not /.,;/. And I don't see where capital letters would be checked.
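The difference between the two patterns can be seen in a short sketch. The sample string reuses sentences from earlier in the thread, and the trailing `\s*` is an addition made here (an assumption, not part of the suggestion above) so that the pieces do not start with a space.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample text taken from the sentences used earlier in the thread.
my $text = 'This is a test sentence. And this is another one; Also so is this.';

# /.,;/ means: any one character, then a comma, then a semicolon.
# That sequence never occurs in $text, so split returns the whole string.
my @wrong = split /.,;/, $text;

# /[.;]/ is a character class: a single '.' or a single ';'.
# The added \s* also consumes the whitespace after each delimiter.
my @right = split /[.;]\s*/, $text;

print scalar(@wrong), " piece(s) with /.,;/\n";
print scalar(@right), " piece(s) with /[.;]\\s*/\n";
print "$_\n" for @right;
```

Note that split discards the delimiters, so the pieces lose their closing '.' or ';'.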
        Never mind, thanks, that'll do :)

        P.S.: I remembered that I don't check the dots and semicolons themselves, so split(/.,;/, $watever) will do just fine.

      By the way, the proper invocation is "use strict;". On case-insensitive file-systems, "use Strict;" will unfortunately not complain, but won't do anything either.
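A rough way to see what the correctly spelled pragma would do to the script above is to compile one of its undeclared assignments under `use strict;`. This is a sketch only: it uses a string eval so the compile error can be caught and printed rather than aborting the program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compile a snippet at run time so the failure can be caught and reported.
# $dat is deliberately not declared with "my", as in the script above.
my $compiled = eval q{
    use strict;
    $dat = "a.txt";
    1;
};

my $error = $@;
print $compiled ? "compiled cleanly\n" : "strict rejected it: $error";
```

With strict in effect, the assignment is refused at compile time until `$dat` is declared, for example with `my $dat = "a.txt";`.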