Where you have print "%positive\n"; you should have print "positive: $sentence\n"; and likewise for negative and neutral.
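The reason is interpolation: Perl interpolates scalars such as $sentence inside double-quoted strings, but not hashes, so "%positive" comes out as literal text. A minimal sketch of the difference, using made-up sample data rather than anything from the files in this thread:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample data, for illustration only.
my $sentence = 'I love this film';
my %positive = (love => 1);

# Hashes are not interpolated in double-quoted strings,
# so the text "%positive" is kept literally.
my $literal = "%positive\n";

# Scalars are interpolated, so the sentence appears in the output.
my $labelled = "positive: $sentence\n";

print $literal;     # prints: %positive
print $labelled;    # prints: positive: I love this film
```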
#!/usr/bin/perl
use warnings;
use Algorithm::NaiveBayes;
my $pos_file = '/Users/Agnes/Documents/positive.TXT';
my $neg_file = '/Users/Agnes/Documents/negative.txt';
my $neu_file = '/Users/Agnes/Documents/neutral.txt';
my $categorizer = Algorithm::NaiveBayes->new;
my $fh;
open($fh,"<",$pos_file) or die "Could not open $pos_file: $!";
while (my $sentence = <$fh>) {
chomp $sentence;
my @words = split(' ',$sentence);
my %positive;
$positive{$_}++ for @words;
$categorizer->add_instance(
attributes => \%positive,
label => 'positive');
}
close($fh);
open($fh,"<",$neg_file) or die "Could not open $neg_file: $!";
while (my $sentence = <$fh>) {
chomp $sentence;
my @words = split(' ',$sentence);
my %negative;
$negative{$_}++ for @words;
$categorizer->add_instance(
attributes => \%negative,
label => 'negative');
}
close($fh);
open($fh,"<",$neu_file) or die "Could not open $neu_file: $!";
while (my $sentence = <$fh>) {
chomp $sentence;
my @words = split(' ',$sentence);
my %neutral;
$neutral{$_}++ for @words;
$categorizer->add_instance(
attributes => \%neutral,
label => 'neutral');
}
close($fh);
$categorizer->train;
my $sentence_file = '/Users/Agnes/Documents/process_sentence.txt';
open($fh,"<",$sentence_file) or die "Could not open $sentence_file: $!";
while (my $sentence = <$fh>) {
chomp $sentence;
my @words = split(' ',$sentence);
my %test;
$test{$_}++ for @words;
my $probability = $categorizer->predict(attributes => \%test);
if ( $probability->{positive} > 1/3 ) {
print "%positive:$sentence\n";
}
if ( $probability->{negative} > 1/3 ) {
print "%negative:$sentence\n";
}
if ( $probability->{neutral} > 1/3 ) {
print "%neutral:$sentence\n";
}
}
close($fh);
# if ( $probability->{negative} > 1/3 ) {
#print "%negative\n";
#}
#if ( $probability->{neutral} > 1/3 ) {
# print "%neutral\n";
#}
I do not know what I should do next. It would be very kind of you to help me solve this problem. Thank you!
It looks like the sentences in your file are not separated by newlines, or at least not by newlines that your Perl script recognizes. On a Mac this can happen when you export files from Excel or FileMaker: the line endings are in the Classic Mac format (carriage return) rather than the Unix format (line feed). Try this to see:
my $count = 0;
open($fh,"<",$sentence_file) or die "Could not open $sentence_file: $!";
while (my $sentence = <$fh>) {
chomp $sentence;
$count++;
print "Sentence $count\n";
}
close($fh);
If it counts only one sentence, then this is your problem. If you use a text editor like BBEdit, you can change the format of your files by opening Edit->Document Options and choosing Unix. Alternatively, you can set the input record separator $/ (the line break that Perl looks for when reading) to a carriage return at the top of your script, before opening the files:
#!/usr/bin/perl
use warnings;
use Algorithm::NaiveBayes;
local $/ = "\015";
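If you are not sure which line endings each file uses, another option is to read the whole file and split on any of the three common endings. This is only a sketch: read_lines is a hypothetical helper name, and the temporary file below is made-up sample data standing in for your sentence file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file and split it into lines on
# CRLF (Windows), CR (Classic Mac), or LF (Unix).
sub read_lines {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Could not open $path: $!";
    local $/;                   # undef separator: read the file in one go
    my $content = <$fh>;
    close($fh);
    return () unless defined $content && length $content;
    # Try \r\n first so a CRLF pair is not counted as two line breaks.
    return split /\r\n|\r|\n/, $content;
}

# Made-up sample file with Classic Mac (carriage return) line endings.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "first sentence\rsecond sentence\rthird sentence\r";
close($tmp_fh);

my @sentences = read_lines($tmp_path);
printf "%d sentences\n", scalar @sentences;   # prints: 3 sentences
```

You could then loop over the returned list in place of while (my $sentence = <$fh>), and no chomp is needed because the line endings are already removed by the split.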