in reply to Re^8: how to use Algorithm::NaiveBayes module
in thread how to use Algorithm::NaiveBayes module

It looks like the sentences in your file are not separated by newlines, or at least not by newlines that the Perl script recognizes. If you are on a Mac, this can happen when you export files from Excel or FileMaker: the line endings are in the Classic Mac format (carriage return) rather than the Unix format (line feed). Try this to see:
my $count = 0;
open($fh, "<", $sentence_file) or die "Could not open $sentence_file: $!";
while (my $sentence = <$fh>) {
    chomp $sentence;
    $count++;
    print "Sentence $count\n";
}
close($fh);
If it only counts one sentence then this is your problem. If you use a text editor like BBEdit you can change the format of your files by opening Edit->Document Options and choosing Unix. Alternatively, you can set the input record separator ($/), which tells Perl what counts as a line break, at the top of your script (before opening the files):
#!/usr/bin/perl
use warnings;
use Algorithm::NaiveBayes;
local $/ = "\015";   # Classic Mac line ending (carriage return)
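
If the line-ending format is not known in advance, one option is to read the whole file at once and split on any of the three common conventions. The following is a minimal sketch, not part of the original script: the `read_sentences` helper and the temporary demo file are hypothetical names used only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the lines of a file whether they end in
# CRLF (Windows), CR (Classic Mac) or LF (Unix). Blank lines are dropped.
sub read_sentences {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Could not open $path: $!";
    local $/;                      # undef separator: slurp the whole file
    my $content = <$fh>;
    close($fh);
    return () unless defined $content;
    # Match CRLF first so it is not counted as two separate breaks.
    return grep { length } split /\r\n|\r|\n/, $content;
}

# Demo: write a file with Classic Mac (CR) line endings, then read it back.
my ($out, $demo_file) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "first sentence\rsecond sentence\rthird sentence\r";
close($out);

my @sentences = read_sentences($demo_file);
my $count = 0;
for my $sentence (@sentences) {
    $count++;
    print "Sentence $count: $sentence\n";
}
```

Because the split happens after reading, this avoids having to set `$/` globally, at the cost of holding the whole file in memory.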

Re^10: how to use Algorithm::NaiveBayes module
by agnes (Novice) on Apr 29, 2014 at 02:33 UTC
    Thank you so much for your help!! I almost got the result I want. One last question: in my program I have three categories, positive, negative and neutral. I use this code to decide which category a sentence belongs to:
    if ( $probability->{positive} > 1/3 ) {
        print "%positive:$sentence\n";
    }
    if ( $probability->{negative} > 1/3 ) {
        print "%negative:$sentence\n";
    }
    if ( $probability->{neutral} > 1/3 ) {
        print "%neutral:$sentence\n";
    }
    However, sometimes one sentence may belong to two categories. For example, if the probability of positive is 0.34 and the probability of neutral is 0.35, then the sentence will be reported as both positive and neutral. I wrote this code to try to solve the problem:
    if( $probability->{positive} > $probability->{negative} > $probability->{neutral}) {
        print"%positive:$sentence\n";
    }
    if( $probability->{positive} > $probability->{neutral} > $probability->{negative}) {
        print"%positive:$sentence\n";
    }
    if( $probability->{negative} > $probability->{positive} > $probability->{neutral}) {
        print"%negative:$sentence\n";
    }
    if( $probability->{negative} > $probability->{neutral} > $probability->{positive}) {
        print"%negative:$sentence\n";
    }
    if( $probability->{neutral} > $probability->{positive} > $probability->{negative}) {
        print"%positive:$sentence\n";
    }
    if( $probability->{neutral} > $probability->{negative} > $probability->{positive}) {
        print"%positive:$sentence\n";
    }
    But, when I run my program, the error message is like this:

    syntax error at calculation.pl line 61, near "} >"
    Execution of calculation.pl aborted due to compilation errors.

    I do not know how to modify the code. Can you help me solve it? Thank you so much!!!
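      Glad to see you have persevered. Your Perl does not accept chained comparisons such as $a > $b > $c, which is what the syntax error is pointing at. Write each comparison separately and join them with "and". You need to change the if condition to: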
      if ($probability->{positive} > $probability->{negative}
          and $probability->{positive} > $probability->{neutral}) {
          print "positive: $sentence\n";
      }
      elsif ($probability->{negative} > $probability->{positive}
          and $probability->{negative} > $probability->{neutral}) {
          print "negative: $sentence\n";
      }
      elsif ($probability->{neutral} > $probability->{positive}
          and $probability->{neutral} > $probability->{negative}) {
          print "neutral: $sentence\n";
      }
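
With more categories the pairwise comparisons grow quickly, and an if/elsif chain of strict comparisons prints nothing when two scores tie for the top. A possible alternative, sketched here under the assumption that `predict` returns a hash reference of label => score (which is how the thread's code uses it), is to sort the labels by score and take the first. The `best_label` helper and the sample scores are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the label with the highest score.
# Ties are broken alphabetically so the result is deterministic.
sub best_label {
    my ($scores) = @_;
    my @ranked = sort { $scores->{$b} <=> $scores->{$a} or $a cmp $b }
                 keys %$scores;
    return $ranked[0];
}

# Sample scores standing in for the result of $categorizer->predict(...).
my $probability = { positive => 0.34, negative => 0.31, neutral => 0.35 };
my $sentence    = 'example sentence';

my $label = best_label($probability);
print "$label: $sentence\n";
```

With the sample scores this prints "neutral: example sentence". Adding a fourth category would need no change to the selection logic.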
        Thank you so much! I got the result I need! It would not have worked without your help. Thank you so much for your help!!
          Hi! I am sorry to bother you again, but I have another problem. I now want to add a function to my program: I want to classify each sentence into the categories profit, revenue and cost using the Algorithm::NaiveBayes module. Is there any way to modify the existing program so that the two classifications can be done together? My training data looks like this:

          we no believ reason possibl total amount unrecogn tax benefit will signif increa decrea within next 12 months, no believ audit will conclud within next 12 months. negative, profit

          if actual result differ signif estimates, stock-ba compen expen result oper impacted. negative, cost

          and my code is like this:
          #!/usr/bin/perl
          use warnings;
          use Algorithm::NaiveBayes;
          local $/ = "\015";

          my $pos_file = '/Users/Agnes/Documents/positive.TXT';
          my $neg_file = '/Users/Agnes/Documents/negative.txt';
          my $neu_file = '/Users/Agnes/Documents/neutral.txt';

          my $categorizer = Algorithm::NaiveBayes->new;
          my $fh;

          # Train on the positive sentences
          open($fh, "<", $pos_file) or die "Could not open $pos_file: $!";
          while (my $sentence = <$fh>) {
              chomp $sentence;
              my @words = split(' ', $sentence);
              my %positive;
              $positive{$_}++ for @words;
              $categorizer->add_instance(
                  attributes => \%positive,
                  label      => 'positive');
          }
          close($fh);

          # Train on the negative sentences
          open($fh, "<", $neg_file) or die "Could not open $neg_file: $!";
          while (my $sentence = <$fh>) {
              chomp $sentence;
              my @words = split(' ', $sentence);
              my %negative;
              $negative{$_}++ for @words;
              $categorizer->add_instance(
                  attributes => \%negative,
                  label      => 'negative');
          }
          close($fh);

          # Train on the neutral sentences
          open($fh, "<", $neu_file) or die "Could not open $neu_file: $!";
          while (my $sentence = <$fh>) {
              chomp $sentence;
              my @words = split(' ', $sentence);
              my %neutral;
              $neutral{$_}++ for @words;
              $categorizer->add_instance(
                  attributes => \%neutral,
                  label      => 'neutral');
          }
          close($fh);

          $categorizer->train;

          # Classify each sentence in the test file
          my $sentence_file = '/Users/Agnes/Documents/2012_10_18stem.txt';
          open($fh, "<", $sentence_file) or die "Could not open $sentence_file: $!";
          while (my $sentence = <$fh>) {
              chomp $sentence;
              my @words = split(' ', $sentence);
              my %test;
              $test{$_}++ for @words;
              my $probability = $categorizer->predict(attributes => \%test);
              if ($probability->{positive} > $probability->{negative}
                  and $probability->{positive} > $probability->{neutral}) {
                  print "positive: $sentence\n";
              }
              elsif ($probability->{negative} > $probability->{positive}
                  and $probability->{negative} > $probability->{neutral}) {
                  print "negative: $sentence\n";
              }
              elsif ($probability->{neutral} > $probability->{positive}
                  and $probability->{neutral} > $probability->{negative}) {
                  print "neutral: $sentence\n";
              }
          }
          close($fh);
          Thank you so much for your time!!
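
One way to approach this, offered only as a sketch: since each training line in the samples ends with two labels (for example "negative, profit"), the labels can be split off and the same word counts added to two separate categorizers, one trained on sentiment and one on topic. The code below shows just the parsing step. The `parse_training_line` helper is hypothetical, and it assumes every line ends with exactly "sentiment, topic" as in the samples.

```perl
use strict;
use warnings;

# Hypothetical helper: split "text ... sentiment, topic" into its parts.
# Returns an empty list if the line does not end with two labels.
sub parse_training_line {
    my ($line) = @_;
    return () unless $line =~ /^(.*\S)\s+(\w+)\s*,\s*(\w+)\s*$/s;
    my ($text, $sentiment, $topic) = ($1, $2, $3);
    my %words;
    $words{$_}++ for split ' ', $text;   # word counts, as in the script
    return (\%words, $sentiment, $topic);
}

# One of the sample training lines from the post.
my $line = 'if actual result differ signif estimates, stock-ba compen '
         . 'expen result oper impacted. negative, cost';

my ($words, $sentiment, $topic) = parse_training_line($line);
print "sentiment: $sentiment\n";
print "topic: $topic\n";
print "count for 'result': $words->{result}\n";
```

Each parsed line could then feed two `Algorithm::NaiveBayes` objects: `add_instance(attributes => $words, label => $sentiment)` on one and `add_instance(attributes => $words, label => $topic)` on the other, with `train` and `predict` called on each in the same way as in the existing script.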