First you build a hash of words and their frequency counts for each category. It is a good idea to normalize each word to lower case and perhaps to stem them, and also to remove words that don't have any effect on the outcome:

    my $positive = {
        word1 => 2,
        word2 => 4,
        word3 => 1,
    };
    my $negative = {
        word4 => 3,
        word5 => 1,
    };
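As an illustration, here is a minimal sketch of how such a hash might be built from raw text. The words_to_hash helper and the %stopwords list are my own inventions, not part of Algorithm::NaiveBayes; a stemmer (e.g. Lingua::Stem from CPAN) could be applied at the marked point:

    use strict;
    use warnings;

    # Tiny illustrative stopword list - use a fuller one in practice
    my %stopwords = map { $_ => 1 } qw(a an and is it of or the this to what);

    sub words_to_hash {
        my ($text) = @_;
        my %counts;
        for my $word (split /\W+/, lc $text) {
            next if $word eq '' or $stopwords{$word};
            # a stemmer could be applied to $word here
            $counts{$word}++;
        }
        return \%counts;
    }

    my $positive = words_to_hash('What a great, great module this is');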
You then add these hashes to the categorizer and train it:

    use Algorithm::NaiveBayes;

    my $categorizer = Algorithm::NaiveBayes->new;
    $categorizer->add_instance(
        attributes => $positive,
        label      => 'positive',
    );
    $categorizer->add_instance(
        attributes => $negative,
        label      => 'negative',
    );
    $categorizer->train;
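In practice you would call add_instance() once per training sentence rather than once per label, accumulating many instances under each label. A sketch, reusing the hypothetical words_to_hash helper above with some made-up training data:

    # Hypothetical training data: [ sentence, label ] pairs
    my @training = (
        [ 'what a wonderful module',     'positive' ],
        [ 'slow, buggy and frustrating', 'negative' ],
    );

    for my $pair (@training) {
        my ($sentence, $label) = @$pair;
        $categorizer->add_instance(
            attributes => words_to_hash($sentence),
            label      => $label,
        );
    }
    $categorizer->train;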
Then, for each of your sentences, you create a hash in a similar fashion and call predict() to find the probable classification of each sentence:

    my $sentence1 = {
        wordA => 2,
        wordB => 1,
    };
    my $probability = $categorizer->predict(attributes => $sentence1);
    if ($probability->{'positive'} > 0.5) {
        # sentence1 probably positive
    }
    elsif ($probability->{'negative'} > 0.5) {
        # sentence1 probably negative
    }
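Since predict() returns a hash reference mapping every known label to a score between 0 and 1, you can also skip the per-label threshold tests and simply pick the highest-scoring label. A short sketch, again reusing the words_to_hash helper assumed above:

    my $scores = $categorizer->predict(
        attributes => words_to_hash('wonderful, wonderful module'),
    );

    # Sort the labels by score, highest first
    my ($best) = sort { $scores->{$b} <=> $scores->{$a} } keys %$scores;
    print "Most likely label: $best ($scores->{$best})\n";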
There is a section in the book Advanced Perl Programming, 2nd Edition, entitled "Categorization and Extraction", that shows extended examples of using this module in conjunction with sentence splitters, stopword lists and stemmers.

In reply to Re: how to use Algorithm::NaiveBayes module by tangent, in thread how to use Algorithm::NaiveBayes module by agnes