in reply to how to use Algorithm::NaiveBayes module

The Synopsis section of the Algorithm::NaiveBayes docs seems fairly straightforward. While I haven't tested the code below, I believe it should work as intended.

use strict;
use warnings;
use Algorithm::NaiveBayes;
use Data::Dumper;

$| = 1;
$Data::Dumper::Deepcopy = 1;
$Data::Dumper::Sortkeys = 1;

my %training_files = (
    positive => q{./training-positive.txt},
    negative => q{./training-negative.txt},
    neutral  => q{./training-neutral.txt},
);
my @test_files = (
    q{./001-test.txt},
    q{./002-test.txt},
);

my $nb = Algorithm::NaiveBayes->new( purge => 0, );

# Train on one instance per file, labelled with the hash key.
foreach my $k ( keys %training_files ) {
    local $/;    # slurp mode: read the whole file in one go
    open my $inf, q{<}, $training_files{$k} or die $!;
    my $line = <$inf>;
    close $inf;
    $nb->add_instance(
        attributes => str_to_array( $line ),
        label      => [ $k ],
    );
}
$nb->train;

foreach my $tf ( @test_files ) {
    local $/;
    open my $inf, q{<}, $tf or die $!;
    my $line = <$inf>;
    close $inf;
    # predict() returns a hash reference of label => score, so dump it
    my $result = $nb->predict( attributes => str_to_array( $line ), );
    print qq{Prediction for $tf:\n}, Dumper( $result );
}

# Split a string on whitespace and punctuation and count each word.
sub str_to_array {
    my ($str) = @_;
    my %attr;
    foreach my $word ( split /\s|[\(\)!?.,:;]/, $str ) {
        next if $word eq q{};    # skip empty fields between adjacent separators
        $attr{$word}++;
    }
    return \%attr;
}

Hope that helps.

Replies are listed 'Best First'.
Re^2: how to use Algorithm::NaiveBayes module
by agnes (Novice) on Apr 24, 2014 at 19:59 UTC
    Hi! Thank you again for your code. Can you please tell me what Data::Dumper does here? Also, I want to classify each training sentence into a category such as revenue, cost, or profit, because the same word can carry a different tone in a different context. For example, the word "increase" would be positive in a sentence about revenue but negative in a sentence about cost. How can I modify the code to do this? Your reply would be very helpful. Thank you!
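On the Data::Dumper part of the question: Data::Dumper is a core Perl module that turns a data structure into readable text, which is handy for inspecting a hash or hash reference while debugging. The sketch below is only an illustration; the word counts are made up and merely shaped like the hash that str_to_array() returns in the code above.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # print hash keys in sorted order

# Made-up word counts, shaped like the hash reference str_to_array() returns.
my $attr = { revenue => 2, increase => 1 };

# Dumper() returns a string of Perl source that describes the structure.
my $text = Dumper($attr);
print $text;
```

Printing a hash reference directly only shows something like HASH(0x...), whereas Dumper() shows the keys and values it holds.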
Re^2: how to use Algorithm::NaiveBayes module
by agnes (Novice) on Apr 23, 2014 at 03:55 UTC
    Thank you so much! I will try this code and report my result to you!! Have a good night!!
Re^2: how to use Algorithm::NaiveBayes module
by agnes (Novice) on Apr 24, 2014 at 21:09 UTC
    I have run the code, and the result looks like this:

    Prediction: HASH(0x7f8f540c4c00) - /Users/Agnes/Documents/process_sentence.txt

    Prediction: HASH(0x7f8f540c2380) - /Users/Agnes/Documents/test.txt

    My code is as follows:
    #!/usr/bin/perl
    use Algorithm::NaiveBayes;
    use Data::Dumper;

    $| = 1;
    $Data::Dumper::Deepcopy = 1;
    $Data::Dumper::Sortkeys = 1;

    my %training_files = (
        positive => q{/Users/Agnes/Documents/positive.txt},
        negative => q{/Users/Agnes/Documents/negative.txt},
        neutral  => q{/Users/Agnes/Documents/neutral.txt},
    );
    my @test_files = (
        q{/Users/Agnes/Documents/process_sentence.txt},
        q{/Users/Agnes/Documents/test.txt},
    );

    my $nb = Algorithm::NaiveBayes->new( purge => 0, );

    foreach my $k ( keys %training_files ) {
        local $/;
        open my $inf, q{<}, $training_files{$k} or die $!;
        my $line = <$inf>;
        close $inf;
        $nb->add_instance(
            attributes => str_to_array( $line ),
            label      => [ $word ],
        );
    }
    $nb->train;

    foreach my $tf ( @test_files ) {
        local $/;
        open my $inf, q{<}, $tf or die $!;
        my $line = <$inf>;
        close $inf;
        my $result = $nb->predict( attributes => str_to_array( $line ), );
        print qq{Prediction: $result - $tf\n};
    }

    sub str_to_array {
        my ($str) = @_;
        my %attr;
        foreach my $word ( split /\s|[\(\)!?.,:;]/, $str ) {
            $attr{$word}++;
        }
        return \%attr;
    }
    Is there any problem with it?
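Two things in the script above look like likely causes. First, predict() returns a hash reference mapping each label to a score, so interpolating $result into a string prints HASH(0x...) rather than a label. Second, label => [ $word ] refers to a $word that is never set inside the training loop (use strict would flag this), so $k was probably intended. A minimal sketch of picking the highest-scoring label follows; best_label() is a hypothetical helper, not part of Algorithm::NaiveBayes, and the scores are made up to stand in for a real predict() result.

```perl
use strict;
use warnings;

# Hypothetical helper: return the label with the highest score from a
# hash reference shaped like { label => score, ... }.
# Ties are broken alphabetically so the result is deterministic.
sub best_label {
    my ($scores) = @_;
    my ($best) = sort { $scores->{$b} <=> $scores->{$a} || $a cmp $b }
                 keys %{$scores};
    return $best;
}

# Made-up scores standing in for the return value of $nb->predict(...).
my $result = { positive => 0.71, negative => 0.22, neutral => 0.07 };

my $label = best_label($result);
printf "Prediction: %s (score %.2f)\n", $label, $result->{$label};
```

In the script, the same idea would replace the print line inside the test loop, with $result coming from $nb->predict instead of the made-up hash.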