use strict;
use warnings;

my %freqs;
my $story;    # Current story name

while (<DATA>) {
    if (/^\[(.*)\]\s*$/) {
        $story = ucfirst $1;    # Capitalize the first letter of the title
        # Story titles are stored as keys under $freqs{all}
        die "Duplicate story title: $story" if exists $freqs{all}{$story};
        next;
    }

    next unless defined $story;    # Wait until we have a story title

    for my $word (/\w+/g) {
        # Current story counts
        $word = ucfirst $word;
        $freqs{$word}{$story}++;
        $freqs{all}{$story}++;

        # Total counts
        $freqs{$word}{total}++;
        $freqs{all}{total}++;
    }
}

# Print title line
print "\t", (join "\t", sort keys %{$freqs{all}}), "\n";

# Print table
for my $word (sort keys %freqs) {
    # Fill in zeros for stories where the word never appears
    $freqs{$word}{$_} ||= 0 for keys %{$freqs{all}};
    print "$word\t";
    print join "\t", map $freqs{$word}{$_}, sort keys %{$freqs{all}};
    print "\n";
}

__DATA__
[para one]
Hello everyone! Just to make everything clear.
I need this for a project were I am being graded. However, the project is not about Perl.
It is about doing statistics with unstructured data, ie text.
I can do this on excel manually, but I think it would be nice to have a code that will generalyze my analysis and algorithm to any collection of texts.
The final goal is to create a process to classify text and information, with no human interaction.
That is what I being graded on and I do not need help on that.
Just perl... counting words in text and stuff like that.
Here is my question
Given this code
[para two]
this code takes a multiple pieces of text and creates a table with the words that appear on each story as rows and story titles as columns.
Then each "cell" counts the number of times each word appears on each story
So for example
Story One: Perl is great
Story Two: Perl is free perl
Story three: Will I learn perl?
will return:

####          Para one  Para two  total
A             3         2         5
About         2         0         2
Algorithm     1         0         1
Am            1         0         1
Analysis      1         0         1
And           4         2         6
Any           1         0         1
Appear        0         1         1
Appears       0         1         1
As            0         2         2
Be            1         0         1
Being         2         0         2
But           1         0         1
Can           1         0         1
Cell          0         1         1
Classify      1         0         1
Clear         1         0         1
Code          2         1         3
Collection    1         0         1
Columns       0         1         1
Counting      1         0         1
Counts        0         1         1
Create        1         0         1
Creates       0         1         1
Data          1         0         1
Do            2         0         2
Doing         1         0         1
Each          0         4         4
Everyone      1         0         1
Everything    1         0         1
Example       0         1         1
Excel         1         0         1
Final         1         0         1
For           1         1         2
Free          0         1         1
Generalyze    1         0         1
Given         1         0         1
Goal          1         0         1
Graded        2         0         2
Great         0         1         1
Have          1         0         1
Hello         1         0         1
Help          1         0         1
Here          1         0         1
However       1         0         1
Human         1         0         1
I             6         1         7
Ie            1         0         1
In            1         0         1
Information   1         0         1
Interaction   1         0         1
Is            5         2         7
It            2         0         2
Just          2         0         2
Learn         0         1         1
Like          1         0         1
Make          1         0         1
Manually      1         0         1
Multiple      0         1         1
My            2         0         2
Need          2         0         2
Nice          1         0         1
No            1         0         1
Not           2         0         2
Number        0         1         1
Of            1         2         3
On            3         2         5
One           0         1         1
Perl          2         4         6
Pieces        0         1         1
Process       1         0         1
Project       2         0         2
Question      1         0         1
Return        0         1         1
Rows          0         1         1
So            0         1         1
Statistics    1         0         1
Story         0         6         6
Stuff         1         0         1
Table         0         1         1
Takes         0         1         1
Text          3         1         4
Texts         1         0         1
That          4         1         5
The           2         2         4
Then          0         1         1
Think         1         0         1
This          3         1         4
Three         0         1         1
Times         0         1         1
Titles        0         1         1
To            5         0         5
Two           0         1         1
Unstructured  1         0         1
Were          1         0         1
What          1         0         1
Will          1         2         3
With          2         1         3
Word          0         1         1
Words         1         1         2
Would         1         0         1
all           114       63        177
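
To check the layout on something smaller, here is a minimal sketch that runs the same counting idea on the three short example stories mentioned in the text. The `count_words` helper and the `%stories` hash are hypothetical names introduced only for this illustration. They are not part of the script above, which reads its input from `__DATA__` instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): takes title => text
# pairs and returns a hash reference of word => { story => count }.
sub count_words {
    my (%stories) = @_;
    my %freqs;
    for my $story (keys %stories) {
        for my $word ($stories{$story} =~ /\w+/g) {
            my $key = ucfirst $word;    # "perl" and "Perl" share a row
            $freqs{$key}{$story}++;     # per-story count
            $freqs{$key}{total}++;      # count across all stories
        }
    }
    return \%freqs;
}

# The three example stories from the text, keyed by an assumed title.
my %stories = (
    'Story one'   => 'Perl is great',
    'Story two'   => 'Perl is free perl',
    'Story three' => 'Will I learn perl?',
);

my $freqs = count_words(%stories);

# One column per story (sorted by title), plus the total column.
my @columns = ((sort keys %stories), 'total');

print join("\t", '', @columns), "\n";
for my $word (sort keys %$freqs) {
    # A word missing from a story has no entry, so print 0 for it.
    print join("\t", $word, map { $freqs->{$word}{$_} || 0 } @columns), "\n";
}
```

With this input the `Perl` row has a total of 4, because `ucfirst` folds the lowercase `perl` into the same key. Passing the stories in as a hash rather than reading `__DATA__` is only one possible design. It makes the counting step easy to reuse once the texts come from files or another source.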