Word Frequency counter

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Word Frequency counter by GrandFather (Saint) on Oct 02, 2008 at 20:30 UTC
For a start, `while (@thisfile) {` doesn't do what you think. In particular, it is not `for (@thisfile) {`.

Next: `$seen{b} <=> $seen{a}` compares the values for the literal keys 'a' and 'b', not for the contents of the two sort variables (`$a` and `$b`) as you are hoping.

Cleaning those problems up, fixing a few other style issues and providing some sample data gives:

    use strict;
    use warnings;

    my $fileData = <<DATA;
    Greetings,
    I've been trying to put a word frequency counter together but I keep getting an unitialised value in pattern match. I'd be grateful if somebody could let me know what I'm dong wrong.
    I added a line to get some repeated words.
    DATA

    my %seen;

    open my $inFile, '<', \$fileData;
    for (grep {chomp; length} <$inFile>) {
        $seen{lc $1}++ while /(\w['\w-]*)/g;
    }
    close ($inFile);

    printf "%5d %s\n", $seen{$_}, $_ for sort { $seen{$b} <=> $seen{$a} } keys %seen;

Prints:

        2 a
        2 to
        2 i
        1 i've
        1 know
        1 put
        1 if
        1 unitialised
        1 greetings
        1 i'd
        1 frequency
        1 wrong
        1 let
        1 could
        1 in
        1 keep
        1 line
        1 repeated
        1 trying
        1 what
        1 value
        1 me
        1 match
        1 grateful
        1 i'm
        1 word
        1 be
        1 some
        1 somebody
        1 but
        1 added
        1 words
        1 dong
        1 been
        1 get
        1 together
        1 getting
        1 pattern
        1 counter
        1 an

Perl reduces RSI - it saves typing
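The pattern `/(\w['\w-]*)/g` in the code above treats a word as one word character followed by any run of word characters, apostrophes or hyphens. Here is a small sketch of what that means in practice. The `words_in` helper and the sample string are made up for illustration and are not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lowercased words found in a string,
# using the same pattern as the code above.
sub words_in {
    my ($text) = @_;
    my @words;
    # Each successful /g match in scalar context advances through $text
    # and leaves the matched word in $1.
    push @words, lc $1 while $text =~ /(\w['\w-]*)/g;
    return @words;
}

print join('|', words_in("I've got a well-known 'quote'.")), "\n";
# prints: i've|got|a|well-known|quote'
```

Note that a leading apostrophe is skipped because a word must start with `\w`, but a trailing one is kept, so `quote'` and `quote` would be counted as different words. Whether that matters depends on the input text.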
Re^2: Word Frequency counter by Anonymous Monk on Oct 03, 2008 at 07:53 UTC
Thanks for all the above replies, which explain where I've gone wrong and why, and especially to GrandFather for showing a different and less verbose way of getting the task working.
Re: Word Frequency counter by Fletch (Bishop) on Oct 02, 2008 at 20:16 UTC
You've used literal barewords "a" and "b" as hash keys in your sort comparator where you wanted to use the variables `$a` and `$b`.

The cake is a lie. The cake is a lie. The cake is a lie.
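A minimal sketch of the difference, using a made-up `%seen` hash (the data and the sub names `sort_broken` and `sort_fixed` are illustrative, not from the original post). With the bareword keys, every comparison looks up the nonexistent entries `$seen{a}` and `$seen{b}`, so the result carries no information about the words being sorted.

```perl
use strict;
use warnings;

# Hypothetical counts, for illustration only.
my %seen = (the => 5, cat => 2, sat => 9);

# Broken: {a} and {b} are auto-quoted to the string keys 'a' and 'b',
# which do not exist here, so both sides are undef in every comparison.
sub sort_broken {
    no warnings 'uninitialized';    # silence the warnings the bug triggers
    return sort { $seen{b} <=> $seen{a} } keys %seen;
}

# Fixed: $a and $b hold the two keys that sort is currently comparing.
sub sort_fixed {
    return sort { $seen{$b} <=> $seen{$a} } keys %seen;
}

print join(' ', sort_fixed()), "\n";    # prints: sat the cat
```

The broken version still returns all three keys, just in whatever order `keys` happened to produce them, which is why the bug is easy to miss on small inputs.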
Re: Word Frequency counter by toolic (Bishop) on Oct 02, 2008 at 20:23 UTC
Probably unrelated to your problem, but your outer while loop will be infinite if your `@thisfile` array has any contents. It would be better to use a for loop instead: `for (@thisfile) {`
Re^2: Word Frequency counter by DrWhy (Chaplain) on Oct 02, 2008 at 20:32 UTC
Actually, this is related. Anonymous' regex is trying to match against `$_`, but a `while (@thisfile)` loop (like everything else in this code) doesn't set `$_`, so it's undefined. If you change the outer `while` to a `for`, then you have something that sets `$_`.

Update: Complete gibberish fixed. Now it says what I meant to say.

--DrWhy
"If God had meant for us to think for ourselves he would have given us brains. Oh, wait..."
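To make the point concrete, here is a small sketch with a made-up `@thisfile` (the contents are assumed for illustration). `while (@thisfile)` only tests whether the array is non-empty. It neither sets `$_` nor removes elements, so it loops forever unless the body shortens the array itself. `for (@thisfile)` aliases `$_` to each element in turn.

```perl
use strict;
use warnings;

# Hypothetical file contents, for illustration only.
my @thisfile = ("the cat\n", "the hat\n");

# for aliases $_ to each element, so the bare pattern match has a target.
my %seen;
for (@thisfile) {
    $seen{lc $1}++ while /(\w+)/g;
}

# A while loop can be made to work, but only by removing elements explicitly
# and naming the variable to match against.
my %seen_while;
my @copy = @thisfile;
while (@copy) {
    my $line = shift @copy;    # without this shift the loop never ends
    $seen_while{lc $1}++ while $line =~ /(\w+)/g;
}

print "$_: $seen{$_}\n" for sort keys %seen;
```

Both loops build the same counts here. The `for` version is shorter and leaves the array intact, which is why it is the usual choice.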
Re: Word Frequency counter by apl (Monsignor) on Oct 02, 2008 at 20:34 UTC
... and, as always, you should `use strict; use warnings;`
Re: Word Frequency counter by planetscape (Chancellor) on Oct 03, 2008 at 12:10 UTC
Depending on your needs, you might also find modules such as Ted Pedersen's Ngram Statistics Package useful.

HTH,
planetscape
Re: Word Frequency counter by Lawliet (Curate) on Oct 02, 2008 at 20:21 UTC
Update: ~This is not the reply you are looking for~

My original reply was removed due to embarrassment. (Read OP too quickly, whoops.)

I'm so adjective, I verb nouns!
chomp; # nom nom nom