Re: Reporting entries in a file

If I understand correctly, you want to count the number of occurrences of every distinct "word" in your file.

To do so, extract the "words" from the file and use a hash (not an array) to count the occurrences. At the beginning the hash is empty; for every word you extract, check whether defined($hash{$word}): if it is defined, increment the value; otherwise set the value to one.
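
The steps above can be sketched roughly as follows. This is only an illustration: `count_words` is a made-up helper name, the sample lines are invented, and splitting on whitespace is an assumption about what counts as a "word".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_words is a hypothetical helper, not part of the original post.
# It takes a list of lines and returns a hash reference: word => count.
sub count_words {
    my @lines = @_;
    my %count;                              # the hash starts out empty
    for my $line (@lines) {
        # split ' ' splits on runs of whitespace and ignores leading blanks
        for my $word (split ' ', $line) {
            if (defined $count{$word}) {
                $count{$word}++;            # seen before: increment
            } else {
                $count{$word} = 1;          # first sighting: set to one
            }
        }
    }
    return \%count;
}

# Invented sample input standing in for the lines of the file.
my $count = count_words("apple pear\n", "pear\n", "apple apple\n");
print "$_: $count->{$_}\n" for sort keys %$count;
# prints:
# apple: 3
# pear: 2
```

In Perl the explicit defined() test can be dropped: `$count{$word}++` on a key that does not exist yet sets it to 1 without a warning, so the whole if/else collapses to that one statement.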

Careful with that hash Eugene.


Re^2: Reporting entries in a file by SkullOne (Acolyte) on May 30, 2008 at 22:11 UTC
Thank you, a hash is exactly what I needed, not an array. I saw an example online, and modified it a tad:

```perl
#!/usr/bin/perl
while (<>) {
    @words = split(/\n+/);
    foreach $word (@words) {
        $count{$word}++;
    }
}
foreach $word (sort by_count keys %count) {
    print "$word \: $count{$word}\n";
}
sub by_count {
    $count{$b} <=> $count{$a};
}
```
Re^3: Reporting entries in a file by alexm (Chaplain) on May 30, 2008 at 22:26 UTC
`@words = split(/\n+/);` I guess you meant: `@words = split(/\s+/);`
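
To make the difference between the two patterns concrete, here is a small sketch on an invented sample line. With one word per line, both patterns give the same result, which may be why the `\n+` version appeared to work on a one-column file.

```perl
use strict;
use warnings;

my $line = "foo bar\n";                 # made-up line with two words

my @by_newline = split /\n+/, $line;    # only the trailing newline is a separator
my @by_space   = split /\s+/, $line;    # any run of whitespace is a separator

print scalar(@by_newline), "\n";        # prints 1  ("foo bar")
print scalar(@by_space), "\n";          # prints 2  ("foo", "bar")
```

One caveat: if a line starts with whitespace, `split /\s+/` returns an empty leading field, whereas `split ' '` skips it.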
Re^3: Reporting entries in a file by SkullOne (Acolyte) on May 30, 2008 at 22:13 UTC
Correction: the hash would've worked fine, but I ended up using an array again.
Re^2: Reporting entries in a file by rovf (Priest) on Jun 02, 2008 at 09:00 UTC
From what SkullOne said ("It's just a one-column file"), s/he had only one word per line, so the task is even easier: we only need to count the distinct lines, maybe after stripping leading and trailing spaces. The concept of using a hash would be the same, but we don't need to split the line into words. A solution that uses arrays instead of hashes, and is also easy to code provided that we don't need to strip the spaces, would be to slurp the whole file into an array, sort the array, and then, in a loop through the array, count the number of consecutive equal entries. This would give us an alphabetically sorted list of the words with their associated counts. -- Ronald Fischer <ynnor@mm.st>
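
A minimal sketch of the sort-and-count-runs idea, for illustration only. `count_runs` is a hypothetical helper name, and the sample list stands in for the lines slurped from the file.

```perl
use strict;
use warnings;

# count_runs is a hypothetical helper. It takes the entries of a
# one-column file and returns [word, count] pairs in sorted order.
sub count_runs {
    my @sorted = sort @_;                  # equal entries become neighbours
    my @result;
    for my $entry (@sorted) {
        if (@result && $result[-1][0] eq $entry) {
            $result[-1][1]++;              # same as previous entry: extend the run
        } else {
            push @result, [$entry, 1];     # new entry: start a run of length one
        }
    }
    return @result;
}

# Invented sample data; in practice this would be something like <$fh>.
my @lines = ("pear\n", "apple\n", "pear\n", "fig\n");
chomp @lines;                              # drop the trailing newlines
print "$_->[0]: $_->[1]\n" for count_runs(@lines);
# prints:
# apple: 1
# fig: 1
# pear: 2
```

The default `sort` compares entries as strings, so the output is in string order; getting the list ordered by count, as in the hash version earlier in the thread, would need a second sort on the counts.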