in reply to Reporting entries in a file

If I understand right, you want to count the number of occurrences of every distinct "word" in your file.

To do so, extract the "words" from the file and use a hash (not an array) to count the occurrences. At the beginning the hash is empty; for every word you extract, check whether defined($hash{$word}): if it is defined, increment the value; otherwise set the value to one.
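
The steps above could be sketched roughly as follows. This is only a minimal illustration: count_words is a made-up helper name, the sample lines stand in for your file, and it assumes that "words" are separated by whitespace. It tests with exists rather than defined, which amounts to the same thing here because every stored value is a number.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_words is a hypothetical helper name, not something from the thread.
# It counts how often each whitespace-separated word occurs in the given lines.
sub count_words {
    my @lines = @_;
    my %count;                        # the hash starts out empty
    for my $line (@lines) {
        for my $word (split ' ', $line) {
            if (exists $count{$word}) {
                $count{$word}++;      # seen before: increment the value
            }
            else {
                $count{$word} = 1;    # first occurrence: set the value to one
            }
        }
    }
    return %count;
}

# Sample data (an assumption); with a real file you would pass <> instead.
my %count = count_words("apple pear\n", "pear\n", "apple apple\n");
for my $word (sort keys %count) {
    print "$word: $count{$word}\n";
}
```

In practice the if/else can be collapsed to a bare $count{$word}++, since Perl treats a missing hash value as zero when incrementing.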

Careful with that hash Eugene.

Re^2: Reporting entries in a file
by SkullOne (Acolyte) on May 30, 2008 at 22:11 UTC
    Thank you, a hash is exactly what I needed, not an array.
    I saw an example online, and modified it a tad:
    #!/usr/bin/perl
    while (<>) {
        @words = split(/\n+/);
        foreach $word (@words) {
            $count{$word}++;
        }
    }
    foreach $word (sort by_count keys %count) {
        print "$word \: $count{$word}\n";
    }
    sub by_count {
        $count{$b} <=> $count{$a};
    }
      @words = split(/\n+/);

      I guess you meant: @words = split(/\s+/);

      Correction: the hash would've worked fine, but I ended up using an array again.
Re^2: Reporting entries in a file
by rovf (Priest) on Jun 02, 2008 at 09:00 UTC

    From what SkullOne said ("It's just a one-column file"), s/he had only one word per line, so the task is even easier (we need only to count the different types of line - maybe after stripping leading and trailing spaces). The concept of using a hash would be the same, but we don't need to split the line into words.
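
    For the one-word-per-line case, a sketch along these lines might do. The helper name count_lines and the sample data are made up for illustration, and skipping blank lines is my own assumption, not something the original poster asked for.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_lines is a hypothetical helper: each line is one entry, no splitting.
sub count_lines {
    my %count;
    for my $line (@_) {
        # Copy the line, then strip leading and trailing whitespace
        # (this also removes the trailing newline).
        (my $entry = $line) =~ s/^\s+|\s+$//g;
        next if $entry eq '';     # assumption: blank lines are ignored
        $count{$entry}++;
    }
    return %count;
}

# Sample data (an assumption); with a real file you would pass <> instead.
my %count = count_lines("  foo\n", "bar \n", "foo\n", "\n");
print "$_: $count{$_}\n" for sort keys %count;
```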

    A solution that makes do with arrays instead of hashes and is also easy to code, provided that we don't need to strip the spaces, would be to slurp the whole file into an array, sort the array, and then (in a loop through the array) count the number of consecutive equal entries. This would give us an alphabetically sorted list of the words with their counts.
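
    A rough sketch of that array-only approach, under the assumption that the lines have already been read and chomped. The name count_runs and the sample data are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_runs is a hypothetical helper: sort the entries, then count
# consecutive equal ones. It returns a list of [entry, count] pairs.
sub count_runs {
    my @sorted = sort @_;                 # default sort is string order
    my @result;
    for my $entry (@sorted) {
        if (@result && $result[-1][0] eq $entry) {
            $result[-1][1]++;             # same as the previous entry: extend the run
        }
        else {
            push @result, [$entry, 1];    # different entry: start a new run
        }
    }
    return @result;
}

# Sample data stands in for the slurped file, e.g. chomp(my @lines = <>);
my @lines = ("pear", "apple", "pear", "apple", "apple");
for my $pair (count_runs(@lines)) {
    print "$pair->[0]: $pair->[1]\n";
}
```

    Sorting costs more than a single pass with a hash, but the output comes out already in alphabetical order.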

    -- 
    Ronald Fischer <ynnor@mm.st>