di has asked for the wisdom of the Perl Monks concerning the following question:
Deep in the throes of mystical experience, I am come to the monastery to share my wonder. The following script counts the occurrences of distinct words in a list. (The list of thirty words here found has been pared down from a much larger text normally input from a file, but the effect to be described occurs the same on either form of input.) The mystery resides first in the word "adjuster", which is counted from the list as two separate words with a frequency of 2 and 1. Delete only the first word from the list and lo "adjuster" appears as one with a frequency of 3. Delete only the second word from the list and the same occurs. But delete the third or fourth word and again "adjuster" appears as two! And yet again, delete together words three through nine and "adjuster" metapmorphoses back to one being three in number! How much deeper this may go I do not know but present this mystery in confidence of forthcoming enlightenment.
use strict; use warnings; #open IN, "<list.txt"; #open OUT, ">count.txt"; my @words = qw(this and led cain had been indwelt by an adjuster had a +lways been disdainful he and when he had sought divine assistance an +adjuster indwelt and this adjuster gave cain); my @counted =(); while (@words) { my $key = shift @words; my $count = 1; foreach my $y (0..$#words) { if (not $words[$y]) {next} # Avoids absent members resultant +from splice. if ($words[$y] eq $key) { $count++; splice (@words, $y, 1); } } push @counted, "$key\t$count\n"; } print sort @counted;
Pax vobiscum - and thanks!
Update: ikegami's alternative approach provided amazing benefit. A run that was taking over twenty minutes was reduced to under twenty seconds! And, not unimportantly, it produced accurate output! Many thanks.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: Identical list members are sometimes counted as different when compared to each other.
by ikegami (Patriarch) on Jan 02, 2010 at 17:12 UTC | |
by di (Acolyte) on Jan 03, 2010 at 14:19 UTC | |
|
Re: Identical list members are sometimes counted as different when compared to each other.
by ikegami (Patriarch) on Jan 02, 2010 at 17:01 UTC |