counting occurances

imlou has asked for the wisdom of the Perl Monks concerning the following question:

Re: counting occurances by davido (Cardinal) on Sep 19, 2003 at 19:17 UTC
Try this if all you want is a combined total of all the things you're searching for. (A much more flexible solution is posted in a subsequent followup.)

`my $count;
$count += tr/FIL/FIL/ while <DATA>;
print "F, I, and L occurred $count times.\n";
__DATA__
FDIELSIGCOXLSAGICK\n`

Update: This approach will give you a combined total count for all the items you're counting. If you want an individual count for each thing you're counting, with the ability to easily add or subtract items from the count list, see my followup post. Therein you will find two examples that both work well. The first (which I prefer) uses `tr///`, and the second uses `index`. I think you'll find those solutions to be much closer to what you need.

Hope this helps!

Dave

"If I had my life to do over again, I'd be a plumber." -- Albert Einstein
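As a minimal sketch of the combined count described above, the same `tr` idiom can be applied to a string directly rather than to `$_`. The helper name `count_fil` is hypothetical and not from the post; the only Perl behaviour relied on is that `tr` with an empty replacement list (and no `/d`) leaves the string unchanged and returns the number of matching characters.

```perl
use strict;
use warnings;

# Hypothetical helper: combined count of F, I and L in one string.
# With an empty replacement list and no /d, tr/// does not modify the
# string; it only returns how many characters matched the search list.
sub count_fil {
    my ($text) = @_;
    return $text =~ tr/FIL//;
}

my $sample = "FDIELSIGCOXLSAGICK";
print "F, I, and L occurred ", count_fil($sample), " times.\n";
```

Because the search list is fixed in the source, no `eval` is needed here; that only becomes an issue once the characters to count live in a variable.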
Re: Re: counting occurances by imlou (Sexton) on Sep 19, 2003 at 19:29 UTC
Would that give me the occurrence of each of them individually, or only when they are together? I also have to remove the first line in the file because it's header information and I don't need it. I have this so far, but it doesn't seem to work.

`while(<FILE>){
    (s!>.*(\n)!!);
    my @words = (A, C, G, T); #letters to search
    my @file = split (/\w/g, <FILE>); #split each letter into an array
    foreach $file (@file){
        foreach $words (@words){
            if ($file eq $words){
                $word_list{words}++;
            }
        }
    }
    @pairs = sort {$a->[1] <=> $b->[1]} map {[ $_ => $word_list{$_}]} keys %word_list;
    print "word $_ ->[0] = $_->[1]\n" for @pairs;`

It's been a while since I've used Perl, so I'm very, very rusty. Thanks
Re: Re: Re: counting occurances by davido (Cardinal) on Sep 19, 2003 at 20:03 UTC
If you want to count the occurrences of each item independently, and you want to strip the first line off as a header line, and you want the list of items that you're counting to be easily adjustable, this will do the trick.

`my $header_line = <DATA>;
my %count;
my @chars = ( qw/F G S/ );
while (my $line = <DATA> ) {
    eval "\$count{$_} += \$line =~ tr/$_/$_/,1" or die $@ foreach @chars;
}
print "There are $count{$_} occurrences of $_\n" foreach sort keys %count;
__DATA__
Sample header line
FDIELSIGCOXLSAGICK\n
FDIELSIGCOXLSAGICK\n`

The reason that the tr/// must appear inside a string eval is that variables are not interpolated in tr/// (the transliteration table is built at compile time, not run time). Eval forces a fresh compilation of tr/// each time through the loop. The reason that I escape the `$` of `$count` and `$line` with backslashes is that I want those variables to exist as variables inside the eval, not as values (except in the case of what's inside the tr/// itself, where `$_` is meant to be interpolated). And the '1' appears at the end of the eval expression so that eval returns a true value, and the `or die` is not triggered, even if no matches are found.

I think this is an elegant solution, and saves a lot of intricate fiddling.

If you want to see a solution that uses `index` instead of `tr///`, you may...

Dave

"If I had my life to do over again, I'd be a plumber." -- Albert Einstein
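Since tr/// fixes its table at compile time, one possible variation (not the code posted above) is to run the string eval once per character and keep the compiled counter as a closure, instead of re-evaluating for every line. The following is only a sketch under that assumption: the name `make_counters` is hypothetical, and the check that restricts input to single letters or digits is an added precaution so that nothing unexpected is ever passed to `eval`.

```perl
use strict;
use warnings;

# Hypothetical helper: build one counting closure per character.
# The string eval runs once per character, not once per input line.
sub make_counters {
    my @chars = @_;
    my %counter;
    for my $c (@chars) {
        # Precaution (an assumption of this sketch): accept only a single
        # letter or digit, so the eval'd source is always predictable.
        die "unsupported character '$c'" unless $c =~ /\A[A-Za-z0-9]\z/;
        # The escaped \$_[0] stays a variable inside the eval;
        # $c is interpolated to become the tr/// search list.
        $counter{$c} = eval "sub { \$_[0] =~ tr/$c// }" or die $@;
    }
    return %counter;
}

my %counter = make_counters(qw(F G S));
my @lines   = ("FDIELSIGCOXLSAGICK", "FDIELSIGCOXLSAGICK");
my %count;
for my $line (@lines) {
    $count{$_} += $counter{$_}->($line) for keys %counter;
}
print "There are $count{$_} occurrences of $_\n" for sort keys %count;
```

The trade-off is a little more setup in exchange for compiling each tr/// only once; for a short list of characters and a small file the difference is unlikely to matter.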
Re: Re: Re: counting occurances by shenme (Priest) on Sep 19, 2003 at 20:16 UTC
Yet another version, trying for speed like davido's using `tr`, but without losing that speed to `eval`s just to force `tr` to accept the variables. If this still doesn't go fast enough, you'll have to unroll it into 4 separate m/// statements.

`my $temp = <DATA>;
print "Discarding the file header line.\n";
print " ", $temp;
my @chars = qw( A C G T );
my %cnts;
while( <DATA> ) {
    foreach my $char (@chars) {
        # my $cnt = () = m/$char/g;
        # $cnts{$char} += $cnt;
        # printf " For char '%s' I found %d\n", $char, $cnt;
        $cnts{$char} += () = m/$char/g;
    }
}
foreach my $char (@chars) {
    printf " Char '%s': %6d\n", $char, $cnts{$char};
}
__DATA__
Generated by a completely confused program yesterday
ACGTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC
ACCTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC
ACGTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC
AGGTGAGTAGAGGGGGGGGGAAAAAAAAAAGGGGGGG
ACGTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC`

Outputs:

`Discarding the file header line.
 Generated by a completely confused program yesterday
 Char 'A':     70
 Char 'C':     49
 Char 'G':     56
 Char 'T':     10`
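The `() = m/$char/g` construct in the loop above may look odd on first reading, so here it is in isolation as a small sketch. A list assignment evaluated in scalar context yields the number of elements on its right-hand side, which for a `/g` match in list context is the number of matches. The helper name `count_matches` and the sample string are illustrative only; `\Q...\E` is added here so the character is always matched literally.

```perl
use strict;
use warnings;

# Hypothetical helper: count how often one character occurs in a string.
sub count_matches {
    my ($text, $char) = @_;
    # The match runs in list context (because of the empty list), and the
    # list assignment itself is evaluated in scalar context, giving the count.
    my $n = () = $text =~ /\Q$char\E/g;
    return $n;
}

my $sample = "ACGTGACTAGAGG";
printf " Char '%s': %6d\n", $_, count_matches($sample, $_) for qw(A C G T);
```

Unlike tr///, a regex match does interpolate variables, which is why this version needs no `eval`.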
Re: Re: Re: Re: counting occurances by shenme (Priest) on Sep 19, 2003 at 21:40 UTC
Re: Re: Re: counting occurances by qmole (Beadle) on Sep 19, 2003 at 20:03 UTC
Hi there! Just a few mistakes:

`<FILE>; # ignore first line
my @words = (A, C, G, T); #letters to search
my @file = do {local $/; split (//, <FILE>); }; #split each letter into an array
foreach $file (@file){
    foreach $words (@words){
        if ($file eq $words){
            $word_list{$file}++;
        }
    }
}
@pairs = sort {$a->[1] <=> $b->[1]} map {[ $_ => $word_list{$_}]} keys %word_list;
print "word $_->[0] = $_->[1]\n" for @pairs;`

You were only `split`ting on the first line of the file, and there was a typo on the last line. The code above will hopefully work as you intended.
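For comparison, the nested loops above can be collapsed by using the tally hash itself as the lookup table. This is only a sketch under a few assumptions: the helper name `tally_letters` is hypothetical, the input is read line by line from whatever filehandle is passed in, and an in-memory string with made-up sample data stands in for the real file. It also runs under `use strict`, so the letters are quoted with `qw` rather than written as barewords.

```perl
use strict;
use warnings;

# Hypothetical helper: tally the wanted letters read from a filehandle,
# discarding the first line as a header.
sub tally_letters {
    my ($fh, @letters) = @_;
    my %tally = map { $_ => 0 } @letters;      # start every wanted letter at 0
    my $header = <$fh>;                        # read and ignore the header line
    while (my $line = <$fh>) {
        for my $ch (split //, $line) {
            $tally{$ch}++ if exists $tally{$ch};   # count only wanted letters
        }
    }
    return \%tally;
}

# Made-up sample input held in a string so the sketch runs without a real file.
my $text = ">header\nACGT\nAACC\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
my $tally = tally_letters($fh, qw(A C G T));
close $fh;

# Print letters in ascending order of count, ties broken alphabetically.
for my $letter (sort { $tally->{$a} <=> $tally->{$b} || $a cmp $b } keys %$tally) {
    print "word $letter = $tally->{$letter}\n";
}
```

Reading line by line keeps memory use flat for large files, and letters that never occur still show up with a count of 0.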