Counting using transliteration

basm100 has asked for the wisdom of the Perl Monks concerning the following question:

Re: Counting using transliteration by jmcnamara (Monsignor) on Feb 25, 2002 at 11:34 UTC
You should do something like this: `$Acount = ($line =~ tr/A/A/);` However, if you want to count every character on a line, you could do something like this: `$chars{$_}++ for split //, $line;` Or, if you want to filter out a range of characters later: `$chars[ord $_]++ for split //, $line;` Set the scope of `%chars` or `@chars` as appropriate. -- John.
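As a rough illustration of the three idioms above, here is a minimal sketch. The sample line is borrowed from later in this thread, and the variable names simply follow the snippets; nothing here is meant as the one right way to do it.

```perl
use strict;
use warnings;

# Sample line (an assumption for illustration; taken from later in the thread).
my $line = '---ADGRCMNAAAPPRS';

# tr/// returns the number of characters it matched. Replacing A with A
# leaves $line unchanged, so this only counts.
my $Acount = ($line =~ tr/A/A/);

# One hash entry per distinct character.
my %chars;
$chars{$_}++ for split //, $line;

# Array variant, indexed by ordinal value, which makes it easy to pick out
# a range of characters afterwards.
my @chars;
$chars[ord $_]++ for split //, $line;

print "A via tr: $Acount\n";
print "$_: $chars{$_}\n" for sort keys %chars;
```

The hash is usually the more convenient of the two for printing, while the array suits range queries such as "all upper-case letters".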
Re: Re: Counting using transliteration by demerphq (Chancellor) on Feb 25, 2002 at 12:05 UTC
Or moderately more efficiently: `$chars[$_]++ for unpack "C*", $line;` or `$chars{$_}++ for unpack "C*", $line;` (the `*` is needed so that unpack covers the whole string rather than just its first character). Of course, unpack returns the ordinal value, not the character. Yves / DeMerphq -- When to use Prototypes?
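A small sketch of the unpack approach, assuming plain single-byte (ASCII) input. The sample string and variable names are made up for illustration. Note that the template needs a repeat count (`C*`) to cover the whole string, and that `chr` converts the ordinals back if you want characters as hash keys.

```perl
use strict;
use warnings;

my $line = 'AABA-';    # hypothetical sample data

# "C*" unpacks every byte as an unsigned value; a bare "C" would return
# only the first one.
my @ordinals   = unpack 'C*', $line;
my @first_only = unpack 'C',  $line;

# Counts indexed by ordinal value.
my @by_ord;
$by_ord[$_]++ for @ordinals;

# Counts keyed by the character itself, converting back with chr().
my %by_char;
$by_char{ chr $_ }++ for @ordinals;

print "$_: $by_char{$_}\n" for sort keys %by_char;
```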
Re: Counting using transliteration by Ido (Hermit) on Feb 25, 2002 at 12:16 UTC
`tr/A/A/` and `tr/A//` are actually the same. The problem with your code is that when you use `while (<FH>) { ... }`, the next line is automagically assigned to `$_`, so you don't need `$line = <MYFILE>`, which then grabs the second (fourth, sixth, ...) line instead. However, if you want to count every character, you should use one of the approaches demerphq and jmcnamara showed above, and not `tr///`.
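To check the first claim, here is a quick sketch with a made-up string. With an empty replacement list and no `/d` modifier, `tr` reuses the search list, so both forms count without changing the string.

```perl
use strict;
use warnings;

my $line = 'BANANA';    # hypothetical sample data

my $with_replacement    = ($line =~ tr/A/A/);
my $without_replacement = ($line =~ tr/A//);

# Both counts are 3, and $line is still 'BANANA' afterwards.
print "$with_replacement $without_replacement $line\n";
```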
Re: Counting using transliteration by Dog and Pony (Priest) on Feb 25, 2002 at 12:17 UTC
First off, are you sure that the line has any A's in it? Because you have an error in the way you read your lines. Try adding a `print "$line\n";` statement inside your loop, and open a file with several lines. It will only print every second line, namely lines 2, 4, 6, etc. This is because `while(<MYFILE>)` assigns a line to `$_`, and then you assign the next line to `$line`. Try adding the line `print "$_\n";` before the other print statement and run it again: now it will print all the lines. The solution is of course either to write the loop like this: `while( my $line = <MYFILE> ) { ... }` or to use the special variable `$_` instead and remove the assignment to `$line`, like so: `$Acount = tr/A//;` Hope this helps. You have moved into a dark place. It is pitch black. You are likely to be eaten by a grue.
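The skipped-line effect can be reproduced without a real file. The sketch below reads from an in-memory filehandle (the four sample lines are invented) and counts A's per line, first with the double read and then with the corrected loop. The original poster's code is not shown in the thread, so the "buggy" loop is only a guess at its shape.

```perl
use strict;
use warnings;

my $data = "AAB\nCAD\nEFA\nAAA\n";    # hypothetical file contents

# Buggy shape: while(<$fh>) reads a line into $_, then the body reads
# another one, so only lines 2, 4, ... are ever counted.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @buggy;
while (<$fh>) {
    my $line = <$fh>;
    last unless defined $line;
    push @buggy, scalar($line =~ tr/A//);
}
close $fh;

# Fixed shape: read each line exactly once.
open $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @fixed;
while (my $line = <$fh>) {
    push @fixed, scalar($line =~ tr/A//);
}
close $fh;

print "buggy: @buggy\n";    # counts for lines 2 and 4 only
print "fixed: @fixed\n";    # counts for all four lines
```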
Re: Re: Counting using transliteration by basm100 (Novice) on Feb 25, 2002 at 14:06 UTC
Thanks for the help, guys. I don't think I explained properly what I want to do. I want to count the number of each type of character in a line, and then (eventually) find a way to print the most common character. This should generate a consensus sequence for my multiple alignments (I'm a bioinformatician). So, for example, say the line is `---ADGRCMNAAAPPRS`. I want the program to count the different characters: `$Acount`, `$Dcount`, `$gapcount` (for `-`), etc., and store the most common letter in my consensus sequence array. Is this possible? Then the program should move on to the next line of the file and repeat the process. Thanks for pointing out that I don't need `$line`; I realise now that the line is stored in `$_`. basm100
Re: Re: Re: Counting using transliteration by zengargoyle (Deacon) on Feb 26, 2002 at 01:33 UTC
`while ($l = <DATA>) { %count = (); $count{$1}++ while ($l =~ /(.)/g); $max = 0; $max_key = undef; foreach (keys %count) { $max = $count{$_}, $max_key = $_ if ($count{$_} > $max); } push @consensus, $max_key; } print "@consensus\n";` With the three lines `---ADGRCMNAAAPPRS`, `ASAMNP-PRCM--MWQZ` and `TYYCYLCKLKDUEAOLE` placed after `__END__`, running `perl cons.pl` prints `A - Y`.
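One caveat with picking the maximum while iterating over `keys %count`: when two characters tie, the winner depends on hash ordering, which Perl does not guarantee. In the sample data above, the second line has `M` and `-` tied at three each, and the third has `Y` and `L` tied at three each. The following is a sketch of a deterministic variant under `use strict`. The helper name `most_common_char` and the "earliest character wins" tie rule are assumptions made for illustration, not something the original poster asked for.

```perl
use strict;
use warnings;

# Hypothetical helper: return the most frequent character in a line.
# Ties go to the character that appears first in the line (an assumed rule).
sub most_common_char {
    my ($line) = @_;
    chomp $line;

    my (%count, %first_pos);
    my $pos = 0;
    for my $ch (split //, $line) {
        $count{$ch}++;
        $first_pos{$ch} = $pos unless exists $first_pos{$ch};
        $pos++;
    }

    # Highest count first; among equal counts, earliest position first.
    my ($best) = sort {
        $count{$b} <=> $count{$a} || $first_pos{$a} <=> $first_pos{$b}
    } keys %count;

    return $best;    # undef for an empty line
}

my @consensus = map { most_common_char($_) }
    ('---ADGRCMNAAAPPRS', 'ASAMNP-PRCM--MWQZ', 'TYYCYLCKLKDUEAOLE');

print "@consensus\n";
```

With this tie rule the three sample lines give `A M Y`. A different rule (for example, never letting the gap character win) would only change the sort comparison.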
Re: Counting using transliteration by trs80 (Priest) on Feb 25, 2002 at 18:12 UTC
This will give you counts for all the characters, including newlines, spaces, etc. `use strict; use Data::Dumper; my %character_hash = (); while (<DATA>) { $character_hash{$_}++ for split //, $_; } print Dumper(\%character_hash); __DATA__ AFJIDIKSOIJFKDFS AKDFJIJDFJSF QUEWITYYUERYUIYE ERUOIERTOUTUT`