basm100 has asked for the wisdom of the Perl Monks concerning the following question:

Hi there. I want to count the number of times each character appears in a line. I thought I could do this using transliteration, but it's not working. For example, I loop through the file line by line:

    while(<MYFILE>) {
        $line=<MYFILE>;
        $Acount =($line=~tr/A//); # to count no. of A's in a line
        # if I then print $Acount it says 0 even if there are A's in the line!
    }

Why doesn't this work? I want the number of A's in the line to be stored in $Acount. Thanks for your help, basm100

Replies are listed 'Best First'.
Re: Counting using transliteration
by jmcnamara (Monsignor) on Feb 25, 2002 at 11:34 UTC

    You should do something like this:

        $Acount = $line =~ tr/A/A/;

    However, if you want to count the number of every character on a line you could do something like this:

        $chars{$_}++ for split //, $line;

    Or if you want to filter out a range of characters later:

        $chars[ord $_]++ for split //, $line;

    Set the scope of %chars or @chars as appropriate.
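A minimal sketch of what "filter out a range later" could look like with the @chars array, assuming the interest is in uppercase letters only. The sample line is the one from the follow-up question further down, and $letters is a name made up for this illustration.

```perl
use strict;
use warnings;

my $line = "---ADGRCMNAAAPPRS";

# Index the counts by character code, as in the one-liner above.
my @chars;
$chars[ord $_]++ for split //, $line;

# Sum only the slots for 'A'..'Z'. Slots for characters that never
# appeared are undef, so fall back to 0 to avoid warnings.
my $letters = 0;
$letters += $_ || 0 for @chars[ ord('A') .. ord('Z') ];

print "uppercase letters: $letters\n";
print "gaps: $chars[ord '-']\n";
```

Because the array is indexed by character code, a contiguous range of characters is just an array slice, which is the convenience the array form has over the hash form.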

    --
    John.

      Or moderately more efficiently:

          $chars[$_]++ for unpack "c*",$line;
          $chars{$_}++ for unpack "c*",$line;

      Of course unpack returns the ordinal value, not the character.
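Since the unpack version keys the counts by ordinal value, here is a small sketch of mapping them back to characters with chr. It uses the unsigned "C*" template rather than the "c*" shown above (an assumption made here so that codes above 127 would not come back negative; for plain ASCII input the two behave the same). %by_ord and %by_char are illustrative names.

```perl
use strict;
use warnings;

my $line = "---ADGRCMNAAAPPRS";

# Count by ordinal value: unpack yields one integer per character.
my %by_ord;
$by_ord{$_}++ for unpack "C*", $line;

# Translate the ordinal keys back into characters for reporting.
my %by_char;
$by_char{ chr $_ } = $by_ord{$_} for keys %by_ord;

print "$_ => $by_char{$_}\n" for sort keys %by_char;
```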

      Yves / DeMerphq
      --
      When to use Prototypes?

Re: Counting using transliteration
by Ido (Hermit) on Feb 25, 2002 at 12:16 UTC
    tr/A/A/ and tr/A// are actually the same. The problem with your code is that when you use

        while(<FH>){
            # The next line is in $_
        }

    the next line is automagically assigned to $_, so you don't need $line=<MYFILE>, which then grabs the second (fourth, sixth, ...) line. However, if you want to count every char, you should use one of the ways demerphq and jmcnamara showed above, and not tr///.
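A quick way to check the first claim: with an empty replacement list and no /d modifier, tr/// reuses the search list as the replacement, so the string is left unchanged and only the count of matched characters is returned. The helper names below are made up for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helpers; each works on its own copy of the string.
sub count_a_empty { my ($s) = @_; return ( $s =~ tr/A// ); }
sub count_a_same  { my ($s) = @_; return ( $s =~ tr/A/A/ ); }

my $line = "---ADGRCMNAAAPPRS";
printf "tr/A//  : %d\n", count_a_empty($line);
printf "tr/A/A/ : %d\n", count_a_same($line);
```

Both calls report the same number, so the choice between the two spellings is a matter of taste.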
Re: Counting using transliteration
by Dog and Pony (Priest) on Feb 25, 2002 at 12:17 UTC
    First off, are you sure that the line has any A's in it? Because you have an error in the way you get your lines. Try adding a print "$line\n"; statement inside your loop, and try to open a file with several lines. It will only print every second line, namely lines 2, 4, 6, etc. This is because when you do while(<MYFILE>) it assigns a line to $_, and then you assign the next line to $line. Try adding the line print "$_\n"; before the other print statement, and try again: now it will print all the lines.

    The solution is of course to either write the loop like this:

        while( my $line = <MYFILE> ) { ... }

    or to use the special var $_ instead, and remove the assignment to $line. Like so:

        $Acount = tr/A//;

    Hope this will help.
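For illustration, a self-contained sketch of the first form of the loop. An in-memory filehandle and made-up sample data stand in for MYFILE, so nothing here reflects the original poster's actual file.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of MYFILE.
my $data = "AAB\nCCC\nABA\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @a_counts;
while ( my $line = <$fh> ) {    # exactly one read per iteration
    my $Acount = ( $line =~ tr/A// );
    push @a_counts, $Acount;
}
close $fh;

print "@a_counts\n";
```

Every line is visited once, so the counts come out as one entry per line rather than one per pair of lines.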


    You have moved into a dark place.
    It is pitch black. You are likely to be eaten by a grue.
      Thanks for the help guys. I don't think I've explained what I'm wanting to do properly. I am wanting to count the number of each type of character in a line, and then (eventually) find a way to print what the most common character was. This should generate a consensus sequence for my multiple alignments. (I'm a bioinformatician)

      So for example, say the line is ---ADGRCMNAAAPPRS. I want the program to count the number of the different characters: $Acount, $Dcount, $gapcount (which is -) etc. and store the most common letter in my consensus sequence array. Think this is possible?! Then the program should move on to the next line of the file and repeat the process... Thanks for pointing out I don't need to put $line, I realise now that the line information is stored in $_

      basm100

        while ($l = <DATA>) {
            %count = ();
            $count{$1}++ while ($l =~ /(.)/g);
            $max = 0;
            $max_key = undef;
            foreach (keys %count) {
                $max = $count{$_}, $max_key = $_ if ($count{$_} > $max);
            }
            push @consensus, $max_key;
        }
        print "@consensus\n";
        __END__
        ---ADGRCMNAAAPPRS
        ASAMNP-PRCM--MWQZ
        TYYCYLCKLKDUEAOLE

        Running it:

        $ perl cons.pl
        A - Y
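Note that the second and third sample lines each contain a tie for the most common character (- and M, then Y and L), and Perl does not guarantee hash key order, so which one wins in the loop above can vary. A possible variant that breaks ties by plain string order is sketched below; consensus_char is a hypothetical helper name, and the tie-break rule is only one arbitrary choice.

```perl
use strict;
use warnings;

# Hypothetical helper: most common character in a line, with ties
# broken by string order so the result does not depend on hash order.
sub consensus_char {
    my ($line) = @_;
    chomp $line;
    my %count;
    $count{$_}++ for split //, $line;
    my ($best) =
      sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;
    return $best;
}

my @consensus = map { consensus_char($_) }
  "---ADGRCMNAAAPPRS", "ASAMNP-PRCM--MWQZ", "TYYCYLCKLKDUEAOLE";
print "@consensus\n";
```

With this rule the third line yields L rather than Y, because L sorts first among the tied characters. For a real consensus sequence a different rule (for example, never preferring the gap character) may be more appropriate.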
Re: Counting using transliteration
by trs80 (Priest) on Feb 25, 2002 at 18:12 UTC
    This will give you all the characters including newline, spaces, etc.
        use strict;
        use Data::Dumper;

        my %character_hash = ();
        while (<DATA>) {
            map { $character_hash{$_}++ } split//,$_;
        }
        print Dumper(\%character_hash);

        __DATA__
        AFJIDIKSOIJFKDFS
        AKDFJIJDFJSF
        QUEWITYYUERYUIYE
        ERUOIERTOUTUT