in reply to Re: counting occurances
in thread counting occurances

Would that give me the occurrence of them individually? Or when they are together? I also have to remove the first line in the file because it's header information and I don't need it. I have this so far but it doesn't seem to work.

while(<FILE>){
    (s!>.*(\n)!!);
    my @words = (A, C, G, T); #letters to search
    my @file = split (/\w/g, <FILE>); #split each letter into an array
    foreach $file (@file){
        foreach $words (@words){
            if ($file eq $words){
                $word_list{words}++;
            }
        }
    }
    @pairs = sort {$a->[1] <=> $b->[1]} map {[ $_ => $word_list{$_}]} keys %word_list;
    print "word $_ ->[0] = $_->[1]\n" for @pairs;

It's been a while since I've used Perl, so I'm very, very rusty. Thanks

Re: Re: Re: counting occurances
by davido (Cardinal) on Sep 19, 2003 at 20:03 UTC
    If you want to count the occurrences of each item independently, strip the first line off as a header line, and keep the list of items you're counting easily adjustable, this will do the trick.

    my $header_line = <DATA>;
    my %count;
    my @chars = ( qw/F G S/ );
    while (my $line = <DATA> ) {
        eval "\$count{$_} += \$line =~ tr/$_/$_/,1" or die $@ foreach @chars;
    }
    print "There are $count{$_} occurrences of $_\n" foreach sort keys %count;
    __DATA__
    Sample header hine
    FDIELSIGCOXLSAGICK\n
    FDIELSIGCOXLSAGICK\n

    The reason that the tr/// must appear inside of a string eval is that variables are not interpolated in tr/// (the transliteration table is built at compile time, not run time). Eval forces a fresh compilation of tr/// each time through the loop.
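    To make the point about interpolation concrete, here is a minimal sketch (not part of the original reply). The variable names $char, $literal and $via_eval are made up for the demonstration.

```perl
use strict;
use warnings;

my $line = 'FDIELSIGCOXLSAGICK';
my $char = 'S';    # hypothetical variable, for illustration only

# tr/// takes its search list literally: this counts the characters
# '$', 'c', 'h', 'a' and 'r', not the contents of $char.
my $literal = ( $line =~ tr/$char// );

# A string eval compiles a fresh tr/S// at run time, so the value of
# $char is baked into the operator before it is compiled.
my $via_eval = eval "\$line =~ tr/$char//";
die $@ if $@;

print "literal tr: $literal, tr via eval: $via_eval\n";
```

    The first count comes out as zero because none of those five literal characters occur in the sample line; the second counts the letter S as intended.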

    The reason that I escape the sigils (\$count and \$line) is that I want those variables to exist as variables inside the eval, not as values (except in the case of what's inside the tr/// itself, where $_ is interpolated on purpose).

    And the '1' appears at the end of the eval expression so that eval returns a true value, and the "or die" is not triggered, even if no matches are found and the count is zero.
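    A small sketch of why the trailing 1 matters, assuming a letter ('Z') that never occurs in the sample line. The names $without and $with are hypothetical.

```perl
use strict;
use warnings;

my $line  = 'FDIELSIGCOXLSAGICK';
my $count = 0;

# Without the trailing ",1", eval returns the value of the += expression.
# 'Z' never occurs, so that value is 0, which is false, and an "or die"
# would fire even though nothing actually went wrong.
my $without = eval "\$count += \$line =~ tr/Z//";

# With ",1" the comma operator makes 1 the last value evaluated, so eval
# returns true whenever the code compiled and ran.
my $with = eval "\$count += \$line =~ tr/Z//, 1";

print "without: $without, with: $with\n";
```

    An alternative that avoids the trick is to test $@ after the eval instead of relying on its return value.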

    I think this is an elegant solution, and saves a lot of intricate fiddling.

    If you want to see a solution that uses index instead of tr///, you may...

    Dave

    "If I had my life to do over again, I'd be a plumber." -- Albert Einstein

Re: Re: Re: counting occurances
by shenme (Priest) on Sep 19, 2003 at 20:16 UTC
    Yet another version, trying for speed like davido's tr approach, but without losing speed to the evals needed just to force tr to accept the variables. If this still doesn't go fast enough you'll have to unroll into 4 separate m// statements.

    my $temp = <DATA>;
    print "Discarding the file header line.\n";
    print " ", $temp;
    my @chars = qw( A C G T );
    my %cnts;
    while( <DATA> ) {
        foreach my $char (@chars) {
            # my $cnt = () = m/$char/g;
            # $cnts{$char} += $cnt;
            # printf " For char '%s' I found %d\n", $char, $cnt;
            $cnts{$char} += () = m/$char/g;
        }
    }
    foreach my $char (@chars) {
        printf " Char '%s': %6d\n", $char, $cnts{$char};
    }
    __DATA__
    Generated by a completely confused program yesterday
    ACGTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC
    ACCTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC
    ACGTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC
    AGGTGAGTAGAGGGGGGGGGAAAAAAAAAAGGGGGGG
    ACGTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC

    Outputs
    Discarding the file header line.
        Generated by a completely confused program yesterday
      Char 'A':      70
      Char 'C':      49
      Char 'G':      56
      Char 'T':      10
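    The unrolled variant mentioned above might look roughly like the following. This is only a sketch: the @lines array is a made-up stand-in for the file body (header already discarded), and no timing claims are made for it.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the data after the header.
my @lines = (
    "ACGTGACTAGAGGCCCGGGGAAAAAAAAAACCCCCCC\n",
    "AGGTGAGTAGAGGGGGGGGGAAAAAAAAAAGGGGGGG\n",
);

my %cnts = ( A => 0, C => 0, G => 0, T => 0 );

for (@lines) {
    # One literal pattern per letter: nothing is interpolated, so each
    # regex is compiled once when the script is compiled.
    $cnts{A} += () = m/A/g;
    $cnts{C} += () = m/C/g;
    $cnts{G} += () = m/G/g;
    $cnts{T} += () = m/T/g;
}

printf "Char '%s': %6d\n", $_, $cnts{$_} for sort keys %cnts;
```

    The cost is that the letter list is now hard-coded in four places instead of living in @chars.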
    
      Since we're spewing bilge today, thought I'd post what I'd been thinking to do to avoid the RE recompiles. I know others have done it (many times) before, but ...

      my $magic = 'sub { ';
      foreach my $char (@chars) {
          $magic .= "\$cnts{$char} += ()= \$_[0] =~ m/$char/g;";
      }
      $magic .= '}';
      my $magicsub = eval "$magic";
      die "eval to create anon sub failed '$@'" unless defined $magicsub;
      while( <DATA> ) {
          $magicsub->($_);
      }

      And in honor of the day:
      @pieces=split //, 'eight';
      print "@pieces squawk!\n" for(1..2);
Re: Re: Re: counting occurances
by qmole (Beadle) on Sep 19, 2003 at 20:03 UTC
    Hi there!

    Just a few mistakes:

    <FILE>; # ignore first line
    my @words = (A, C, G, T); #letters to search
    my @file = do {local $/; split (//, <FILE>); }; #split each letter into an array
    foreach $file (@file){
        foreach $words (@words){
            if ($file eq $words){
                $word_list{$file}++;
            }
        }
    }
    @pairs = sort {$a->[1] <=> $b->[1]} map {[ $_ => $word_list{$_}]} keys %word_list;
    print "word $_->[0] = $_->[1]\n" for @pairs;


    You were only splitting on the first line of the file, and there was a typo on the last line. The code above will hopefully work as you intended.
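
    For comparison, here is a sketch of the same approach written to run under strict and warnings. It is not the code from this reply: the input is a made-up string read through an in-memory filehandle so the example is self-contained, and %wanted replaces the inner loop over @words with a hash lookup.

```perl
use strict;
use warnings;

# Hypothetical input; a real script would open the data file instead.
my $data = ">header line\nACGT\nAAGT\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $header = <$fh>;                          # read and ignore the first line
my %wanted = map { $_ => 1 } qw(A C G T);    # letters to count
my %word_list;

# Slurp everything after the header, then look at one character at a time.
my $rest = do { local $/; <$fh> };
for my $letter ( split //, $rest ) {
    $word_list{$letter}++ if $wanted{$letter};
}

# Report in ascending order of count, ties broken alphabetically.
print "word $_ = $word_list{$_}\n"
    for sort { $word_list{$a} <=> $word_list{$b} || $a cmp $b } keys %word_list;
```

    Newlines and any other characters simply fail the %wanted lookup, so they need no special handling.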