in reply to Re^2: Problem computing GC content
in thread Problem computing GC content
Certainly. Something like the following:
#!/usr/local/bin/perl use feature qw/say/; use strict; use warnings; open my $FASTAFILE, "<", 'human_fold_25.txt' or die "Could not open file: $!"; my $sequence = ' '; my $Ccount = 0; my $Gcount = 0; my $identifier = ' '; while(my $line = <$FASTAFILE>) { chomp $line; next if $line =~ m/^\./; if ($line =~ /^>/) { $identifier = $line; } elsif ($line !~ /^\(/) { $sequence = $line; $Gcount = $Ccount = 0; my $sequencelength = length $sequence; my @nucleotides = split('', $sequence); foreach my $nuc (@nucleotides) { if ($nuc eq 'G') { $Gcount = $Gcount + 1; } elsif ($nuc eq 'C') { $Ccount = $Ccount + 1; } } my $GCcontent = ((($Gcount + $Ccount) / $sequencelength) * 100 +); say "$identifier, $GCcontent"; } } close $FASTAFILE or die "Could not close file: $!";
Note I've made a few other changes to your code, namely:
Your code could still be tightened up further, of course. For instance, the entire block following elsif ($line !~ /^\(/) could be reduced to the following:
... } elsif ($line !~ /^\(/) { my $GCcount =()= ($line =~ m/(G|C)/g); say "$identifier, " , ($GCcount / length($line) * 100); } ...
but since you're still new to Perl, I imagine that wouldn't make as much sense. :)
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^4: Problem computing GC content
by zuepinhel (Initiate) on Jul 13, 2014 at 18:13 UTC | |
by AppleFritter (Vicar) on Jul 13, 2014 at 18:16 UTC | |
by zuepinhel (Initiate) on Jul 13, 2014 at 21:20 UTC | |
by AppleFritter (Vicar) on Jul 13, 2014 at 21:56 UTC |