in reply to Problem computing GC content

Howdy zuepinhel, welcome to the Monastery!

Take a look at your initial while loop. You're assigning to $sequence (and $identifier) in each iteration, so when the loop finishes, $sequence will only contain the last line read, and all the rest of your script will only operate on that.

Try pulling the code following the loop into the loop -- except for closing the file, obviously.

Replies are listed 'Best First'.
Re^2: Problem computing GC content
by zuepinhel (Initiate) on Jul 13, 2014 at 17:46 UTC

    Can you be more precise ? I'm new to this. What do you propose i do ?

      Certainly. Something like the following:

      #!/usr/local/bin/perl use feature qw/say/; use strict; use warnings; open my $FASTAFILE, "<", 'human_fold_25.txt' or die "Could not open file: $!"; my $sequence = ' '; my $Ccount = 0; my $Gcount = 0; my $identifier = ' '; while(my $line = <$FASTAFILE>) { chomp $line; next if $line =~ m/^\./; if ($line =~ /^>/) { $identifier = $line; } elsif ($line !~ /^\(/) { $sequence = $line; $Gcount = $Ccount = 0; my $sequencelength = length $sequence; my @nucleotides = split('', $sequence); foreach my $nuc (@nucleotides) { if ($nuc eq 'G') { $Gcount = $Gcount + 1; } elsif ($nuc eq 'C') { $Ccount = $Ccount + 1; } } my $GCcontent = ((($Gcount + $Ccount) / $sequencelength) * 100 +); say "$identifier, $GCcontent"; } } close $FASTAFILE or die "Could not close file: $!";

      Note I've made a few other changes to your code, namely:

      • use strict;, and use warnings;. These are your friends; they'll catch many errors for you, and you should always use them. As a side effect, I've also declared variables with my where necessary.
      • use feature qw/say/; it's nicer than print if you want to add a newline anyway, IMO.
      • Lexical filehandles ($FASTAFILE vs. FASTAFILE).
      • Three-argument form of open.
      • Error checking for open and close.
      • Slightly more idiomatic handling of lines starting with periods.
      • Whitespace changes, to improve readability.

      Your code could still be tightened up further, of course. For instance, the entire block following elsif ($line !~ /^\(/) could be reduced to the following:

      ... } elsif ($line !~ /^\(/) { my $GCcount =()= ($line =~ m/(G|C)/g); say "$identifier, " , ($GCcount / length($line) * 100); } ...

      but since you're still new to Perl, I imagine that wouldn't make as much sense. :)

        Thank you for you reply but i can't run your code.

      Look through the code you posted and identify where the while loop ends. Only code inside the loop gets executed repeatedly, so move that loop-ending point after all the lines of code that must run once per loop.

      Here are two good exercises for getting started with any programming languages. First, before you even try running a script, make your eyes run it. Read through the lines. See where decisions are, and imagine you are the computer. Choose the if or the else, and run it in your mind. When you see a while or a foreach loop, read through the lines, then back up to the top, and back around a few times. Then imagine you've reached the end of the loop and go on to the rest of the code. This helps give you a feel for what the machine does when it reads your code.

      Second, experiment with very simple throw-away scripts. Try different kinds of loops, put a few print statements in different places so you can see when, and how often, they print out when you run the script. If you type something that's totally non-working, erase it and do something else. No harm done (assuming you don't tell Perl to delete all your files or something like that).

      When you get an error, use Perl Monks' Super Search or Google to see what others have done. 95% of the time you'll find somebody has had your problem before and found a solution.

      (edit: I'm too slow, you've got a detailed reply. I still recommend trying these two exercises often. They'll serve you well.)

        Thank you for your help, but it still doesn't work. I will try to figure it out with the code that you sent me. :)