in reply to Re: Problem computing GC content
in thread Problem computing GC content

Can you be more precise ? I'm new to this. What do you propose i do ?

Replies are listed 'Best First'.
Re^3: Problem computing GC content
by AppleFritter (Vicar) on Jul 13, 2014 at 18:02 UTC

    Certainly. Something like the following:

    #!/usr/local/bin/perl use feature qw/say/; use strict; use warnings; open my $FASTAFILE, "<", 'human_fold_25.txt' or die "Could not open file: $!"; my $sequence = ' '; my $Ccount = 0; my $Gcount = 0; my $identifier = ' '; while(my $line = <$FASTAFILE>) { chomp $line; next if $line =~ m/^\./; if ($line =~ /^>/) { $identifier = $line; } elsif ($line !~ /^\(/) { $sequence = $line; $Gcount = $Ccount = 0; my $sequencelength = length $sequence; my @nucleotides = split('', $sequence); foreach my $nuc (@nucleotides) { if ($nuc eq 'G') { $Gcount = $Gcount + 1; } elsif ($nuc eq 'C') { $Ccount = $Ccount + 1; } } my $GCcontent = ((($Gcount + $Ccount) / $sequencelength) * 100 +); say "$identifier, $GCcontent"; } } close $FASTAFILE or die "Could not close file: $!";

    Note I've made a few other changes to your code, namely:

    • use strict;, and use warnings;. These are your friends; they'll catch many errors for you, and you should always use them. As a side effect, I've also declared variables with my where necessary.
    • use feature qw/say/; it's nicer than print if you want to add a newline anyway, IMO.
    • Lexical filehandles ($FASTAFILE vs. FASTAFILE).
    • Three-argument form of open.
    • Error checking for open and close.
    • Slightly more idiomatic handling of lines starting with periods.
    • Whitespace changes, to improve readability.

    Your code could still be tightened up further, of course. For instance, the entire block following elsif ($line !~ /^\(/) could be reduced to the following:

    ... } elsif ($line !~ /^\(/) { my $GCcount =()= ($line =~ m/(G|C)/g); say "$identifier, " , ($GCcount / length($line) * 100); } ...

    but since you're still new to Perl, I imagine that wouldn't make as much sense. :)

      Thank you for you reply but i can't run your code.

        What does that mean? Are you getting an error message? If so, which?
Re^3: Problem computing GC content
by Bethany (Scribe) on Jul 13, 2014 at 18:10 UTC

    Look through the code you posted and identify where the while loop ends. Only code inside the loop gets executed repeatedly, so move that loop-ending point after all the lines of code that must run once per loop.

    Here are two good exercises for getting started with any programming languages. First, before you even try running a script, make your eyes run it. Read through the lines. See where decisions are, and imagine you are the computer. Choose the if or the else, and run it in your mind. When you see a while or a foreach loop, read through the lines, then back up to the top, and back around a few times. Then imagine you've reached the end of the loop and go on to the rest of the code. This helps give you a feel for what the machine does when it reads your code.

    Second, experiment with very simple throw-away scripts. Try different kinds of loops, put a few print statements in different places so you can see when, and how often, they print out when you run the script. If you type something that's totally non-working, erase it and do something else. No harm done (assuming you don't tell Perl to delete all your files or something like that).

    When you get an error, use Perl Monks' Super Search or Google to see what others have done. 95% of the time you'll find somebody has had your problem before and found a solution.

    (edit: I'm too slow, you've got a detailed reply. I still recommend trying these two exercises often. They'll serve you well.)

      Thank you for your help, but it still doesn't work. I will try to figure it out with the code that you sent me. :)

        ... it still doesn't work.

        And "it doesn't work" is still not a sufficient description of the problem. Again, how does it fail; what specific error messages do you see? (Also: Please let us know the version of Perl you are using.) (Of course, you may simply not have the time or inclination right now to pursue this matter. For some strange reason, you may want to do something with your day off other than spend it in the lab pounding on Gs and Cs. Well, there's no accounting for taste...)

        Okay, good luck! Keep with it, things get easier once you've got past that first learning curve.

        By the way, I'm not the monk who posted the code for you to try, that was AppleFritter. I'm new here myself.