in reply to in need of wisdom: handling DNA strings
The key to this problem, as far as I understand it, is the fourth argument of the substr function, which supplies a replacement string for the selected portion of a scalar. This allows sections of the DNA strand to be overwritten in place as follows:
    substr( $dna, $start + 1, $finish - $start, "X" x ( $finish - $start ) );
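As a minimal sketch of what that call does, here it is applied to a short made-up strand. The sample string and the two positions are assumptions chosen purely for illustration; the offset arithmetic is copied unchanged from the line above.

```perl
use strict;
use warnings;

# Hypothetical sample strand and positions, chosen only for illustration.
my $dna    = "ACGTACGTAC";
my $start  = 2;
my $finish = 6;

# Four-argument substr: replace ($finish - $start) characters, beginning at
# zero-based offset ($start + 1), with a run of "X" of the same length.
# The return value is the text that was there before the replacement.
my $removed = substr( $dna, $start + 1, $finish - $start, "X" x ( $finish - $start ) );

print "$dna\n";        # prints ACGXXXXTAC
print "$removed\n";    # prints TACG
```

Because the replacement has the same length as the text it overwrites, the strand keeps its original length. Whether $start + 1 is the right offset depends on how the positions are numbered in your input, so that part is worth checking against real data.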
The major problem with the existing code appears to be that $finish is never defined at any point. Note that @finish and $finish are different variables, and that scoping issues also come into play in the code provided. It would be worth your while to read Coping with Scoping by Dominus.
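The distinction can be seen in a small sketch. The variable contents here are made up for illustration and are not taken from your code:

```perl
use strict;
use warnings;

# Hypothetical data: an array of finish positions.
my @finish = ( 6, 12 );

# $finish[0] is an element of the array @finish ...
my $first_end = $finish[0];

# ... whereas $finish is a completely separate scalar, undefined until assigned.
my $finish;

{
    # A 'my' variable declared inside a block exists only within that block,
    # so this assignment does not touch the outer $finish.
    my $finish = 99;
}

my $state = defined $finish ? "defined" : "undefined";
printf "outer scalar is %s; first array element is %d\n", $state, $first_end;
# prints: outer scalar is undefined; first array element is 6
```

Running your own code under use strict and use warnings should point out this kind of mix-up.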
The following is how I would rewrite the code snippet provided. Note that it is untested and is provided for illustrative purposes only:
    my $dna;

    # Read the strand line by line, joining the lines into one string.
    open DNA, "<dna.file" or die $!;
    foreach (<DNA>) {
        chomp;
        $dna .= $_;
    }
    close DNA;

    # Each line of the positions file holds a start and a finish position.
    open POS, "<positions.input" or die $!;
    foreach (<POS>) {
        chomp;
        my ($start, $finish) = split /\s+/;
        substr( $dna, $start + 1, $finish - $start, "X" x ( $finish - $start ) );
        print $dna, "\n";
    }
    close POS;
This snippet differs from the code you provided in a few ways. Firstly, the $dna string is populated differently: the example reads each line of the supplied file, chomps the new-line character and concatenates the line onto $dna, rather than slurping the file into an array first. The code then steps through the positions input file and masks the DNA scalar accordingly, replacing introns/exons with 'X'-strings of identical length using the substr function. In this fashion, the start and finish bases do not need to be stored beyond their use in a single iteration of the loop.
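The same loop can be sketched without any files, which makes it easier to try out. The helper name mask_ranges, the sample strand and the sample ranges are all hypothetical and are not part of the original code; the bounds checks are an added assumption about how out-of-range positions should be treated.

```perl
use strict;
use warnings;

# mask_ranges is a hypothetical helper written for illustration only.
# It takes a strand and a list of [start, finish] pairs and returns a masked
# copy, using the same offset arithmetic as the snippet above.
sub mask_ranges {
    my ( $dna, @ranges ) = @_;
    for my $range (@ranges) {
        my ( $start, $finish ) = @$range;
        my $length = $finish - $start;

        # Assumption: silently skip empty, reversed or out-of-range pairs
        # rather than letting substr lengthen the string or die.
        next if $start < 0 or $length <= 0;
        next if $start + 1 + $length > length($dna);

        substr( $dna, $start + 1, $length, "X" x $length );
    }
    return $dna;
}

print mask_ranges( "ACGTACGTACGT", [ 1, 3 ], [ 7, 10 ] ), "\n";
# prints ACXXACGTXXXT
```

Since every replacement is exactly as long as the text it overwrites, the positions of later ranges remain valid after earlier ones have been masked, so the pairs can be applied in any order.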
The snippet you provided also raises a few other questions about the definition and use of other variables, including $count and $base. Are these variables used elsewhere in your code, beyond the snippet shown? If not, they appear to be entirely unnecessary.
print $dna each time?
by RMGir (Prior) on Apr 23, 2002 at 13:41 UTC
by rob_au (Abbot) on Apr 23, 2002 at 13:52 UTC