Ok, let me start off by apologizing for my other thread, which I posted as an anonymous user. For some reason I thought I was signed in; perhaps a moderator can delete it (or just let it get buried behind the other posts).
Regarding my problem, I have two files: one with base positions and one with regions. My objective is to match every position to the region that contains it on the same chromosome.
Here is one input file, testReg.txt:
chr1 100 159 0
chr1 200 260 0
chr1 500 750 0
chr3 450 700 0
chr4 100 300 0
chr7 350 600 0
chr9 100 125 0
chr11 679 687 0
chr24 100 200 0
chr24 300 400 0
where the 1st column is the chromosome number, the 2nd column is the start of the region, and the 3rd column is the end of the region; all columns are separated by tabs.
Here is the other input file, testPos.txt:
chr1 104 104 0 0 +
chr1 145 145 0 0 +
chr1 205 205 0 0 +
chr1 600 600 0 0 +
chr3 500 500 0 0 +
chr4 150 150 0 0 +
chr4 175 175 0 0 +
chr7 400 400 0 0 +
chr7 550 550 0 0 +
chr9 100 100 0 0 +
chr11 680 680 0 0 +
chr11 681 681 0 0 +
chr24 105 105 0 0 +
chr24 110 110 0 0 +
chr24 350 350 0 0 +
where the 1st column is the chromosome number and the 2nd column is the base position; all columns are separated by tabs as well.
Here is my code that I've completed so far:
#!/usr/bin/perl
use warnings;
use strict;

my $region = 'testReg.txt';
my $position = 'testPos.txt';
my $writeOut = '>>testOut.txt';

open(R,$region) or die "error reading file";
open(OUT,$writeOut) or die "error writing to the file ";
open(P, $position) or die "error reading file ";

my $rline;
my $pline;

while ($rline=<R>)
{
    chomp($rline);
    my @r_arr=split("\t",$rline);
    chomp($r_arr[0]);
    my @rID = split("r",$r_arr[0]);
    $r_arr[0] = $rID[1];
    #this removes the "chr" portion of the first element and leaves number
    #i.e. instead of [0] -> "chr24"; [0] -> "24"

    while($pline=<P>)
    {
        if(!$rline)
        {
            last;
        } #end if

        chomp($pline);
        my @p_arr=split("\t",$pline);
        chomp($p_arr[0]);
        my @pID = split("r",$p_arr[0]);
        $p_arr[0] = $pID[1];

        if($p_arr[1]>$r_arr[2])
        {
            $rline=<R>;
            redo;
        } #end if
        else
        {
            if($p_arr[0] == $r_arr[0] && $p_arr[1] >= $r_arr[1] && $p_arr[1] <= $r_arr[2])
            {
                #NOTE: [0] element in each array now corresponds to chr number
                # r[1] is start of region and r[2] is end of region
                # p[1] is the position of the base pair
                shift(@p_arr);
                print (OUT "chr$r_arr[0]\t$r_arr[1]\t$r_arr[2]\t$r_arr[3]\t");
                print OUT join ("\t", @p_arr), "\n";
                #essentially I'm joining the two files with matching lines
                #w/ columns separated by tab
            } #end if
        } #end else
    } # end while <P>
} #end while <R>

close R;
close P;
close OUT;
Below is the output that my code produces. As you can see, only the first 2 lines are written and then the output stops:
chr1 100 159 0 104 104 0 0 +
chr1 100 159 0 145 145 0 0 +
Below is the output that I would like my code to produce, which is basically just testReg.txt joined with testPos.txt for each match:
chr1 100 159 0 chr1 104 104 0 0 +
chr1 100 159 0 chr1 145 145 0 0 +
chr1 200 260 0 chr1 205 205 0 0 +
chr1 500 750 0 chr1 600 600 0 0 +
chr3 450 700 0 chr3 500 500 0 0 +
chr4 100 300 0 chr4 150 150 0 0 +
chr4 100 300 0 chr4 175 175 0 0 +
chr7 350 600 0 chr7 400 400 0 0 +
chr7 350 600 0 chr7 550 550 0 0 +
chr9 100 125 0 chr9 100 100 0 0 +
chr11 679 687 0 chr11 680 680 0 0 +
chr11 679 687 0 chr11 681 681 0 0 +
chr24 100 200 0 chr24 105 105 0 0 +
chr24 100 200 0 chr24 110 110 0 0 +
chr24 300 400 0 chr24 350 350 0 0 +
I've tried changing my code in a few ways and just end up with inconsistent results. With my current version, the conditional statement seems to be correct (the two lines that do get written are right), but something appears to be wrong with my while loops, because the output stops prematurely.
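To show the matching rule on its own, apart from the file-reading loops, here is a minimal sketch that works on a few in-memory rows. It assumes the regions fit in memory, which may not hold for the real files, and match_positions is a hypothetical helper name that is not part of my script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in my script above): returns one joined line
# for every position that falls inside a region on the same chromosome.
sub match_positions {
    my ($regions, $positions) = @_;

    # Group regions by chromosome name, so "chr1" is only compared to "chr1".
    my %by_chr;
    for my $r (@$regions) {
        my @f = split /\t/, $r;
        push @{ $by_chr{ $f[0] } }, { start => $f[1], end => $f[2], line => $r };
    }

    my @out;
    for my $p (@$positions) {
        my @f = split /\t/, $p;
        for my $r (@{ $by_chr{ $f[0] } || [] }) {
            # Same test as in my script: start <= position <= end.
            push @out, "$r->{line}\t$p"
                if $f[1] >= $r->{start} && $f[1] <= $r->{end};
        }
    }
    return @out;
}

# A few rows taken from testReg.txt and testPos.txt.
my @regions   = ("chr1\t100\t159\t0", "chr1\t200\t260\t0", "chr24\t300\t400\t0");
my @positions = ("chr1\t104\t104\t0\t0\t+", "chr1\t205\t205\t0\t0\t+", "chr24\t350\t350\t0\t0\t+");

print "$_\n" for match_positions(\@regions, \@positions);
```

This is only meant to pin down what counts as a match; it sidesteps the two-filehandle loop entirely, so it does not explain why my nested while loops stop early.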
Any help would be appreciated,
a217
In reply to While loop problem with filereading by a217