I'm sorry, I do not understand this part. I'm a bit lost here:
for my $key ( keys %hash ) {
if ( $key =~ /^dataS\d\dR\d$/ ) {
print $key, @{ $hash{$key} }, "\n";
}
}
I want to be able to manipulate the columns individually (hence my attempt with arrays), so I am not sure how a hash can help in this case.
I tried running it, but something goes wrong: the script keeps running without stopping.
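As far as I can tell, the hash in that snippet is meant to be a hash of arrays, with one array per column keyed by the column header. Below is a minimal sketch of how I understand that idea. The sample rows, the tab-separated layout, and the way `%hash` is filled are my own assumptions, not something taken from my real file. I also loosened the pattern (no `^` anchor, `\d+` after `R`), since my real headers carry a prefix.

```perl
use strict;
use warnings;

# Hypothetical three-line table; the real file is assumed to be tab-separated.
my @lines = (
    "ID\tTime\t8899_Neg_Rep01_dataS01R01\t8889_Neg_Rep02_dataS01R02",
    "m1\t1.5\t100\t110",
    "m2\t2.5\t200\t220",
);

my @headers = split /\t/, shift @lines;

# Build one array per column, keyed by the column's header.
my %hash;
for my $line (@lines) {
    my @fields = split /\t/, $line;
    for my $i (0 .. $#headers) {
        push @{ $hash{ $headers[$i] } }, $fields[$i];
    }
}

# Each data column is now its own array and can be handled on its own.
for my $key (sort keys %hash) {
    if ($key =~ /_dataS\d+R\d+$/) {
        print join("\t", $key, @{ $hash{$key} }), "\n";
    }
}
```

If that reading is right, `@{ $hash{$key} }` is just the array for one column, so the columns can still be manipulated individually.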
Instead, I attempted to match the column names using a loop, but I ran into errors somewhere.
I have my own code, where I used while loops to find the matching headers:
my $first = 1;
for(my $i = 0; $i < $originalfilecount; $i++)
{
#read in the current file
open CURINFILE, "<$files[$i]" or die "Error couldn't open file $files[$i]\n";
print "$files[$i]\n";
if($first)
{
#if this is first file, find column locations
my $firstline = <CURINFILE>; #read headerline
chomp $firstline;
my @columns = split (/\t/, $firstline);
my $columncount = 0;
# print "$firstline\n"; #check if print headers correctly
####### Column Headers for ID, TIME #########
while ($columncount <= $#columns && !($columns[$columncount] =~ /ID/))
{
$columncount ++;
}
$ID = $columncount;
while ($columncount <= $#columns && !($columns[$columncount] =~ /Time/))
{
$columncount ++;
}
$masstimes = $columncount;
while ($columncount <= $#columns && !($columns[$columncount] =~ /Links/))
{
$columncount ++;
}
$Links = $columncount;
#check if column position is correct (so far it is correct)
print "ID is at column: $ID\n"; #output = 0
print "Time is at column: $masstimes\n"; #output = 1
print "Links is at column: $Links\n"; #output = 33
#DataR Columns (got ERROR here where I can't run script at all if I add this) #
while($columncount <= $#columns && !(($columns[$columncount] =~ /_data/)))
{
$columncount++;
}
$columns[$columncount] =~ /_dataS(\d+)R/;
my $currentReplicateID = $1;
my $currentReplicateCount = 1;
$ctrlStartCol = $columncount++;
while($columncount <= $#columns)
{
$columns[$columncount] =~ /_dataS(\d+)R/;
my $newReplicateID = $1;
if($newReplicateID ne $currentReplicateID)
{
push(@replicateCount, $currentReplicateCount);
$currentReplicateID = $newReplicateID;
$currentReplicateCount = 1;
}
else
{
$currentReplicateCount++;
}
$columncount++;
}
#add the last replicate in
push(@replicateCount, $currentReplicateCount);
###### End of Data Column Headers ####
####### Read remainder of the file ##############
while (<CURINFILE>)
{
#add metabolite ID, MZ, RT to an array
chomp $_;
my @templine = split (/\t/, $_);
push(@tempratio, $templine[$metabolite]);
push(@tempratio, $templine[$masstimes]);
push(@tempratio, $templine[$rt]);
#ERROR
#add intensities from the samples
my $columnIndex = $ctrlStartCol;
for(my $k = 0; $k <= $i; $k++)
{
$columnIndex += $replicateCount[$k];
}
for(my $j = 0; $j < $replicateCount[$i+1]; $j++)
{
push(@tempratio, $templine[$columnIndex+$j]);
}
}
} # end of main if loop
close CURINFILE;
} #end of main for loop
############## Start of output ##################
print "\nWriting output...";
#create a new Directory and open output files and print out those values from the hash that meet the filtering criteria
#filtering criteria: defaults set at pvalue < 0.05, 0.5 <ratio > 1.5. User specified
mkdir "$pathname" or die "Error couldn't create new Directory";
open my $OUT1, ">$pathname/Metabolite ID.txt" or die "error couldn't open output file";
open my $OUT2, ">$pathname/masstimes.txt" or die "error couldn't open output file";
open my $OUT3, ">$pathname/retentiontimes.txt" or die "error couldn't open output file";
open my $OUT4, ">$pathname/intensitydata.txt" or die "error couldn't open output file";
print $OUT1 "$tempratio[0]"; print $OUT2 "$tempratio[1]";
print $OUT3 "$tempratio[2]";
print $OUT4 "$tempratio[3]";
close $OUT1;
close $OUT2;
close $OUT3;
PS: The output only contains the first value in each of OUT1 to OUT4 instead of all the values in the whole columns. Why?
My DataR column names look something like this across the columns:
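My guess is that this happens because `$tempratio[0]` is a single array element, so each print statement writes exactly one value. Here is a small sketch of what I think I would need instead: one array per output column, printed as a whole. The array names (`@ids`, `@times`, `@intensities`) and the sample rows are made up for illustration, and I print to STDOUT rather than to my real output files.

```perl
use strict;
use warnings;

# Hypothetical rows, already split on tabs: ID, time, one intensity value.
my @rows = (
    [ "m1", 1.5, 100 ],
    [ "m2", 2.5, 200 ],
    [ "m3", 3.5, 300 ],
);

# Keep one array per output column instead of one flat array for everything.
my (@ids, @times, @intensities);
for my $row (@rows) {
    push @ids,         $row->[0];
    push @times,       $row->[1];
    push @intensities, $row->[2];
}

# Join the whole array, one value per line, rather than printing element [0].
my $id_text        = join("\n", @ids) . "\n";
my $intensity_text = join("\n", @intensities) . "\n";
print $id_text;
print $intensity_text;
```

With a flat array like my `@tempratio`, the values for one column end up interleaved with the others, which is why I think separate arrays (or a hash of arrays) would be easier to print.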
8899_Neg_Rep01_dataS01R01
8889_Neg_Rep02_dataS01R02
8889_Neg_Rep03_dataS01R03
7499_Neg_Rep01_dataS02R01
7499_Neg_Rep02_dataS02R02
7499_Neg_Rep03_dataS02R03
7709_Neg_Rep01_dataS05R01
7709_Neg_Rep02_dataS05R02
7709_Neg_Rep03_dataS05R03
(and so on...)
That's why I attempted a while loop to match them, but I think something is wrong somewhere. I'm not sure if this is the right way to do it.
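For comparison, here is a shorter sketch of the same header matching that groups the data columns by sample ID in a single pass. It uses some of the header names from the list above; the leading `ID` and `Time` columns and the names `%columns_for` and `@sample_order` are assumptions for the example. One thing I noticed while writing it: `$1` is only safe to use when the match succeeded, because after a failed match it still holds the value from the previous successful one.

```perl
use strict;
use warnings;

# Header names from the list above, with ID and Time assumed as leading columns.
my @columns = qw(
    ID Time
    8899_Neg_Rep01_dataS01R01 8889_Neg_Rep02_dataS01R02 8889_Neg_Rep03_dataS01R03
    7499_Neg_Rep01_dataS02R01 7499_Neg_Rep02_dataS02R02 7499_Neg_Rep03_dataS02R03
);

my %columns_for;   # sample ID => list of column indexes
my @sample_order;  # sample IDs in the order they first appear

for my $i (0 .. $#columns) {
    # Use the captures only inside the if, i.e. only after a successful match.
    if ($columns[$i] =~ /_dataS(\d+)R(\d+)$/) {
        my $sample = $1;
        push @sample_order, $sample unless exists $columns_for{$sample};
        push @{ $columns_for{$sample} }, $i;
    }
}

# Report how many replicate columns each sample has and where they are.
for my $sample (@sample_order) {
    my @idx = @{ $columns_for{$sample} };
    print "Sample $sample: ", scalar(@idx), " replicates at columns @idx\n";
}
```

This avoids keeping a running `$columncount` and a separate replicate counter, at the cost of assuming every data header ends in `_dataS<digits>R<digits>`.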