Sorry, I should have said that in the last nested for loop I am *for now* only printing for testing. This is where all the maths will go, all of which is written on a $list[$a][$b] basis. The rest of the code works; it's just that I now need to make it work for a combination of multiple files, rather than on pairs of files as I have done so far. That's why Dumper \@list isn't suitable.
Basically what I want to know is: is there a way to open those files with two for loops like so
for ($i; $i<=4; $i++) {
for ($j; $j<=4; $j++) {
#etc
while <lines_of_both_$i_and_$j_files> {
#etc
This was why I tried to use while <>: extensive googling told me that using @ARGV and the diamond operator was the most efficient way to open multiple files and read them line by line with while. I have a perfectly working script that does all the maths I need but unfortunately it's only doing it for while <$line>. How do I tell perl that I want this done for while <lines_of_both_$i_and_$j_files>? I guess that's what my question boils down to.
I'm sorry I'm really struggling with this, getting really stressed and frustrated that I'm constantly buggering it up and can't even explain properly. I am very grateful for all contributions here and I'm reading and studying all of them, however not understanding everything. Quite disheartening as I've been "coding" on and off for a couple years now so expected to have learnt more
#!/usr/bin/perl
use strict;
use warnings;

my $molec1 = "molec1";
my $molec2 = "molec2";
my $path   = "/store/group/comparisons";

# Build the list of input files: three clusters for molec1, two for molec2.
my @files;
for my $i (1..3){
    push @files, $path."/File-$molec1-cluster$i.out";
}
for my $j (1..2){
    push @files, $path."/File-$molec2-cluster$j.out";
}

# Read every file once; @all holds one matrix (an array of rows) per file.
my @all = ();
for my $filename (@files){
    open my $fh,'<',$filename or die "Could not open $filename: $!";
    my @matrix = ();
    while (my $line = <$fh>){
        next if ($line =~ /^#/);          # skip title/comment lines
        chomp $line;
        push @matrix,[split ' ',$line];   # ' ' also ignores leading whitespace
    }
    close $fh;
    push @all,\@matrix;
}

# Print each matrix; elements are addressed as $all[file][row][column].
my $matrices = scalar @all;
for my $i (0..$matrices-1){
    print $files[$i]."\n";
    my $rows = scalar @{$all[$i]};
    for my $r (0..$rows-1){
        my $cols = scalar @{$all[$i][$r]};
        for my $c (0..$cols-1){
            print "$all[$i][$r][$c] ";
        }
        print "\n";
    }
    print "\n";
}
poj
Basically what I want to know is: is there a way to open those files with two for loops like so
for ($i; $i<=4; $i++) {
for ($j; $j<=4; $j++) {
#etc
Yes, you can do that, but that would be very inefficient and that's most probably not what you should do, because it would mean opening the second series of files a number of times, and there is nothing in what you've described that would make this necessary.
With the code that I provided in my first post (including the small correction to the second for loop, where I had forgotten to remove the @ARGV array), you should be able to read all the files.
If, on the other hand, you want to combine in some ways files from your first set with files of your second set, then it is more complicated, but you still don't want to read the same files many times over. But the bottom line is that there is nothing in what you said so far that indicates something in this direction.
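For what it's worth, here is a minimal sketch of how such a combination could avoid re-reading files: load every matrix into memory once, then run the nested $i/$j loops over the in-memory copies. The sub names (read_matrix, rmsd, all_pairs_rmsd) are made up for illustration, and the deviation formula (square root of the mean squared element-wise difference, for two matrices of the same size) is only an assumption about what is wanted.

```perl
use strict;
use warnings;

# Read one whitespace-separated matrix file into an array of row arrays,
# skipping title lines that start with '#' and blank lines.
sub read_matrix {
    my ($filename) = @_;
    open my $fh, '<', $filename or die "Could not open $filename: $!";
    my @matrix;
    while (my $line = <$fh>) {
        next if $line =~ /^#/;
        next unless $line =~ /\S/;
        chomp $line;
        push @matrix, [ split ' ', $line ];
    }
    close $fh;
    return \@matrix;
}

# RMS deviation between two matrices of identical shape.
# ASSUMPTION: "deviation" means sqrt(mean((m1 - m2)^2)) over all elements.
sub rmsd {
    my ($m1, $m2) = @_;
    my ($sum, $n) = (0, 0);
    for my $r (0 .. $#{$m1}) {
        for my $c (0 .. $#{ $m1->[$r] }) {
            my $d = $m1->[$r][$c] - $m2->[$r][$c];
            $sum += $d * $d;
            $n++;
        }
    }
    return $n ? sqrt($sum / $n) : 0;
}

# Compare every matrix of the first set with every matrix of the second.
# Returns a list of [ index_in_set1, index_in_set2, rmsd ] triples.
sub all_pairs_rmsd {
    my ($set1, $set2) = @_;
    my @results;
    for my $i (0 .. $#{$set1}) {
        for my $j (0 .. $#{$set2}) {
            push @results, [ $i, $j, rmsd($set1->[$i], $set2->[$j]) ];
        }
    }
    return @results;
}
```

With two lists of file names, the calling code might look like `my @set1 = map { read_matrix($_) } @files1;` (and likewise for @set2), followed by `all_pairs_rmsd(\@set1, \@set2)`. Each file is then opened exactly once, however many pairs are compared.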
"Yes, you can do that, but that would be very inefficient and that's most probably not what you should do, because it would mean opening the second series of files a number of times"
Oh, would it? So the for loop would open the second series of files as many times as specified in the first series of files?
"If, on the other hand, you want to combine in some ways files from your first set with files of your second set, then it is more complicated, but you still don't want to read the same files many times over. But the bottom line is that there is nothing in what you said so far that indicates something in this direction."
Yes, I do want to combine them. As I wrote in an earlier reply, I want to calculate the averages and the deviation between all matrices from files $i and files $j. The end result would be i) another matrix but this time with the average values or ii) a single value, for the deviation. So if I have 5 $i files and 5 $j files, I want to get all their RMS deviations. Sorry, it's probably my fault that this got lost and not mentioned earlier.
I was thinking that the most efficient way to ask my question (and at this point I apologise for the inconvenience of my frantic stressed posts so far) is to post the code that I *have* been using so far which was used only on 2 files at a time. That way I can highlight my efforts so far -and their shortcomings- to deal with multiple files instead of two, and that way you can see the whole code and get an idea of the bigger picture. I was trying to minimise the code so that it would be more efficient to read but I'm not conveying the end goal and ended up confusing people. Our server is down at the minute and will be live again tomorrow so I can't retrieve my existing script right now but hopefully if I post tomorrow there will still be some interest from you very kind folk and I will still get some feedback.
Thank you for the efforts so far, I will continue trying to implement your recommendations.
In case anyone missed it because I didn't explain it clearly enough, let me explain again what this is all about.
My data files are plain text files that contain text and numbers stored in matrices. They look like this:
#title line - (skipped it with $nextUnless)
#title line - (skipped it with $nextUnless)
1 2 3
4 5 6
7 8 9
They're not necessarily 5 lines long; this is just an example. The actual matrices are much bigger, I think my biggest file is an 85x85 matrix.
What I want to do is perform some mathematical calculations on the combination of matrices $i and $j. Get the averages of those matrices, their deviation etc. I haven't included this bit of the code yet but (fingers crossed) it works.
So I want to calculate the deviation between all matrices from file $i and files $j. The end result would be i) another matrix but this time with the average values or ii) a single value, for the deviation. But like I said this bit of the code isn't shown here as I tried to keep it to a minimum - if I fail in opening and splitting them in columns then I won't be able to move on to the next bit anyway.
At this stage I was trying to print them for testing only, but this is where all the maths is in the real code, written on a $list[$a][$b] basis, and this is why I was trying to print $list[$a][$b] successfully in the first place.
In reality this:
for ($a=0; $a<=$#columns; $a++) {
    for ($b=0; $b<=$#columns; $b++) {
        print "$list[$a][$b] "; # for testing
    }
    print " \n";
}
Would be:
for ($a=0; $a<=$#columns; $a++) {
    for ($b=0; $b<=$#columns; $b++) {
        $m_avrg[$a][$b] += $list[$a][$b];
    }
}
That is the start of calculating the averages: here I would just be adding the numbers, and further on I'd divide to get the average. That's why it's important that I keep the $list[$a][$b] notation, as this is what I've based the whole code on, which took me weeks/months to write, as you can probably guess.
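As a rough sketch of that add-then-divide step, assuming every matrix has already been read into an array of rows and that all matrices have the same dimensions: the sub name average_matrices is made up for illustration, and the loop indices are called $r and $c here only because $a and $b are Perl's special sort variables. The [row][column] addressing is otherwise the same.

```perl
use strict;
use warnings;

# Element-wise average of several same-sized matrices.
# Each argument is a reference to an array of rows.
sub average_matrices {
    my @matrices = @_;
    return [] unless @matrices;

    # First pass: accumulate the sum of every cell.
    my @m_avrg;
    for my $list (@matrices) {
        for my $r (0 .. $#{$list}) {
            for my $c (0 .. $#{ $list->[$r] }) {
                $m_avrg[$r][$c] += $list->[$r][$c];
            }
        }
    }

    # Second pass: divide each sum by the number of matrices.
    my $count = @matrices;
    for my $row (@m_avrg) {
        $_ /= $count for @$row;
    }
    return \@m_avrg;
}

# Small made-up example with two 2x2 matrices.
my $avg = average_matrices([[1, 2], [3, 4]], [[3, 4], [5, 6]]);
print join(' ', @$_), "\n" for @$avg;
# prints:
# 2 3
# 4 5
```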