paola82 has asked for the wisdom of the Perl Monks concerning the following question:
Hi all, I want to create a generic script for comparing two files from different data sets that have the same number of columns and rows. I wrote a script that works fine for my data sets because I knew the number of columns a priori: I pushed every row into an array, parsed it, and assigned each field to its own variable. I loop over the first array with foreach and, inside that, loop over the second. Now I want to assign the values automatically, one for every column of each row, without hard-coding the number of columns.
example data set 1
ligs 1j1b 1j1c 1o9u 1pyx 1q3d 1q3w 1q4l 1q41 1q5k 1r0e 1uv5 2o5k 2ow3 3du8 3f7z 3f88 3gb2 3i4b 3l1s 2jld 1gng 1h8f 1i09
AMP_PNP 2.81 3.28 2.15 2.50 2.62 4.56 3.10 3.04 4.21 3.24 3.59 2.40 3.64 4.64 3.43 3.86 2.06 2.13 2.64 2.88 4.13 3.26 2.74
ADP 1.21 1.61 2.83 0.87 1.65 3.80 2.91 1.93 3.70 2.81 2.15 2.32 1.49 1.44 1.31 2.71 1.62 1.23 2.06 2.09 3.71 2.85 2.04
ADZ 3.56 3.59 3.52 0.89 3.40 1.82 4.08 3.42 0.88 3.93 4.15 3.88 4.90 1.93 4.05 1.08 3.33 1.00 1.02 3.87 3.55 3.42 1.80
example data set 2
ligs 1j1b 1j1c 1o9u 1pyx 1q3d 1q3w 1q4l 1q41 1q5k 1r0e 1uv5 2o5k 2ow3 3du8 3f7z 3f88 3gb2 3i4b 3l1s 2jld 1gng 1h8f 1i09
AMP_PNP 3.739 2.796 3.314 1.962 2.024 4.109 3.251 2.917 2.738 2.561 4.219 2.313 3.377 3.352 2.996 4.885 1.825 2.282 4.033 3.288 4.488 2.426 2.409
ADP 2.317 3.333 3.125 2.514 2.847 3.94 3.542 1.908 3.977 2.062 2.297 2.599 2.013 2.759 3.282 3.181 1.59 2.21 3.14 2.239 3.945 3.29 1.861
ADZ 3.733 3.848 4.219 0.858 5.187 4.344 4.586 3.391 0.919 4.354 4.247 4.192 5.043 1.843 1.971 1.081 3.386 1.003 0.98 4.307 3.536 4.049 2.404
my code
#!/usr/bin/perl
use warnings;
use strict;

my ($lig, $a, $b, $c, $d, $e, $f, $g, $h, $i, $l, $m, $n, $o, $p, $q, $r, $s, $t, $u, $v, $w, $x, $y);
my ($lig2, $a2, $b2, $c2, $d2, $e2, $f2, $g2, $h2, $i2, $l2, $m2, $n2, $o2, $p2, $q2, $r2, $s2, $t2, $u2, $v2, $w2, $x2, $y2);
my ($a3, $b3, $c3, $d3, $e3, $f3, $g3, $h3, $i3, $l3, $m3, $n3, $o3, $p3, $q3, $r3, $s3, $t3, $u3, $v3, $w3, $x3, $y3);
my @array1; my @array2; my $reader = ""; my $reader2 = "";

my $file1 = $ARGV[0] || die "type file1 and file2 and output";
my $file2 = $ARGV[1];
my $file3 = $ARGV[2];

open my $out, ">$file3";
open my $in, "<$file1";
while ($reader = <$in>) { push @array1, $reader; }
close $in;
open my $in2, "<$file2";
while ($reader2 = <$in2>) { push @array2, $reader2; }
close $in2;

print $out $array1[0];
foreach my $element (@array1) {
    if ($element ne $array1[0]) {
        if ($element =~ /^(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\n/) {
            $lig=$1; $a=$2; $b=$3; $c=$4; $d=$5; $e=$6; $f=$7; $g=$8; $h=$9; $i=$10; $l=$11; $m=$12;
            $n=$13; $o=$14; $p=$15; $q=$16; $r=$17; $s=$18; $t=$19; $u=$20; $v=$21; $w=$22; $x=$23; $y=$24;
            foreach my $element2 (@array2) {
                if ($element2 ne $array2[0]) {
                    if ($element2 =~ /^(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\t(.*?)\n/) {
                        $lig2=$1; $a2=$2; $b2=$3; $c2=$4; $d2=$5; $e2=$6; $f2=$7; $g2=$8; $h2=$9; $i2=$10; $l2=$11; $m2=$12;
                        $n2=$13; $o2=$14; $p2=$15; $q2=$16; $r2=$17; $s2=$18; $t2=$19; $u2=$20; $v2=$21; $w2=$22; $x2=$23; $y2=$24;
                        if ($lig eq $lig2) {
                            $a3=$a-$a2; $b3=$b-$b2; $c3=$c-$c2; $d3=$d-$d2; $e3=$e-$e2; $f3=$f-$f2; $g3=$g-$g2; $h3=$h-$h2;
                            $i3=$i-$i2; $l3=$l-$l2; $m3=$m-$m2; $n3=$n-$n2; $o3=$o-$o2; $p3=$p-$p2; $q3=$q-$q2; $r3=$r-$r2;
                            $s3=$s-$s2; $t3=$t-$t2; $u3=$u-$u2; $v3=$v-$v2; $w3=$w-$w2; $x3=$x-$x2; $y3=$y-$y2;
                            print $out "$lig\t$a3 $b3 $c3 $d3 $e3 $f3 $g3 $h3 $i3 $l3 $m3 $n3 $o3 $p3 $q3 $r3 $s3 $t3 $u3 $v3 $w3 $x3 $y3\n";
                        }
                    }
                }
            }
        }
    }
}
close $out;
It works, but I want to make it generic. Could anyone help me? Cheers, Paola
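One possible way to make the script above generic, as a minimal sketch rather than a definitive implementation: split each row on tabs into an array instead of naming one variable per column, index the rows of the second file by ligand name in a hash, and subtract element by element. The subroutine names parse_matrix, diff_matrices and read_lines are invented for this illustration. The sketch assumes tab-separated input, a header on the first line, unique ligand names, and rows of equal length in both files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read all lines of a file, dying with a reason on failure.
sub read_lines {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my @lines = <$in>;
    close $in;
    return @lines;
}

# Hypothetical helper: turn tab-separated lines into
# (header line, row labels in file order, hash of label => array ref of values).
sub parse_matrix {
    my @lines = @_;
    chomp @lines;
    my $header = shift @lines;
    my (@order, %rows);
    for my $line (@lines) {
        next unless length $line;                  # skip empty lines
        my ($label, @values) = split /\t/, $line;  # works for any column count
        push @order, $label;
        $rows{$label} = \@values;
    }
    return ($header, \@order, \%rows);
}

# Subtract data set 2 from data set 1 for every label found in both.
# Returns the output lines (header first), without trailing newlines.
sub diff_matrices {
    my ($lines1, $lines2) = @_;
    my ($header, $order, $m1) = parse_matrix(@$lines1);
    my (undef, undef, $m2)    = parse_matrix(@$lines2);
    my @out = ($header);
    for my $label (@$order) {
        my $v2 = $m2->{$label} or next;            # label missing in file 2: skip
        my $v1 = $m1->{$label};
        # Assumes both rows have the same number of columns.
        my @diff = map { $v1->[$_] - $v2->[$_] } 0 .. $#$v1;
        push @out, join "\t", $label, @diff;
    }
    return @out;
}

# Usage (same argument order as the original script): file1 file2 output
if (@ARGV == 3) {
    my ($file1, $file2, $outfile) = @ARGV;
    my @lines1 = read_lines($file1);
    my @lines2 = read_lines($file2);
    open my $out, '>', $outfile or die "Cannot write $outfile: $!";
    print {$out} "$_\n" for diff_matrices(\@lines1, \@lines2);
    close $out or die "Cannot close $outfile: $!";
}
elsif (@ARGV) {
    die "usage: $0 file1 file2 output\n";
}
```

Because the second file is looked up through a hash, the nested loop over every row of file 2 disappears, and the number of columns never appears in the code. Unlike the original print statement, this sketch separates all output fields with tabs.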
Replies are listed 'Best First'.

Re: compare two data sets (matrix)
by Ratazong (Monsignor) on Apr 15, 2010 at 09:02 UTC

Re: compare two data sets (matrix)
by almut (Canon) on Apr 15, 2010 at 09:54 UTC

Re: compare two data sets (matrix)
by moritz (Cardinal) on Apr 15, 2010 at 09:10 UTC

Re: compare two data sets (matrix)
by youlose (Scribe) on Apr 15, 2010 at 10:05 UTC