Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Dear all, I have hundreds of pathways that I want to compare. Some of the pathways have overlapping nodes. I am interested in obtaining node pairs in the shortest pathway for each.
pwy   nodes
A     a b c d e f
B     a b c
The shortest pathway that a, b and c appear in is pwy B, so I want to output a, b and c pairwise. Similarly, the shortest pathway that d, e and f appear in is A.
desired output
de A
ef A
ab B
ac B
bc B
I have tried all sorts of crazy moves with hashes, but now I think that maybe I should read all of the nodes into a hash of pathways and then loop through, comparing all of the arrays. I know that right now I am just printing out the next-door neighbor nodes, but I also think there must be a better way to do this. Thanks.
example code
use strict;
use warnings;
use Data::Dumper;

my $in = $ARGV[0] || "pathways.col";
open(IN, $in) or die "cannot open $in\n";

my %HoCplx2ID;    # pathway ID => list of node IDs
my %HoPwyPair;    # pair => { pathway ID => pathway size }

while (my $lines = <IN>) {
    next if ($lines =~ /^#/);
    next if ($lines =~ /^UNIQUE-ID/);
    chomp $lines;
    my @cols     = split(/\t/, $lines);
    my $cmplxID  = $cols[0];
    #print $cmplxID."\n";
    my $cmplxNm  = $cols[1];
    my @restCols = @cols[2 .. $#cols];
    my @cycIDs   = grep(/^GCXG-/, @restCols);
    @cycIDs = grep($_ ne '', @cycIDs);
    print "cycIDs array\n";
    print Dumper(\@cycIDs);    # pass a reference so the array is dumped as one structure
    my $pwySize = scalar(@cycIDs);
    push(@{ $HoCplx2ID{$cmplxID} }, @cycIDs);
    # record each next-door neighbor pair with the size of its pathway
    for (my $i = 0; $i < ($pwySize - 1); $i++) {
        my $pair = join("\t", $cycIDs[$i], $cycIDs[$i + 1]);
        $HoPwyPair{$pair}{$cmplxID} = $pwySize;
    }
}
close(IN);

########## print out pairwise with PA01 locusIDs ######
my $org     = $ARGV[1] || "PA01";
my $outfile = "$in.$org.pairwise.nxtNeighb.tab";
#open (OUT,">",$outfile);

### step 1 for each pair find smallest pathway
my %HoSmPwy;
foreach my $pair (keys %HoPwyPair) {
    $HoSmPwy{$pair} = 100;    # starting value; assumes no pathway has 100 or more nodes
    foreach my $pwy (keys %{ $HoPwyPair{$pair} }) {
        if ($HoPwyPair{$pair}{$pwy} < $HoSmPwy{$pair}) {
            $HoSmPwy{$pair} = $HoPwyPair{$pair}{$pwy};
        }
    }
}
print "hash of smallest pathways\n";
#print Dumper(%HoSmPwy);

### step 2 for each pathway, look at each pair; if that pwy size = smallest pathway, then print
print "output\n";
foreach my $pwy (keys(%HoCplx2ID)) {
    my @units   = @{ $HoCplx2ID{$pwy} };
    my $pwySize = scalar(@units);
    for (my $i = 0; $i < ($pwySize - 1); $i++) {
        my $pair = join("\t", $units[$i], $units[$i + 1]);
        if ($pwySize == $HoSmPwy{$pair}) {    # numeric comparison (was an assignment, "=")
            # print $pair."\n";
        }
    }
}
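One possible reading of the requirement, sketched under assumptions rather than offered as a definitive solution: find the smallest pathway each node appears in, group the nodes by that pathway, and print every pair within each group. The sub name shortest_pathway_pairs, the hash-of-arrays input, the tab-separated output and the tie-break (alphabetically first pathway wins) are all hypothetical choices for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: takes { pathway => [nodes] } and returns
# "node1<TAB>node2<TAB>pathway" lines.
sub shortest_pathway_pairs {
    my ($pathways) = @_;

    # For each node, remember the smallest pathway it occurs in.
    # Ties go to the alphabetically first pathway (an assumption).
    my %best;    # node => pathway name
    for my $pwy (sort keys %$pathways) {
        my $size = scalar @{ $pathways->{$pwy} };
        for my $node (@{ $pathways->{$pwy} }) {
            my $cur = $best{$node};
            if (!defined $cur || $size < scalar @{ $pathways->{$cur} }) {
                $best{$node} = $pwy;
            }
        }
    }

    # Within each pathway, keep only the nodes assigned to it
    # (in their listed order, duplicates dropped) and pair them all.
    my @lines;
    for my $pwy (sort keys %$pathways) {
        my %seen;
        my @mine = grep { $best{$_} eq $pwy && !$seen{$_}++ }
                   @{ $pathways->{$pwy} };
        for my $i (0 .. $#mine - 1) {
            for my $j ($i + 1 .. $#mine) {
                push @lines, join("\t", $mine[$i], $mine[$j], $pwy);
            }
        }
    }
    return @lines;
}

# The two pathways from the example table above.
my %pathways = (
    A => [qw(a b c d e f)],
    B => [qw(a b c)],
);
print "$_\n" for shortest_pathway_pairs(\%pathways);
```

On the example data this prints the pairs d e, d f and e f for A, then a b, a c and b c for B. Note that it also emits d f, which is not in the desired output listed above; if only next-door neighbors are wanted, the inner loop over $j could be limited to $i + 1.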
Re: array comparisons
by McA (Priest) on Oct 27, 2014 at 16:06 UTC