Dear all, I have hundreds of pathways that I want to compare. Some of the pathways have overlapping nodes. For each set of nodes, I want to output the node pairs from the shortest pathway that contains them.

pwy    nodes
A      a b c d e f
B      a b c

The shortest pathway that abc appear in is pwy B, so I want to output abc as pairs. Similarly, the shortest pathway that def appear in is A.

desired output

de    A
ef    A
ab    B
ac    B
bc    B

I have tried all sorts of crazy moves with hashes, but now I think that maybe I should read all of the nodes into a hash of pathways and then loop through, comparing all of the arrays. I know that right now I am only printing out the next-door neighbor nodes, but I also think there must be a better way to do this. Thanks.

example code

use strict;
use warnings;
use Data::Dumper;

my $in = $ARGV[0] || "pathways.col";
open(my $IN, '<', $in) or die "cannot open $in: $!\n";

my %HoCplx2ID;    # pathway ID => list of node IDs
my %HoPwyPair;    # neighbor pair => { pathway ID => pathway size }

while (my $line = <$IN>) {
    next if $line =~ /^#/;
    next if $line =~ /^UNIQUE-ID/;
    chomp $line;
    my @cols     = split /\t/, $line;
    my $cmplxID  = $cols[0];
    #print $cmplxID."\n";
    my $cmplxNm  = $cols[1];
    my @restCols = @cols[2 .. $#cols];
    my @cycIDs   = grep { /^GCXG-/ } @restCols;
    print "cycIDs array\n";
    print Dumper(\@cycIDs);    # pass a reference so the array is dumped as one structure
    my $pwySize = scalar @cycIDs;
    push @{ $HoCplx2ID{$cmplxID} }, @cycIDs;
    for my $i (0 .. $pwySize - 2) {
        my $pair = join "\t", $cycIDs[$i], $cycIDs[$i + 1];
        $HoPwyPair{$pair}{$cmplxID} = $pwySize;
    }
}
close $IN;

########## print out pairwise with PA01 locusIDs ######
my $org     = $ARGV[1] || "PA01";
my $outfile = "$in.$org.pairwise.nxtNeighb.tab";
#open (OUT,">",$outfile);

### step 1: for each pair, find the size of the smallest pathway
my %HoSmPwy;
foreach my $pair (keys %HoPwyPair) {
    foreach my $pwy (keys %{ $HoPwyPair{$pair} }) {
        my $size = $HoPwyPair{$pair}{$pwy};
        # no fixed starting value (was 100), so larger pathways also work
        $HoSmPwy{$pair} = $size
            if !defined $HoSmPwy{$pair} || $size < $HoSmPwy{$pair};
    }
}
print "hash of smallest pathways\n";
#print Dumper(\%HoSmPwy);

### step 2: for each pathway, look at each pair;
### if that pwy size == smallest pathway, then print
print "output\n";
foreach my $pwy (sort keys %HoCplx2ID) {
    my @units   = @{ $HoCplx2ID{$pwy} };
    my $pwySize = scalar @units;
    for my $i (0 .. $pwySize - 2) {
        my $pair = join "\t", $units[$i], $units[$i + 1];
        # '==' compares; the original '=' assigned, so the test was always true
        print "$pair\t$pwy\n" if $pwySize == $HoSmPwy{$pair};
    }
}
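
One possible way to get beyond next-door neighbors is sketched below. It is not a definitive answer, and it rests on an assumption about what is wanted: each node is first assigned to the smallest pathway that contains it, and then every pair of nodes that share the same assigned pathway is printed. The sub name pairs_in_shortest_pathway and the hard-coded %pathways hash are made up for illustration, and ties between equally small pathways go to the pathway ID that sorts first. Note that under this reading the pair d/f is also reported for pathway A, which the desired output above does not list.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post).
# Input:  hash ref, pathway ID => array ref of node IDs.
# Output: list of "node1<TAB>node2<TAB>pathway" strings.
sub pairs_in_shortest_pathway {
    my ($pathways) = @_;

    # Step 1: for each node, remember the smallest pathway containing it.
    my %home;    # node => pathway ID
    for my $pwy (sort keys %$pathways) {
        my $size = scalar @{ $pathways->{$pwy} };
        for my $node (@{ $pathways->{$pwy} }) {
            $home{$node} = $pwy
                if !exists $home{$node}
                || $size < scalar @{ $pathways->{ $home{$node} } };
        }
    }

    # Step 2: within each pathway, pair up all nodes assigned to it.
    my @lines;
    for my $pwy (sort keys %$pathways) {
        my @own = grep { $home{$_} eq $pwy } @{ $pathways->{$pwy} };
        for my $i (0 .. $#own - 1) {
            for my $j ($i + 1 .. $#own) {
                push @lines, join "\t", $own[$i], $own[$j], $pwy;
            }
        }
    }
    return @lines;
}

# The example data from the question.
my %pathways = (
    A => [qw(a b c d e f)],
    B => [qw(a b c)],
);
print "$_\n" for pairs_in_shortest_pathway(\%pathways);
```

With the example data this prints six lines: d/e, d/f and e/f for A, then a/b, a/c and b/c for B. To use it with the file-reading code above, %HoCplx2ID could be passed in place of %pathways.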

In reply to array comparisons by Anonymous Monk
