BhariD has asked for the wisdom of the Perl Monks concerning the following question:
I have the following input file. Each line contains a pair of IDs separated by a tab. Each pair is a combination and is relevant.
DATA
YP_01 NP_02
NP_02 YP_01
YP_01 NP_03
NP_03 YP_01
NP_02 NP_03
NP_03 NP_02
NP_04 NP_05
.....
Here is the code I am using to get the reciprocal best hit pairs from this input file. For example, in the input file above, YP_01\tNP_02 and NP_02\tYP_01 form a reciprocal best hit pair. NP_04\tNP_05 is not a reciprocal hit pair because the file lacks the reverse combination, i.e. NP_05\tNP_04.
# $results holds the path to the input file
open( my $in, '<', $results ) or die "couldn't open '$results': $!";

my %unique;
while ( my $line = <$in> ) {
    chomp $line;
    # Sort the two IDs so that "A\tB" and "B\tA" share one hash key
    $unique{ join( "\t", sort split( /\t/, $line ) ) }++;
}
close $in;

my ( @pairs, @not_pairs );
for my $item ( sort keys %unique ) {
    if ( $unique{$item} > 1 ) {
        push @pairs, $item;        # key seen more than once
    }
    else {
        push @not_pairs, $item;    # key seen only once
    }
}
print join( "\n", @pairs ), "\n";
This gives the following result file:
YP_01 NP_02
YP_01 NP_03
NP_02 NP_03
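Because the code above sorts the two IDs to build each hash key, the printed pairs come out in sorted order rather than in the order they appear in the file, and a line repeated twice in the same direction is also counted as a pair. The following is a minimal sketch of an alternative, not the poster's method: it records each directed pair and reports a pair, in the orientation first seen, only when the reverse direction is also present. The subroutine name `reciprocal_pairs` and the sample list `@sample_lines` are hypothetical, and splitting on any whitespace is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: return reciprocal pairs in first-seen orientation.
# Assumption: each line holds two IDs separated by whitespace (e.g. a tab).
sub reciprocal_pairs {
    my @lines = @_;
    my ( %seen, %reported, @pairs );

    # First pass: remember every directed pair "left\tright".
    for my $line (@lines) {
        my ( $left, $right ) = split ' ', $line;
        next unless defined $right;    # skip blank or malformed lines
        $seen{"$left\t$right"} = 1;
    }

    # Second pass: keep a pair only if its reverse was also seen,
    # and report each unordered pair at most once.
    for my $line (@lines) {
        my ( $left, $right ) = split ' ', $line;
        next unless defined $right;
        my $key = join "\t", sort { $a cmp $b } ( $left, $right );
        next if $reported{$key}++;
        push @pairs, "$left\t$right" if $seen{"$right\t$left"};
    }
    return @pairs;
}

my @sample_lines = (
    "YP_01\tNP_02", "NP_02\tYP_01", "YP_01\tNP_03", "NP_03\tYP_01",
    "NP_02\tNP_03", "NP_03\tNP_02", "NP_04\tNP_05",
);
print "$_\n" for reciprocal_pairs(@sample_lines);
```

With the sample data this prints the three pairs YP_01/NP_02, YP_01/NP_03 and NP_02/NP_03, and leaves out NP_04/NP_05.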
The problem is that I now need to modify the code so that, instead of the result file above, it produces the following:
YP_01 NP_02
YP_01 NP_03
3
The idea is that if YP_01 is paired with both NP_02 and NP_03, and NP_02 is also paired with NP_03, then I do not need the pair NP_02 and NP_03 printed in the output file, because their information is already there in combination with YP_01. But I do need to know whether the pair NP_02 and NP_03 existed, which is why I need a number at the end. As shown above, the "3" at the end indicates that all three best hit pairs are present (YP_01 and NP_02; YP_01 and NP_03; NP_02 and NP_03). If NP_02 and NP_03 were not present, the number should change to 2, showing that only 2 of the 3 comparisons are there.
Could anyone help me with this? Please let me know if you need further clarification.
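One way to read this requirement can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive solution: it assumes the "hub" is the first ID of the first reciprocal pair, that the group is the hub plus every ID paired with it, and that only one group needs handling. The subroutine name `summarise_group` is hypothetical. It returns the hub's pairs followed by the number of pairs present among the group's members.

```perl
use strict;
use warnings;

# Hypothetical helper: list the hub's pairs, then count how many of the
# possible pairs within the hub's group are present.
# Assumption: the hub is the first ID of the first pair passed in.
sub summarise_group {
    my @pairs = @_;    # each element: "id1\tid2", already reciprocal
    return unless @pairs;

    # $linked{$x}{$y} is true when x and y form a reciprocal pair.
    my %linked;
    for my $pair (@pairs) {
        my ( $left, $right ) = split /\t/, $pair;
        $linked{$left}{$right} = $linked{$right}{$left} = 1;
    }

    my ($hub)   = split /\t/, $pairs[0];
    my @members = sort keys %{ $linked{$hub} };
    my @lines   = map { "$hub\t$_" } @members;

    # Start with the hub-to-member pairs, then add member-to-member pairs.
    my $count = @members;
    for my $i ( 0 .. $#members ) {
        for my $j ( $i + 1 .. $#members ) {
            $count++ if $linked{ $members[$i] }{ $members[$j] };
        }
    }
    return ( @lines, $count );
}

print "$_\n"
    for summarise_group( "YP_01\tNP_02", "YP_01\tNP_03", "NP_02\tNP_03" );
```

For the three pairs above, the sketch prints the two YP_01 pairs followed by 3. If the NP_02/NP_03 pair is left out of the input, the final number becomes 2. Partners are listed in sorted order, and input containing several independent groups would need an outer loop over hubs, which is not shown here.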
Replies are listed 'Best First'.

Re: help needed in modifying the code for counting possible combinations
by GrandFather (Saint) on Oct 30, 2009 at 22:34 UTC
by BhariD (Sexton) on Oct 31, 2009 at 14:40 UTC
by GrandFather (Saint) on Oct 31, 2009 at 22:15 UTC
by BhariD (Sexton) on Nov 01, 2009 at 01:36 UTC

Re: help needed in modifying the code for counting possible combinations
by gman (Friar) on Oct 30, 2009 at 20:16 UTC

Re: help needed in modifying the code for counting possible combinations
by Anonymous Monk on Oct 30, 2009 at 19:15 UTC