The idea behind parsing this data file comes from biology: the goal is to extract "orthologous sequences". Orthologous sequences are sequences that belong to different species and descend from a single homologue in the common ancestor of those species. For example, if the file contains both NP_01-NP_02 and its reciprocal NP_02-NP_01, those two sequences form a stable pair of orthologous sequences in species 01 and 02.
With pairwise comparisons among three species, the sequences NP_01, NP_02, and NP_03 can form 3 possible stable pairs: NP_01-NP_02, NP_01-NP_03, and NP_02-NP_03. This is why I needed to know whether all of the pairs are present: if any one is missing, the group cannot be defined as an "ortholog set", that is, an orthologous sequence present in all three species.
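Here is a minimal sketch of that check, written in Python for illustration. It assumes each line of the data file has already been read into a (first ID, second ID) tuple; the function names `stable_pairs` and `is_ortholog_set` and the example tuples are made up for this sketch and are not taken from the real data file.

```python
from itertools import combinations

def stable_pairs(directed_pairs):
    """Return the unordered pairs that occur in both directions."""
    seen = set(directed_pairs)
    return {frozenset((a, b)) for a, b in seen if a != b and (b, a) in seen}

def is_ortholog_set(ids, stable):
    """True if every pairwise combination of ids is a stable pair."""
    return all(frozenset(p) in stable for p in combinations(ids, 2))

# Hypothetical input: one (first, second) tuple per data-file line.
pairs = [
    ("NP_01", "NP_02"), ("NP_02", "NP_01"),
    ("NP_01", "NP_03"), ("NP_03", "NP_01"),
    ("NP_02", "NP_03"),  # the reciprocal NP_03 -> NP_02 is missing
]

stable = stable_pairs(pairs)
print(is_ortholog_set(["NP_01", "NP_02", "NP_03"], stable))  # False
print(is_ortholog_set(["NP_01", "NP_02"], stable))           # True
```

Because NP_02-NP_03 has no reciprocal in this made-up input, only two of the three stable pairs exist, so the three sequences do not qualify as an ortholog set.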
I really appreciate your help. But nothing is ever easy: as I was going through my data file, I found pairwise comparisons among more than three species, sometimes as many as 13 (see the sample below), which makes it even more complicated.
Added section of the data file:
NP_08 NP_09
NP_08 NP_10
NP_08 NP_11
NP_08 NP_12
NP_08 NP_13
NP_09 NP_10
NP_09 NP_11
NP_09 NP_13
NP_12 NP_13
For convenience, I did not show the reciprocals of these pairs, but they do exist in the file.
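To show why the larger groups are harder, here is a rough Python sketch that lists which pairs would still be needed for the sample above to be a complete set. It assumes the IDs are whitespace-separated and read two at a time, and it treats every pair as already reciprocal, as stated above. The helper names `parse_pairs` and `missing_pairs` are hypothetical.

```python
from itertools import combinations

def parse_pairs(text):
    """Read whitespace-separated IDs two at a time as unordered pairs."""
    tokens = text.split()
    return {frozenset(tokens[i:i + 2]) for i in range(0, len(tokens) - 1, 2)}

def missing_pairs(pairs):
    """List every pair of IDs that does not appear in the input."""
    ids = sorted(set().union(*pairs))
    return [p for p in combinations(ids, 2) if frozenset(p) not in pairs]

# The sample section of the data file, reciprocals omitted.
sample = """
NP_08 NP_09  NP_08 NP_10  NP_08 NP_11  NP_08 NP_12  NP_08 NP_13
NP_09 NP_10  NP_09 NP_11  NP_09 NP_13  NP_12 NP_13
"""

pairs = parse_pairs(sample)
missing = missing_pairs(pairs)
print(len(pairs), "pairs present,", len(missing), "missing")
for a, b in missing:
    print(a, b)
```

For six sequences a complete set needs 15 pairs. The sample has 9, so under these assumptions the six IDs together would not form a single ortholog set, although smaller subsets of them (for example NP_08, NP_09, NP_10) still have all of their pairs.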
Thank you again!
In reply to Re^4: help needed in modifying the code for counting possible combinations
by BhariD
in thread help needed in modifying the code for counting possible combinations
by BhariD