Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Dear Monks,
Another newbie here, trying to make sense of hashes and how best to use them (if they are what I need) for the following problem:
Assume the following file, where each 'entry' has 3 lines, namely:
>id_1|id_2
sequence_of_chars
label_of_chars
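A layout like this can be read three lines at a time. The following is only a minimal sketch under stated assumptions: the sub name read_entries and the hash keys id_1, id_2, seq and label are invented for illustration, and it assumes every entry is exactly three lines with no blank lines in between.

```perl
use strict;
use warnings;

# Hypothetical helper: read 3-line entries from a filehandle and
# return a list of hash references, one per entry.
sub read_entries {
    my ($fh) = @_;
    my @entries;
    while ( defined( my $header = <$fh> ) ) {
        my $seq   = <$fh>;
        my $label = <$fh>;
        # Stop if the file ends in the middle of an entry.
        last unless defined $seq && defined $label;
        chomp( $header, $seq, $label );

        # The header is expected to look like ">id_1|id_2";
        # entries whose header does not match are skipped.
        next unless $header =~ /^>([^|]+)\|(\S+)/;
        push @entries, { id_1 => $1, id_2 => $2, seq => $seq, label => $label };
    }
    return @entries;
}

# Demo on an in-memory filehandle instead of a real file.
my $sample = join '',
    ">4kt0_M|P72986\n",
    "MALSDTQILAALVVALLPAFLAFRLSTELYK\n",
    "iiiiiiiiiMMMMMMMMMMMMMMMMMIIIII\n";
open my $sample_fh, '<', \$sample or die "Cannot open in-memory file: $!";
for my $entry ( read_entries($sample_fh) ) {
    print "id_1=$entry->{id_1} id_2=$entry->{id_2}\n";
}
close $sample_fh;
```

With a real file, the in-memory handle would be replaced by an ordinary open on the file name.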
Now, what I want is to store only the unique entries. In my case, two entries count as duplicates when they have the same id_2 and the same sequence_of_chars. The label_of_chars does not matter much, as it will vary only slightly when the other two are the same. The only other difference is the id_1, of which there can be several, and I don't care which one I keep. Example below:
>4kt0_M|P72986
MALSDTQILAALVVALLPAFLAFRLSTELYK
iiiiiiiiiMMMMMMMMMMMMMMMMMIIIII
>6uzv_m|P72986
MALSDTQILAALVVALLPAFLAFRLSTELYK
iiiiiiiiiiiiMMMMMMMMMMMMMMMMMII
>5oy0_m|P72986
MALSDTQILAALVVALLPAFLAFRLSTELYK
iiiiiiiiiMMMMMMMMMMMMMMMMMIIIII
>6hqb_M|P72986
MALSDTQILAALVVALLPAFLAFRLSTELYK
iiiiiiiiiiiMMMMMMMMMMMMMMIIIIII
From the example above, the desired output would be any one of 4kt0_M, 6uzv_m, 5oy0_m or 6hqb_M followed by |P72986, then the sequence MALSDTQILAALVVALLPAFLAFRLSTELYK on the line below, and any one of the 4 available labels. Are hashes the way to go? I can split the line starting with > and store each of the 4 elements into variables, but I don't know how to proceed from there.
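A hash is a natural fit for this kind of de-duplication. As a hedged sketch rather than a definitive answer: build a key from id_2 and the sequence, and keep only the first entry seen for each key. The sub name unique_entries and the field names are hypothetical, the entries are assumed to be already split into fields, and a tab is used as the key separator on the assumption that neither field contains one.

```perl
use strict;
use warnings;

# The four example entries from the post, already split into fields.
my @entries = (
    { id_1 => '4kt0_M', id_2 => 'P72986',
      seq   => 'MALSDTQILAALVVALLPAFLAFRLSTELYK',
      label => 'iiiiiiiiiMMMMMMMMMMMMMMMMMIIIII' },
    { id_1 => '6uzv_m', id_2 => 'P72986',
      seq   => 'MALSDTQILAALVVALLPAFLAFRLSTELYK',
      label => 'iiiiiiiiiiiiMMMMMMMMMMMMMMMMMII' },
    { id_1 => '5oy0_m', id_2 => 'P72986',
      seq   => 'MALSDTQILAALVVALLPAFLAFRLSTELYK',
      label => 'iiiiiiiiiMMMMMMMMMMMMMMMMMIIIII' },
    { id_1 => '6hqb_M', id_2 => 'P72986',
      seq   => 'MALSDTQILAALVVALLPAFLAFRLSTELYK',
      label => 'iiiiiiiiiiiMMMMMMMMMMMMMMIIIIII' },
);

# Hypothetical helper: keep the first entry seen for each
# (id_2, sequence) pair and drop the later ones.
sub unique_entries {
    my @in = @_;
    my ( %seen, @unique );
    for my $entry (@in) {
        # Assumes neither field contains a tab character.
        my $key = join "\t", $entry->{id_2}, $entry->{seq};
        next if $seen{$key}++;    # already stored an entry with this key
        push @unique, $entry;
    }
    return @unique;
}

# Print the survivors in the original 3-line layout.
for my $entry ( unique_entries(@entries) ) {
    print ">$entry->{id_1}|$entry->{id_2}\n$entry->{seq}\n$entry->{label}\n";
}
```

For the four example entries this keeps only the 4kt0_M one, because it is the first with that id_2 and sequence. Keeping the last one instead would mean overwriting a hash value on every match rather than skipping, and the output order would then need separate handling.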