Re: unique sequences

Hi Anonymous Monk,

it is difficult to guess what you should be obtaining without seeing the input, but the output you get is consistent with the code you've shown. Your code basically discards the "comments" (that's what I call the lines starting with >, for lack of a better term) and then looks for sequences of ten nucleotides (I hope this is the right term) followed by GG. And that's pretty much what you have in your output. So, as far as I can tell, you get what you ask for.
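To make that reading concrete, here is a minimal sketch of what I understand the code to be doing. It is not your actual code: the sub name extract_sequences, the sample lines and the exact regex are my own assumptions, since I have not seen your input.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every run of ten nucleotides
# that is immediately followed by GG, skipping the ">" lines.
sub extract_sequences {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        next if $line =~ /^>/;                 # discard the ">" header lines
        while ($line =~ /([ACGT]{10})GG/g) {   # ten nucleotides, then GG
            $seen{$1}++;                       # hash keys are unique by nature
        }
    }
    return \%seen;
}

# Made-up sample data, only for illustration.
my @sample = (
    '>seq1 some description',
    'ACGTACGTACGGTTTT',
    '>seq2',
    'TTTTTTTTTTGGACGTACGTAAGG',
);

my $found = extract_sequences(@sample);
print "$_\n" for sort keys %$found;
```

Note that with /g the matches do not overlap, so a candidate starting inside a previous match is not reported. If that is what you are missing, please say so.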

Please explain in plain English what you need to extract and in which respect the output you get is not what you want or need.

As a side note, which may or may not matter here: please remember that a hash does not preserve the order in which the data were inserted into it.
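If the order does matter to you, here is a small sketch of the usual ways around it. The sub name unique_in_order and the sample strings are mine, purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the unique items of a list in first-seen order.
# The hash only answers "seen before?"; grep walks the list in its own order.
sub unique_in_order {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample data with one duplicate.
my @found = qw(TTTTTTTTTT ACGTACGTAC TTTTTTTTTT ACGTACGTAA);

my %count;
$count{$_}++ for @found;

print join(' ', keys %count), "\n";               # unspecified order, may vary between runs
print join(' ', sort keys %count), "\n";          # alphabetical order
print join(' ', unique_in_order(@found)), "\n";   # order of first appearance
```

Sorting the keys gives a reproducible output; the grep idiom is the one to use if you need the sequences in the order they occur in the file.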
