in reply to finding tuples

BioPerl won't help you with this problem, since it is not really a biological question beyond the use of residue symbols.
If I understand correctly, though, you want the maximum number of 4-character tuples?
Then you could try some kind of path-search algorithm and compare the different paths to find the longest.
You would need some way of remembering your search path, splitting the path every time you reach a point where you could take alternate routes. You have some simple rules already: for example, if two sequential letters are the same, you can form either an 'AAAA' type tuple or an 'ABCD' type, etc.

There is a lot of information on path-search / graph-search algorithms on Wikipedia and elsewhere, e.g. http://en.wikipedia.org/wiki/Dijkstra's_algorithm. Dijkstra's algorithm finds the shortest path; you just want the opposite, the longest.
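
For what it's worth, here is a rough sketch of the "split the path wherever there is a choice" idea as a plain depth-first search with backtracking. It is only a sketch under assumptions: I am guessing that a tuple is either four identical letters ('AAAA') or four distinct letters ('ABCD') and that every letter must be used exactly once. The names decompose and _search are made up for illustration.

```perl
use strict;
use warnings;

# Assumed rules (not confirmed by the original question): a tuple is four
# identical letters ('AAAA') or four distinct letters ('ABCD'), and every
# letter of the input must be used exactly once.
sub decompose {
    my ($letters) = @_;                 # string such as 'AAAABCDE'
    my %count;
    $count{$_}++ for split //, $letters;
    my %solutions;                      # canonical key => list of tuples
    _search(\%count, [], \%solutions);
    return [ map { $solutions{$_} } sort keys %solutions ];
}

sub _search {
    my ($count, $path, $solutions) = @_;
    my @left = grep { $count->{$_} > 0 } sort keys %$count;
    if (!@left) {                       # nothing left: complete decomposition
        my @tuples = sort @$path;
        # The same set of tuples can be reached in different orders,
        # so store it under a sorted key to keep only one copy.
        $solutions->{ join '|', @tuples } = \@tuples;
        return;
    }
    my ($first, @rest) = @left;         # always place the smallest letter next

    # Branch 1: an 'AAAA' type tuple.
    if ($count->{$first} >= 4) {
        $count->{$first} -= 4;
        _search($count, [ @$path, $first x 4 ], $solutions);
        $count->{$first} += 4;          # undo before trying the other branch
    }

    # Branch 2: an 'ABCD' type tuple, $first plus three other distinct letters.
    for my $i (0 .. $#rest) {
        for my $j ($i + 1 .. $#rest) {
            for my $k ($j + 1 .. $#rest) {
                my @tuple = ($first, @rest[$i, $j, $k]);
                $count->{$_}-- for @tuple;
                _search($count, [ @$path, join '', @tuple ], $solutions);
                $count->{$_}++ for @tuple;
            }
        }
    }
    return;
}
```

Calling decompose('AAAABCDE') returns a reference to a list of solutions, each of which is a list of tuple strings; an empty list means the letters cannot be completely decomposed under the assumed rules. The search is exhaustive, so it will get slow for long inputs.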

I would be interested in seeing what you come up with.

Just a something something...

Re^2: finding tuples
by Anonymous Monk on Jun 24, 2009 at 08:00 UTC
    you want the maximum number of 4 character tuples?

    Not quite. The maximum number is not interesting unless the tuples make up the whole set. In other words: either a set cannot be completely decomposed, in which case it has no solution; or it can be completely decomposed in certain ways, in which case those are the solutions.

    I have read the material about search paths, but I cannot see how it relates to a set. Can you tell me how I make a search path out of a set?

      I don't have much experience with this, but I think the first thing you would need to do is work out some kind of tuple set descriptor, probably the positions of the letters once you have broken them down into an array, e.g.
      $paths = [$path1, $path2, $pathN];    # ... one entry per path

      With each path being:
      $path = [[1,2,3,4], [5,8,9,11], [6,10,12,13]];    # etc...

      You would then need a way of continuing the search from a search path stub (i.e. a search path that has reached a breakpoint). After that, it is a matter of being really clear about what could cause a breakpoint...
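
      Something along these lines might do as a starting point for the stub handling. It is only a sketch: the rule in is_tuple (four letters all the same or all different) is my assumption about the problem, the names is_tuple, extend_stub and all_paths are made up, and positions are counted from 1 as in the $path example above.

      ```perl
      use strict;
      use warnings;

      # Assumed rule: four letters form a tuple when they are all the same
      # ('AAAA') or all different ('ABCD').
      sub is_tuple {
          my @l = @_;
          my %seen;
          $seen{$_}++ for @l;
          my $distinct = keys %seen;
          return @l == 4 && ($distinct == 1 || $distinct == 4);
      }

      # Given the letters and a stub (array ref of position tuples), return
      # every stub that is one tuple longer. The new tuple always contains the
      # lowest unused position, so tuples are added in a fixed order.
      sub extend_stub {
          my ($letters, $stub) = @_;
          my %used = map { $_ => 1 } map { @$_ } @$stub;
          my @free = grep { !$used{$_} } 1 .. scalar @$letters;
          return () if @free < 4;
          my ($p, @rest) = @free;
          my @next;
          for my $i (0 .. $#rest) {
              for my $j ($i + 1 .. $#rest) {
                  for my $k ($j + 1 .. $#rest) {
                      my @pos = ($p, @rest[$i, $j, $k]);
                      # Positions are 1-based, array indexes are 0-based.
                      push @next, [ @$stub, [@pos] ]
                          if is_tuple(map { $letters->[$_ - 1] } @pos);
                  }
              }
          }
          return @next;
      }

      # Work through a list of stubs until each one is complete or dead.
      sub all_paths {
          my ($letters) = @_;
          my @todo = ([]);                # start with one empty stub
          my @paths;
          while (defined(my $stub = shift @todo)) {
              if (4 * @$stub == @$letters) { push @paths, $stub; next; }
              # A breakpoint is a stub with more than one continuation;
              # a dead end is a stub with none.
              push @todo, extend_stub($letters, $stub);
          }
          return \@paths;
      }
      ```

      all_paths([qw(A A A A B C D E)]) would give a reference to the list of complete paths. Note that paths are distinguished by position, so a set with repeated letters yields many paths that look identical once you map the positions back to letters.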

      I am not being very helpful here, but it is a tough question! I suspect you may need to fork the job out as you go (Parallel::ForkManager and Introduction to Parallel::ForkManager) and recombine the output of the child processes (IPC::Shareable).

      Just a something something...