in reply to Making a hash with groups of IDs

This is the fourth in a series of questions relating to what appears to be the same project. Maybe it is time to stand back a little and describe your overall project rather than have us continually trying to guess what you are trying to achieve and squeezing information out of you a single small drop at a time?

So far we have some information concerning the format of a couple of input files. We know that at least one of these is big. We know that you are selecting some data based on some other data. We know there is a third file involved.

We don't know what you are trying to achieve in a "big picture" way. We don't know if this is a one-off. If it is not a one-off, we don't know how the input data changes over time. We don't know if you need to perform multiple searches with the same data.

You seem to focus on answering only a few of the questions you've been asked, and you seem to be looking for a quick-fix solution to what is probably a small part of the problem. The more we know about the high-level problem, the more we can offer ways to address the big issues.

True laziness is hard work

Re^2: Making a hash with groups of IDs
by jemswira (Novice) on Feb 08, 2012 at 13:00 UTC

    So my entire project is to use LIBSVM to predict a protein's function. What I have been doing is taking the Protein database's list of protein numbers (accession numbers, the six-character identifiers) and matching them to their PF numbers, i.e. which PF groups each protein belongs to. The two files I have been using are shown above. Now I have managed to get the data into the following format:

    B3T3Y0 | PF02517.11
    B3T4D5 | PF13371.1 PF13369.1
    B3T4G0 | PF13607.1
    B3T516 | PF08438.5
    B3T517 | PF13207.1 PF13238.1
    B3T644 | PF14382.1
    B3T662 | PF13248.1
    B3T663 | PF13248.1 PF13248.1 PF13240.1

    which is what I actually wanted all along. Thanks, Count Zero. What I don't know yet is how to get it into LIBSVM, but my groupmate is looking into that. Thanks, everyone!
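
For the LIBSVM step, one possible approach is sketched below in Python. It is not anything confirmed in this thread. It assumes one record per line in the `ACCESSION | PF ...` format shown above, treats each distinct PF ID as a binary feature, and writes LIBSVM's sparse text format (`label index:value ...`, with indices starting at 1 and in ascending order). The function names and the `labels` mapping are hypothetical. The thread never says where the class labels (the protein functions) come from, so unlabeled proteins get a placeholder label of 0.

```python
def parse_records(lines):
    """Parse 'ACCESSION | PF1 PF2 ...' lines into {accession: [PF ids]}."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line or "|" not in line:
            continue  # skip blank or malformed lines
        accession, _, pf_part = line.partition("|")
        # Drop repeated PF ids (e.g. PF13248.1 listed twice), keeping order.
        groups[accession.strip()] = list(dict.fromkeys(pf_part.split()))
    return groups


def build_feature_index(groups):
    """Assign each distinct PF id a 1-based feature index (sorted by id)."""
    all_pfs = sorted({pf for pfs in groups.values() for pf in pfs})
    return {pf: i for i, pf in enumerate(all_pfs, start=1)}


def to_libsvm_lines(groups, feature_index, labels):
    """Build one 'label idx:1 idx:1 ...' line per protein.

    `labels` is a hypothetical {accession: numeric class} mapping;
    proteins missing from it get the placeholder label 0.
    """
    out = []
    for accession, pfs in groups.items():
        indices = sorted(feature_index[pf] for pf in pfs)
        features = " ".join(f"{i}:1" for i in indices)
        out.append(f"{labels.get(accession, 0)} {features}")
    return out


if __name__ == "__main__":
    sample = [
        "B3T3Y0 | PF02517.11",
        "B3T663 | PF13248.1 PF13248.1 PF13240.1",
    ]
    groups = parse_records(sample)
    index = build_feature_index(groups)
    for row in to_libsvm_lines(groups, index, labels={}):
        print(row)
```

Whether PF membership alone makes a useful feature set, and whether the version suffix (the `.11` in `PF02517.11`) should be stripped before indexing, are open design choices that depend on the rest of the project.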