in thread Comparing strings (exact matches) in LARGE numbers FAST

No, the sequences do not fall into fixed categories. The first file is sequencing output, with reads that can be 25-100 bp long. The sequences in the second file could be many things, biologically. We are looking for certain "patterns" or "motifs" in the sequences.

Re^3: Comparing strings (exact matches) in LARGE numbers FAST
by bioinformatics (Friar) on Aug 29, 2008 at 22:10 UTC
    If you are scanning for motifs, you should be able to adapt TFBS to do that, since a binding site is itself a motif. STORM is another program designed for this kind of search, and it can be integrated with a database (see "Statistical significance of cis-regulatory modules", BMC Bioinformatics 2007, 8:19). I think that officially puts me out of ideas otherwise :)
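
To make "scanning for motifs" concrete, here is a minimal sketch in Python of a plain regex scan. It is not how TFBS or STORM work, and it uses neither tool's API. The function name `find_motif`, the read IDs, the sequences, and the motif `TATA[AT]A` are all made up for illustration, and it assumes the motif can be written as a regular expression and that the sequences fit in memory.

```python
import re

def find_motif(motif, sequences):
    """Yield (sequence_id, 0-based start, matched text) for every hit.

    `motif` is a regular expression; `sequences` maps an ID to a sequence string.
    Both are assumptions made for this sketch, not part of the original question.
    """
    # Wrapping the motif in a lookahead lets overlapping occurrences be reported.
    pattern = re.compile(r"(?=(" + motif + r"))")
    for seq_id, seq in sequences.items():
        for match in pattern.finditer(seq):
            yield seq_id, match.start(), match.group(1)

# Hypothetical data: two short reads and a TATA-box-like motif.
sequences = {"read1": "ACGTATATAAGC", "read2": "GGGTATAAATATATA"}
hits = list(find_motif("TATA[AT]A", sequences))

for seq_id, start, text in hits:
    print(seq_id, start, text)
```

For exact matches only, a hash lookup on the full string would likely be faster than a regex; the regex form is shown because motifs usually allow some variation at certain positions.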

    Bioinformatics