shakehands has asked for the wisdom of the Perl Monks concerning the following question:

This node falls below the community's threshold of quality. You may see it by logging in.

Re: genetic algorithm for motif finding
by marto (Cardinal) on Aug 13, 2013 at 08:42 UTC
Re: genetic algorithm for motif finding
by bioinformatics (Friar) on Aug 13, 2013 at 21:31 UTC

    The original paper for FMGA (finding motifs by genetic algorithm) is here: link. It gives a decent breakdown of the algorithm and provides pseudocode. What part(s) are you having difficulty coding? I'm not sure that someone is going to have the exact code you want lying around, but they can help you implement things.
    EDIT: fixed the link

    Bioinformatics

      Finding motifs sounds like an interesting problem. Do you know of any descriptions of the problem that:

      1. Don't require a brain transplant to understand the genetics terminology.
      2. Don't descend into pointless pseudo-mathematical hieroglyphics.

      I.e., something that describes:

      • What a motif is.
      • How you know when you've found one.
      • Where you start looking.

      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        A motif, also termed a consensus sequence, is a stretch of DNA that occurs with either the exact sequence (rare) or a similar sequence in many places across the genome; you can think of it as the input to a fuzzy search: close but not exact, with maybe a misspelling or two. These sites often serve as places where proteins physically interact with and bind to the DNA. There are data collections, based on sequencing data, that tell you where along the DNA a given protein binds, and you can look for enriched or common motifs in the sequence at those binding coordinates. The motifs may be associated with the protein in question, or they may be motifs for other proteins that interact with the protein you have data for. Collections of motifs in a region (such as the promoter region, where many proteins bind to turn a gene on or off, or to increase or decrease its activity; think dimmer switch) can be referred to as cis-regulatory regions. You know you have found one when you see enrichment, i.e. an increased frequency, over some background (control) sequence. Common programs used in this analysis are MEME and nestedMICA. Does this help a bit?
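
        As a rough illustration of the two ideas above (fuzzy matching, and enrichment over a background), here is a minimal Python sketch. It is not how MEME or nestedMICA work; it only counts sequence windows that lie within a few mismatches of a candidate motif, then compares the hit rate in protein-bound regions against a control set. The function names, the max_mismatches parameter, and the toy sequences are all invented for this example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def count_fuzzy_hits(motif, sequence, max_mismatches=1):
    """Count windows of `sequence` within `max_mismatches` of `motif`."""
    k = len(motif)
    return sum(
        hamming(motif, sequence[i:i + k]) <= max_mismatches
        for i in range(len(sequence) - k + 1)
    )


def hit_rate(motif, seqs, max_mismatches=1):
    """Fraction of all length-k windows in `seqs` that match the motif."""
    k = len(motif)
    windows = sum(max(len(s) - k + 1, 0) for s in seqs)
    hits = sum(count_fuzzy_hits(motif, s, max_mismatches) for s in seqs)
    return hits / windows if windows else 0.0


def enrichment(motif, bound_seqs, background_seqs, max_mismatches=1):
    """Hit rate in bound regions divided by hit rate in the background."""
    fg = hit_rate(motif, bound_seqs, max_mismatches)
    bg = hit_rate(motif, background_seqs, max_mismatches)
    if bg == 0:
        # No background hits at all: infinitely enriched, or no signal.
        return float("inf") if fg > 0 else 0.0
    return fg / bg


if __name__ == "__main__":
    # Toy data (assumed): sequences taken from binding sites vs. a control.
    bound = ["GGTATAGG", "CCTACACC"]
    background = ["GGGGCCCC", "TATAGGGG"]
    print(count_fuzzy_hits("TATA", "GGTATAGGTACAGG"))  # TATA and TACA match
    print(enrichment("TATA", bound, background))
```

        A ratio well above 1 suggests the candidate shows up more often in the bound regions than in the control. Real tools use probabilistic models and proper statistics rather than a raw ratio, so treat this only as a way to see what "enriched over background" means.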

        Bioinformatics
        You can learn a bit of bioinformatics at Rosalind. By solving problems, you not only gain XP and compete, but also learn about the underlying science.
        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

        Although the OP is interested in DNA, the term is also used for amino acids/proteins.

        The Wikipedia entry for Sequence motif is pretty clear, I think.

        Also enlightening may be: Sequence logo.

Re: genetic algorithm for motif finding
by Anonymous Monk on Aug 13, 2013 at 06:57 UTC