You're welcome. Not all modules are on CPAN, and in those cases Google is your friend. :-)

Thank you for the brief description of your project. Unfortunately, a complete analysis of TF binding sites and coregulated genes is very difficult, if not impossible, at present, even in a model organism like Drosophila. Keep in mind that any results you obtain will be incomplete.

1. Extract all the transcription factors (TFBS) for fruitfly (D. melanogaster).
Until all of the TFs in Drosophila have been characterized, they aren't going to be in TRANSFAC (or any other database). It is trivial to grab the TRANSFAC records that carry Drosophila data, though, so you can at least get data for the TFs that have been characterized so far. Go forth and parse (the TFBS modules may help you here).

2. Identify all the coregulated genes of each TFBS found above.
If I understand you correctly, you want to identify the genes that are regulated by each TF. That's a much harder problem, and it has been a subject of active research for many years. I encourage you to search PubMed for background and to talk to someone at your institution who has experience in the area. This step will likely require some experimental data (e.g., expression microarrays).

It is certainly possible to identify genes that contain sequences matching a given matrix, but a match does not mean that the TF actually binds to that site and regulates the gene in question. Take a look at the PATSER program (Hertz and Stormo, Bioinformatics 1999). It's even part of BioPerl: Bio::Tools::Run::PiseApplication::patser. :-)
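To make "sequences that match a given matrix" concrete, here is a minimal sketch of a weight-matrix scan. It is written in Python only to keep it short and self-contained, and it is not PATSER's actual algorithm: the count matrix is a made-up 4-column toy, and the pseudocount, the uniform background of 0.25, and the threshold are all assumptions chosen for illustration.

```python
import math

# Hypothetical 4-position count matrix (one list per base); not a real TF.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 10, 0, 0],
    "T": [1, 0, 1, 8],
}

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts to log2-odds scores.

    Assumes a uniform background frequency; real genomes are not uniform.
    """
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return matrix

def scan(sequence, matrix, threshold):
    """Return (0-based start, score) for every window scoring >= threshold."""
    sequence = sequence.upper()
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

if __name__ == "__main__":
    pwm = log_odds_matrix(COUNTS)
    for start, score in scan("TTAGCTAA", pwm, threshold=4.0):
        print(start, round(score, 2))
```

A hit here only says the sequence resembles the matrix. Whether the TF binds there in vivo is a separate question, which is exactly the caveat above.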

3. Finally, extract the sequence from -450 upstream to 50 downstream region of each coregulated gene found.
Once you identify the genes, extracting a portion of the sequence is trivial. BioPerl to the rescue. Keep in mind, though, that many TF binding sites may lie outside your defined region.
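The coordinate arithmetic for that window is where most mistakes happen, so here is a small sketch of it (again in Python, purely for illustration). The function name and its parameters are hypothetical. It assumes the transcription start site is given as a 1-based position on the forward strand, that the TSS itself counts as +1, and that genes on the reverse strand need the window mirrored and reverse-complemented.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def promoter_region(chrom_seq, tss, strand, upstream=450, downstream=50):
    """Return the region from -upstream to +downstream around a TSS.

    chrom_seq : forward-strand sequence of the chromosome or scaffold
    tss       : 1-based forward-strand coordinate of the TSS (counted as +1)
    strand    : +1 for forward-strand genes, -1 for reverse-strand genes
    The window is clipped at the sequence ends, so it may be shorter
    than upstream + downstream.
    """
    if strand not in (1, -1):
        raise ValueError("strand must be +1 or -1")
    if strand == 1:
        start = tss - 1 - upstream   # 0-based, inclusive
        end = tss - 1 + downstream   # 0-based, exclusive
    else:
        # On the reverse strand, "upstream" means higher coordinates.
        start = tss - downstream
        end = tss + upstream
    start = max(start, 0)
    end = min(end, len(chrom_seq))
    region = chrom_seq[start:end]
    return region if strand == 1 else reverse_complement(region)

if __name__ == "__main__":
    toy = "AAACCGGGTT"
    print(promoter_region(toy, 5, 1, upstream=3, downstream=2))
    print(promoter_region(toy, 5, -1, upstream=3, downstream=2))
```

The same arithmetic carries over to whatever sequence object you end up using. Just check whether your annotation source counts the TSS as +1 and whether its coordinates are 0-based or 1-based before trusting the window.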

HTH


In reply to Re^3: Howto fill in username and password in a popup cookie with Perl by bobf
in thread Howto fill in username and password in a popup cookie with Perl by monkfan
