in reply to Re: How to find any of many motifs?
in thread How to find any of many motifs?

Thanks a million for your help. I hope I'll finish soon BioPerl installation. Your script is great and it does the job I tried to get from mine. But I have one problem: in the case when the gene doesn't contain any of motifs, the output looks like following:
NM_002232 range=chr1:111015833-111020178 contains ; distance 1014
So, as far as I understand it prints out the length of the whole region situated before the exon start point (all the small letters ). Actually this kind of information is not required in the task. How this can be skipped in the outpt? Thank you once more for your great contribution into my biologilcal invstigations %)!!! P.S: this morning I have modified my script so, it almost does the job too, but there are still some problems to solve. But your support has helped me a lot!

Replies are listed 'Best First'.
Re^3: How to find any of many motifs?
by BrowserUk (Patriarch) on Jun 18, 2010 at 09:03 UTC

    Never mind. I think I see the problem.

    Your motifs file contains some blank lines. A blank line will become undef, and undef will match anything.

    A simple modification to my code to deal with this is:

    if( defined $motif and $seq =~ m[$motif([^A-Z]+)[A-Z]] ) {

    It would perhaps be better to deal with this at source--ie. remove blank lines from the motifs file--but data is rarely perfect in the real world, so the above fix is as good a way of dealing with it as any.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re^3: How to find any of many motifs?
by BrowserUk (Patriarch) on Jun 18, 2010 at 08:58 UTC
    But I have one problem: in the case when the gene doesn't contain any of motifs, the output looks like following:
    NM_002232 range=chr1:111015833-111020178 contains ; distance 1014

    So, as far as I understand it prints out the length of the whole region situated before the exon start point (all the small letters ). Actually this kind of information is not required in the task. How this can be skipped in the outpt?

    Hm. That shouldn't happen. It should only print something if a match is found.

    Could you please post the relevant motif & sequence that correspond to the above output so that I can use it to debug my code?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.