in reply to Re^2: How to find any of many motifs?
in thread How to find any of many motifs?

Here are the actual sequences I am analysing. For genes:
>uc002yje.1 chr21:13973492-13976330
cccctgccccaccgcaccctggattactgcacgccaagaccctcacctga
acgcgccctacactctggcatgggggaacccggccccgcagagccctgga
CTCTGACATTGGAGGACTCCTCGGCTACGTCCTGGACTCCTGCACAAGAG
>uc002yje.1 chr21:13973492-13976330
cccctgccccaccgcaccctggattactgcacgccaagaccctcacctga
acgcgccctacactctggcatgggggaaaaaacccggccccgcagagccctgga
CTCTGACATTGGAGGACTCCTCGGCTACGTCCTGGACTCCTGCACAAGAG
>uc002yje.1 chr21:13973492-13976330
cccctgccccaccgcaccctggattactgcacgccaagaccctcacctga
acgcgccctacactctggcatgggggaacccggccccgcagagggccctgga
CTCTGACATTGGAGGACTCCTCGGCTACGTCCTGGACTCCTGCACAAGAG

for motifs:

>ucmotif_1
gccccac
>ucmotif_2
gggggaaaaaacc
>ucmotif_3
agagggccc

here is the output:

the distance is the following: 88
the distance is the following: 20
the distance is the following: 4

So at the moment the program searches for the first motif in the first gene, for the second motif in the second gene, and so on. I would like it to find any of the motifs in each gene, if they are present in it, and to report the distance between the motif and the exon start point (the exon starts with capital letters).

Re^4: How to find any of many motifs?
by almut (Canon) on Jun 17, 2010 at 16:56 UTC
    I wish that it could find any of motifs for each gene

    So it seems you want to test all motifs against every gene.  In case the list of all motifs fits into working memory, you could read them into an array (once, at the beginning), and then iterate over it for every gene.  Roughly sketched:

    my @motifs;
    while (my $motif = <MOTIF>) {
        ...
        push @motifs, $motif;
    }

    while (my $seq = <FILE>) {
        ...
        for my $motif (@motifs) {
            if ( $seq =~ /$motif/ ) {
                ...
            }
            ...
        }
    }
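
To make the inner loop a little more concrete, here is a minimal sketch of the per-gene step. It is only an illustration under a few assumptions: the sub name motif_distances is made up for this example, the exon start is taken to be the offset of the first capital letter, and the distance is counted from the end of the motif match to that offset (which seems consistent with the numbers quoted in the question, but adjust it if you count differently). It also assumes each gene has already been joined into one string without newlines, and that header lines and trailing newlines have been removed from the motifs.

```perl
use strict;
use warnings;

# Hypothetical helper: for one gene sequence, return every motif found in it
# together with the distance from the end of the motif to the exon start.
sub motif_distances {
    my ($seq, $motifs) = @_;

    # Assumption: the exon starts at the first capital letter.
    return unless $seq =~ /[A-Z]/;
    my $exon_start = $-[0];    # offset where that match began

    my @hits;
    for my $motif (@$motifs) {
        # \Q...\E matches the motif literally; only its first occurrence is used.
        if ( $seq =~ /\Q$motif\E/ ) {
            # $+[0] is the offset just past the end of the motif match.
            push @hits, [ $motif, $exon_start - $+[0] ];
        }
    }
    return @hits;
}

my @motifs = qw(gccccac gggggaaaaaacc agagggccc);
my $seq    = 'cccctgccccaccgcaccctggCTCTGACATT';    # shortened toy sequence

for my $hit ( motif_distances($seq, \@motifs) ) {
    printf "%s: the distance is the following: %d\n", @$hit;
}
```

Calling the sub once per gene, with the same @motifs array each time, gives you all motifs tested against every gene. Note that a motif that only occurs after the exon start would yield a negative distance here, so you may want to handle that case separately.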