in reply to Progressive pattern matching
Again, this may be a suboptimal solution and I can't shake the nagging feeling that there's a simpler way.
Of course, that describes all my Perl experience to date :-)
Apologies to all and sundry for once again descending into biological jargon. It's the only way I can get my head round this stuff...
Have fun,

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my $seq="APKLGIYSPRIGLYHFHKLDTPRLGAKLJHHDGFYSDA";
    my @motif=("ST","P","RK","ILVF","G","ILVFM","Y");

    # set up motif array of arrays
    my @motifarray;
    for (my $e=0;$e<=$#motif;$e++){
        my @elementarray= split (/ */, $motif[$e]);
        $motifarray[$e]=\@elementarray;
    }

    my $mstartpos = 0; # starting point within motif
    my $success = 0;

    # cycle through starting motif residues ("ST","P" etc.)
    MOTIFRES: while ($mstartpos+1 < $#motif){

        # find all matches for a given starting motif residue
        my $test=$seq;
        my $lastmatchpos=0;
        while ($lastmatchpos < length($seq)){
            my $found='';

            # deal with the first 3 residue matches as a special case
            my @r0=@{$motifarray[$mstartpos]};
            my @r1=@{$motifarray[$mstartpos+1]};
            my @r2=@{$motifarray[$mstartpos+2]};
            if ($test=~ /([@r0])(?=[@r1][@r2])/gc){
                $found = $1;
                $lastmatchpos=pos($test);
            }

            # next motif starting residue if no further matches found
            unless ($found){
                $mstartpos++;
                next MOTIFRES;
            }

            # get all the other residues in the motif
            for (my $e=$mstartpos+1;$e<=$#motifarray;$e++){
                my @rn=@{$motifarray[$e]};
                if ($test=~ /\G([@rn])/gc){
                    $found .= $1;
                }
            }

            # print out what we've got so far
            $success++;
            print ("$found at $lastmatchpos\n");
        }

        # repeat, using the next motif residue as the new starting point
        $mstartpos++;
    }

    die ("No matches found.\n") unless ($success);
    print ("Total number of matches (nested or otherwise): $success\n");
Update: Minor bugfix; also removed a couple of superfluous and misconceived lines to tidy it up a bit.
Replies are listed 'Best First'.
Re: Re: Progressive pattern matching
by blakem (Monsignor) on Oct 17, 2001 at 15:33 UTC
by tfrayner (Curate) on Oct 17, 2001 at 17:25 UTC
by blakem (Monsignor) on Oct 18, 2001 at 02:40 UTC