in reply to Bioinformatics: Regex loop, no output
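The following uses a lookbehind, (?<=[KR]), and a negative lookahead, (?!P), as the split pattern. The pattern matches the zero-width position that comes right after a K or R and is not followed by a P, so split slices up the protein at those points without consuming any residues: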
    use strict;
    use warnings;

    my @proteins = qw(
        DAAAAATTLTTTAMTTTTTTCKMMFRPPPPPGGGGGGGGGGGG
        ALTAMCMNVWEITYHKGSDVNRRASFAQPPPQPPPPLLAIKPASDASD
        DAAAAATTLTTTAMTTTTTTCK
    );

    for my $protein (@proteins) {
        my @peptides = split /(?<=[KR])(?!P)/, $protein;
        next if @peptides < 2;

        print "Protein: $protein\n";
        print "Peptides:\n";
        print " $_\n" for @peptides;
    }
Prints (the third protein yields only one peptide, so the "next if @peptides < 2" guard skips it):
    Protein: DAAAAATTLTTTAMTTTTTTCKMMFRPPPPPGGGGGGGGGGGG
    Peptides:
     DAAAAATTLTTTAMTTTTTTCK
     MMFRPPPPPGGGGGGGGGGGG
    Protein: ALTAMCMNVWEITYHKGSDVNRRASFAQPPPQPPPPLLAIKPASDASD
    Peptides:
     ALTAMCMNVWEITYHK
     GSDVNR
     R
     ASFAQPPPQPPPPLLAIKPASDASD
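To see what the (?!P) part contributes, here is a minimal sketch that runs the same split with and without it. The sub names digest and digest_ignoring_proline and the short test sequence are made up for illustration and are not from the original question; the pattern is the one used above, just spelled out with /x so each assertion can carry a comment.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): split after every K or R,
# except where the next residue is P.
sub digest {
    my ($protein) = @_;
    return split /
        (?<=[KR])   # the position is preceded by K or R
        (?!P)       # and is not followed by P
    /x, $protein;
}

# Same split without the negative lookahead, for comparison only.
sub digest_ignoring_proline {
    my ($protein) = @_;
    return split /(?<=[KR])/, $protein;
}

my $protein = 'MMFRPPKAAK';    # made-up sequence, not real data

print join('-', digest($protein)), "\n";                   # MMFRPPK-AAK
print join('-', digest_ignoring_proline($protein)), "\n";  # MMFR-PPK-AAK
```

Both assertions are zero-width, so no residues are lost at the cut points. The match after the final K would only produce an empty trailing field, which split drops by default, so no blank peptide appears at the end.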