in reply to Regular expression problems

if($readingframe2 =~ /MHGR/){ print "$1\n"; ($protein)= $readingframe2 =~ /MHGR/;
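$1 holds whatever the first pair of capturing parens matched. There are no parens in this regex, so the match sets nothing in $1. For the same reason, the list assignment to $protein only receives 1 (the "match succeeded" value), not the matched text.
What would you like to find in $protein?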

example:
$ perl -e '$s="pilpoil";if ($s=~/p(.)lp(.)il/) { print "h".$1."h".$2;}'
hiho
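To make the difference concrete for the original snippet, here is a minimal sketch. The value of $readingframe2 is made up for illustration (the real sequence is not shown in the thread), and $flag is a hypothetical name. It shows what a list-context match returns with and without capturing parens:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical reading frame, invented for this illustration only.
my $readingframe2 = "AAMHGRKK_";

# No parens: in list context a successful match returns the list (1).
my ($flag) = $readingframe2 =~ /MHGR/;

# With parens: the list-context match returns the captured text.
my ($protein) = $readingframe2 =~ /(MHGR)/;

print "without parens: $flag\n";     # prints "without parens: 1"
print "with parens:    $protein\n";  # prints "with parens:    MHGR"
```

So if the goal is to get the matched text into $protein, wrapping the part you want in parens should be enough.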

Re^2: Regular expression problems
by rattytatty (Initiate) on Apr 24, 2012 at 08:22 UTC
    Thanks for the help. I used this as an example, but I will be using it to find an ORF: a start codon 'M', followed by any of the other amino acids, up to a stop codon '_'. I think that will be =~/(M)[GAVLIFWPSTCYNQDEKRH]+(_)/ or =~/(M[GAVLIFWPSTCYNQDEKRH]+_)/. I'm not sure, but I will try both. Thanks again.
      You may want to put 'M' and '_' inside the parens to capture everything in one shot, or concatenate 'M'.$1.'_', depending on what you want to do. One way to do it:
      #!/usr/bin/perl -w
      use strict;
      my $seqnum=1;
      while (my $seq=<DATA>) {
          chomp $seq;
          print "sequence #",$seqnum++,":\n";
          while ($seq =~ /M([GAVLIFWPSTCYNQDEKRH]+)_/g) {
              print "\t",$1,"\n";
              # or: print "\t","M${1}_","\n";
          }
      }
      __DATA__
      MHGRRRRRRRRRRRRRRRRRRRRRRRRRRRRRD_MHGRRRRRRRRD_
      CMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVTECMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAVT
      AWPPPPPPPPPPPPPPPPPPPPPPPPPPPPP_LNAWPPPPPPPPPPPPPPPPPPPPPPPPPPPPP_L
      FOOBARMNOTTHISONE_XYZMTHISYES_MTHISNOT_
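For comparison, here is a small sketch of what the two patterns proposed in the question actually capture, run against the last sample line. The variable names ($aa, @v1, $v2) are only for illustration. Note that the residue class from the thread does not contain 'M', so as written an ORF with an internal methionine would not match; whether to add 'M' to the class depends on what you need.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Residue class exactly as written in the thread (it does not include 'M').
my $aa  = 'GAVLIFWPSTCYNQDEKRH';
my $seq = 'FOOBARMNOTTHISONE_XYZMTHISYES_MTHISNOT_';

# Variant 1: parens around the delimiters only, so only 'M' and '_' are captured.
my @v1 = $seq =~ /(M)[$aa]+(_)/;

# Variant 2: parens around the whole pattern, so the complete ORF is captured.
my ($v2) = $seq =~ /(M[$aa]+_)/;

print "variant 1 captures: @v1\n";   # prints "variant 1 captures: M _"
print "variant 2 captures: $v2\n";   # prints "variant 2 captures: MTHISYES_"
```

The second form is probably the one you want if you need the ORF itself; the first only tells you that a start and a stop were found.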