in reply to regex pattern match problem
OK, now that that can be read, let's factor out the common ground:

    $line =~ s/
        \*{2} ( [^\*]+ ) \*{2} \s
        ( kinase | isoform | protein | peptide | ligand ) \s
        \${2} ( [^\$]+ ) \${2} \s
        [(,] \s \*{2} ( [^\*]+ ) \*{2} \s [),]
    /**$1_$2_$3_($4)**/gx;
Note that this will match fields delimited like ( field , or , field ), which you probably don't want.

    sub delimited {
        my ( $delimiter ) = @_;
        my $qdelimiter = quotemeta $delimiter;
        return qr/ $qdelimiter{2} ( [^$qdelimiter]+ ) $qdelimiter{2} /x;
    }

    sub balanced {
        my ( $inside ) = @_;
        return qr/ [(,] $inside [),] /x;
    }

    my $stars   = delimited '*';
    my $dollars = delimited '$';
    my $words   = qr/( kinase | isoform | protein | peptide | ligand )/x;
    my $parens  = balanced qr/ \s $stars \s /x;

    $line =~ s/ $stars \s $words \s $dollars \s $parens /**$1_$2_$3_($4)**/gx;
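To see the factored pattern in action, and the mixed-delimiter case mentioned in the note, here is a small sketch. The sample lines and the `rewrite` helper are made up for illustration and are not from the original question. I have also written the doubled delimiter as `(?: $q ){2}`, purely so the quantifier sits visibly apart from the variable name.

```perl
use strict;
use warnings;

# Pattern for a field wrapped in a doubled delimiter, e.g. **field**.
sub delimited {
    my ($delimiter) = @_;
    my $q = quotemeta $delimiter;
    # The non-capturing group keeps {2} clearly separate from $q.
    return qr/ (?: $q ){2} ( [^$q]+ ) (?: $q ){2} /x;
}

my $stars   = delimited('*');
my $dollars = delimited('$');
my $words   = qr/ ( kinase | isoform | protein | peptide | ligand ) /x;
my $parens  = qr/ [(,] \s $stars \s [),] /x;

# Hypothetical helper: apply the substitution to a copy and return it.
sub rewrite {
    my ($line) = @_;
    $line =~ s/ $stars \s $words \s $dollars \s $parens /**$1_$2_$3_($4)**/gx;
    return $line;
}

# Made-up sample input; single quotes keep $$ from being interpolated.
my $balanced = '**MAPK1** kinase $$ERK2$$ ( **P28482** ) binds';
my $mixed    = '**MAPK1** kinase $$ERK2$$ ( **P28482** , binds';

print rewrite($balanced), "\n";   # **MAPK1_kinase_ERK2_(P28482)** binds
print rewrite($mixed),    "\n";   # same output: "( field ," is accepted too
```

Both lines come out identical, which is exactly the loose matching the note warns about. If that matters for your data, the opening and closing characters would need to be tied together rather than drawn from two independent character classes.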
UPDATE 1: I also stripped out the if ( MATCH ) logic, because a substitution s/OLD/NEW/ is just a no-op if OLD doesn't match.
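A minimal sketch of that point, using a cut-down pattern and made-up strings rather than the real data: when nothing matches, `s///` leaves the string alone and returns a false value, so an `if` guard is only needed when you want to branch on whether anything changed.

```perl
use strict;
use warnings;

# Made-up sample strings, one without markers and one with two.
my $plain  = 'no markers here';
my $marked = '**a** and **b**';

# s/// returns the number of substitutions made, or a false value if none.
my $plain_count  = ( $plain  =~ s/ \*{2} ( [^*]+ ) \*{2} /<$1>/gx );
my $marked_count = ( $marked =~ s/ \*{2} ( [^*]+ ) \*{2} /<$1>/gx );

print "plain:  '$plain' (", ( $plain_count ? $plain_count : 'no match' ), ")\n";
print "marked: '$marked' ($marked_count)\n";
# plain:  'no markers here' (no match)
# marked: '<a> and <b>' (2)
```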
UPDATE 2: Changed formatting and corrected a few errors in the code.
UPDATE 3: Again.