misterperl has asked for the wisdom of the Perl Monks concerning the following question:
I've been a Perl programmer since about 1997 and I know a lot about regexes. Not a guru perhaps, but at least a master :) But this I don't get.
I had a string like XT3USI, and I wanted to test whether it began with any char, followed by T, followed by 2 or S. So I used /^.T2|S/
which looked perfect. Reading from L to R: "begins with a char, then a T, then a (2 or an S)". But instead it acted as if I had used /^.T(2|.*S)/
2|S, in my mind, should have alternated chars 2 or S, not 2 or (0-many other chars) followed by S.
I replaced it with /^.(T2|TS)/, which bugs me, but it works.
Advice is, as always, appreciated. I use alternation all the time, but maybe all these years I have misunderstood it.
Re: Perl alternation regex looks ok to me?
by toolic (Bishop) on Sep 18, 2018 at 13:59 UTC
The regular expression:
(?-imsx:^.T2|S)
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  .                        any character except \n
----------------------------------------------------------------------
  T2                       'T2'
----------------------------------------------------------------------
 |                        OR
----------------------------------------------------------------------
  S                        'S'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
You should use parentheses to group the alternation:
/^.T(2|S)/
Or, if there is no need to capture:
/^.T(?:2|S)/
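To see the difference in practice, here is a minimal sketch that runs the original pattern and the grouped one against a few strings. Only XT3USI comes from the question; the other sample strings and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# XT3USI is from the question; the other strings are hypothetical samples.
my @samples = qw(XT3USI XT2ABC XTSABC ABCDEF);

my %result;
for my $str (@samples) {
    # Ungrouped: | splits the whole pattern into "^.T2" or "S anywhere".
    my $ungrouped = $str =~ /^.T2|S/ ? 1 : 0;
    # Grouped: "^.T" followed by either 2 or S.
    my $grouped = $str =~ /^.T(?:2|S)/ ? 1 : 0;

    $result{$str} = { ungrouped => $ungrouped, grouped => $grouped };
    printf "%-8s ungrouped=%d grouped=%d\n", $str, $ungrouped, $grouped;
}
```

XT3USI matches the ungrouped pattern only because it contains an S somewhere, which is why the original regex seemed to behave like "2 or anything followed by S".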
Or, if each alternative is one character:
/^.T[2S]/
Re: Perl alternation regex looks ok to me?
by hippo (Bishop) on Sep 18, 2018 at 14:19 UTC
2|S, in my mind, should have alternated chars 2 or S, not 2 or (0-many other chars) followed by S.
In isolation, yes. But you have used it in combination with other elements without limiting the scope of the alternation, so the | splits the whole pattern into ^.T2 and S.
If you just want to match one of several single characters, a character class is far easier, e.g. /^.T[2S]/
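As a quick check, the sketch below compares the character class with the grouped alternation on a few strings. XT3USI is from the question; the other strings and the names used here are hypothetical.

```perl
use strict;
use warnings;

# XT3USI is from the question; the rest are made-up samples.
my @candidates = qw(XT3USI XT2USI XTSUSI T2);

my %class_vs_group;
for my $str (@candidates) {
    my $by_class = $str =~ /^.T[2S]/    ? 1 : 0;  # character class
    my $by_group = $str =~ /^.T(?:2|S)/ ? 1 : 0;  # non-capturing group
    $class_vs_group{$str} = [ $by_class, $by_group ];
    print "$str: class=$by_class group=$by_group\n";
}
```

For single-character alternatives the two forms accept the same strings; the class is just shorter to write and read.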
HTH.
Thanks, I guess a character class might work out.
Re: Perl alternation regex looks ok to me?
by Laurent_R (Canon) on Sep 18, 2018 at 16:04 UTC
You've been given good solutions already, but, just to explain further, note that alternatives have a very low precedence in regexes, so that, for example, /blue|green/ is understood as an alternative between the two colors (either blue or green), and not as something like /blu(e|g)reen/ (i.e. either bluereen or blugreen).
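The two readings can be told apart with a small sketch. The anchors \A and \z are my addition so that only whole-word matches count, and the variable names are hypothetical. Note that the anchored alternation needs (?:...) for exactly the reason discussed in this thread.

```perl
use strict;
use warnings;

my @words = qw(blue green bluereen blugreen);

my %reading;
for my $w (@words) {
    if    ($w =~ /\A(?:blue|green)\z/)  { $reading{$w} = 'alternation' }  # how /blue|green/ is parsed
    elsif ($w =~ /\Ablu(?:e|g)reen\z/)  { $reading{$w} = 'infix' }        # the other reading
    else                                { $reading{$w} = 'neither' }
    print "$w: $reading{$w}\n";
}
```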