in reply to Re: Regex help
That will also extract single lc alphas that are preceded or followed by more than four uc alphas:
>perl -wMstrict -le
"my $text = 'XXXXXaXXXXX';
 ;;
 for my $match ( $text =~ /(?<=[A-Z]{4})([a-z])(?=[A-Z]{4})/g ) {
     print $match, $/;
     }
"
a
If AnonyMonk wants single lc alphas that are preceded and followed by exactly four uc alphas (and also concatenated into a string), here's one way:
>perl -wMstrict -le
"my $text = qq{XXXXaXXXXbYYYYYcYYYYYdXXXXeXXXXfgXXXX\nXXXXhXXXXiYYYYY};
 print qq{[[$text]]};
 ;;
 my $result = join '', $text =~ m{
     (?<= (?<! [[:upper:]]) [[:upper:]]{4})
     [[:lower:]]
     (?= [[:upper:]]{4} (?! [[:upper:]]))
     }xmsg;
 print qq{'$result'};
"
[[XXXXaXXXXbYYYYYcYYYYYdXXXXeXXXXfgXXXX
XXXXhXXXXiYYYYY]]
'aeh'
(If some look-around is good, more is better!)
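For anyone who finds the one-liner dense, the same pattern can be spelled out as a script with /x comments. This is only a sketch: the sub name lc_between_four_uc is invented here for illustration and isn't from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): returns the
# concatenation of every lowercase letter that has exactly four
# uppercase letters on each side.
sub lc_between_four_uc {
    my ($text) = @_;
    return join '', $text =~ m{
        (?<=                    # looking back from the lc letter:
            (?<! [[:upper:]] )  #   no uc letter before the run of four
            [[:upper:]]{4}      #   four uc letters
        )
        [[:lower:]]             # the single lc letter to extract
        (?=                     # looking ahead from the lc letter:
            [[:upper:]]{4}      #   four uc letters
            (?! [[:upper:]] )   #   no uc letter after the run of four
        )
    }xmsg;
}

print "'", lc_between_four_uc('XXXXaXXXX'),   "'\n";   # four uc on each side
print "'", lc_between_four_uc('XXXXXaXXXXX'), "'\n";   # five uc on each side
```

The inner negative look-arounds are what turn "at least four" into "exactly four": without them, the five-X string would still match.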
Update: Here are the beginnings of a test bed for playing with this and other regexen:
>perl -wMstrict -le
"for my $text (qw(
     XXXXaXXXX  XXXXaXXXXxyXXXXbXXXXxZZZxZZZxYYYYY
     XXXXxZZZ  ZZZxXXXX  XXXXxYYYYY  YYYYYxXXXX
     XXXXxyXXXX  XXXXxyXXXXxyXXXX
     YYYYYaYYYYY  ZZZaZZZ)
     ) {
     my $result = join '', $text =~ m{
         (?<= (?<! [[:upper:]]) [[:upper:]]{4})
         [[:lower:]]
         (?= [[:upper:]]{4} (?! [[:upper:]]))
         }xmsg;
     print qq{'$text' -> '$result'};
     }
"
'XXXXaXXXX' -> 'a'
'XXXXaXXXXxyXXXXbXXXXxZZZxZZZxYYYYY' -> 'ab'
'XXXXxZZZ' -> ''
'ZZZxXXXX' -> ''
'XXXXxYYYYY' -> ''
'YYYYYxXXXX' -> ''
'XXXXxyXXXX' -> ''
'XXXXxyXXXXxyXXXX' -> ''
'YYYYYaYYYYY' -> ''
'ZZZaZZZ' -> ''
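If the test bed grows, it may be easier to keep as a Test::More script, so a changed regex reports failures instead of needing the output eyeballed. A rough sketch, assuming the expected values are simply the results printed above; the names $rx, extract_lc and @cases are made up for this example.

```perl
use strict;
use warnings;
use Test::More;

# Same pattern as in the one-liner above, compiled once.
my $rx = qr{
    (?<= (?<! [[:upper:]]) [[:upper:]]{4})
    [[:lower:]]
    (?= [[:upper:]]{4} (?! [[:upper:]]))
}xms;

# Hypothetical helper: join all matches into one string.
sub extract_lc {
    my ($text) = @_;
    return join '', $text =~ /$rx/g;
}

# [ input, expected ] pairs, copied from the output shown above.
my @cases = (
    [ 'XXXXaXXXX'                          => 'a'  ],
    [ 'XXXXaXXXXxyXXXXbXXXXxZZZxZZZxYYYYY' => 'ab' ],
    [ 'XXXXxZZZ'                           => ''   ],
    [ 'ZZZxXXXX'                           => ''   ],
    [ 'XXXXxYYYYY'                         => ''   ],
    [ 'YYYYYxXXXX'                         => ''   ],
    [ 'XXXXxyXXXX'                         => ''   ],
    [ 'XXXXxyXXXXxyXXXX'                   => ''   ],
    [ 'YYYYYaYYYYY'                        => ''   ],
    [ 'ZZZaZZZ'                            => ''   ],
);

for my $case (@cases) {
    my ($text, $expected) = @$case;
    is extract_lc($text), $expected, "'$text' -> '$expected'";
}

done_testing;
```

To try another regex, swap the body of $rx and re-run; any case whose result changes shows up as a failed test.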
Re^3: Regex help
by Anonymous Monk on Dec 11, 2013 at 11:33 UTC