My current regex looks like
    $text =~ s/(<\/Mil.*>.*?)[\x90\x8F](.*?\x9D)/$1$2/ig;
and this only removes the very first control character as expected.
Not even.
Anyway, here's a two-state parser that should do the trick:
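    my $out = '';
    PROCESS:
    for ($text) {
        pos() = 0;
        for (;;) {
            # State 1: search for the Mil element, copying everything through.
            # ("and" rather than "&&": "&&" binds tighter than ".=",
            # so "/.../ && $out .= $1" does not compile.)
            for (;;) {
                /\G ( [^<]+ ) /xgc and $out .= $1;
                /\G ( <\/Mil(?:,[^>]*])?> ) /xgc and do { $out .= $1; last };
                /\G ( < ) /xgc and $out .= $1;
                /\G \z /xgc and last PROCESS;
            }

            # State 2: search for the end of the Mil element,
            # removing \x8F and \x90 as we go along.
            for (;;) {
                /\G ( [^\x8F\x90\x9D]+ ) /xgc and $out .= $1;
                /\G [\x8F\x90]+ /xgc;
                # Keep the \x9D terminator, as the original s/// did via $2.
                /\G ( \x9D ) /xgc and do { $out .= $1; last };
                /\G \z /xgc and last PROCESS;
            }
        }
    }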
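For comparison, here is a more compact sketch of the same idea: a single s///ge pass that grabs everything from a closing Mil tag up to (but not including) the next \x9D, then strips \x8F and \x90 from that chunk with tr///d. This is a different technique from the \G parser, not a rewrite of it. The sub name strip_mil_controls and the sample string are made up for illustration, and the tag pattern is simply reused from the parser.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the thread).
# Removes \x8F and \x90 between a closing Mil tag and the next \x9D;
# everything outside that range is left alone.
sub strip_mil_controls {
    my ($text) = @_;
    $text =~ s{ ( </Mil (?: ,[^>]*] )? > ) ( [^\x9D]* ) }{
        my ($tag, $body) = ($1, $2);   # copy the captures before modifying
        $body =~ tr/\x8F\x90//d;       # delete the two control characters
        $tag . $body;                  # replacement value for /e
    }xge;
    return $text;
}

# Made-up sample: control characters both inside and outside the Mil range.
my $sample  = "a\x8Fb</Mil>c\x8Fd\x90e\x9Df\x8Fg";
my $cleaned = strip_mil_controls($sample);
print $cleaned eq "a\x8Fb</Mil>cde\x9Df\x8Fg" ? "ok\n" : "not ok\n";
```

The \x8F before the tag and the one after \x9D survive, since only the stretch between the tag and \x9D is touched. If no \x9D follows the tag, the stripping runs to the end of the string, which may or may not be what you want.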
In reply to Re: Curious Regex by ikegami, in thread Curious Regex by HamNRye