in reply to Curious Regex

My current regex looks like $text =~ s/(<\/Mil.*>.*?)[\x90\x8F](.*?\x9D)/$1$2/ig; And this only removes the very first control character as expected.

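Not even. The greedy .* in <\/Mil.*> runs to the last > it can reach on the line, so the one control character that gets removed need not be the first one.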

Anyway, here's a two-state parser that should do the trick:

    my $out = '';
    PROCESS:
    for ($text) {
        pos($_) = 0;

        for (;;) {
            # Search for Mil element.
            for (;;) {
                /\G ( [^<]+ ) /xgc and $out .= $1;
                /\G ( <\/Mil(?:,[^>]*])?> ) /xgc and do { $out .= $1; last };
                /\G ( < ) /xgc and $out .= $1;
                /\G \z /xgc and last PROCESS;
            }

            # Search for end of Mil element,
            # removing \x8F and \x90 as we go along.
            for (;;) {
                /\G ( [^\x8F\x90\x9D]+ ) /xgc and $out .= $1;
                /\G [\x8F\x90]+ /xgc;
                # Keep the \x9D terminator, as $2 did in the original s///.
                /\G ( \x9D ) /xgc and do { $out .= $1; last };
                /\G \z /xgc and last PROCESS;
            }
        }
    }
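For comparison, the same two states can probably be folded into a single substitution: match the Mil tag plus everything up to the next \x9D, then delete the control characters inside that span with tr///. This is only a sketch of a different technique, not the parser above. The sub name strip_mil_controls and the sample string are made up for illustration, and the tag pattern is copied unchanged from the parser.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this example).
# $1 is the Mil tag; $2 is everything up to, but not including,
# the next \x9D (or the end of the text if there is none).
sub strip_mil_controls {
    my ($text) = @_;
    $text =~ s{
        ( </Mil(?:,[^>]*])?> )
        ( [^\x9D]* )
    }{
        my ($tag, $body) = ($1, $2);
        $body =~ tr/\x8F\x90//d;    # delete every \x8F and \x90 in the span
        $tag . $body;
    }gex;
    return $text;
}

# Made-up sample: one control character before the tag, two inside
# the Mil span, and one after the \x9D terminator.
my $sample = "a\x90<x></Mil>b\x90c\x8Fd\x9De\x90f";
my $clean  = strip_mil_controls($sample);

# Show non-printable characters as \xNN so the result is readable.
(my $shown = $clean) =~ s/([^\x20-\x7E])/sprintf '\\x%02X', ord $1/ge;
print "$shown\n";    # prints a\x90<x></Mil>bcd\x9De\x90f
```

Only the two control characters between the tag and the \x9D are removed; the ones outside the span are left alone. Because [^\x9D]* stops short of the \x9D, the terminator stays in the text and /g resumes the search for the next Mil tag from there.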