in reply to getting rid of UTF-8

$text =~ s/^\xef\xbb\xbf//g does remove the offending bytes if your string contains the bytes you described. (Strictly, with the ^ anchor it deletes only the leading sequence; drop the ^ and the /g will make it delete the others too.)
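
Here is a minimal sketch of the anchored versus unanchored behaviour. It assumes the string still holds raw UTF-8 bytes; the sample data and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample: raw UTF-8 bytes with EF BB BF at the start
# and a second EF BB BF sequence in the middle.
my $text = "\xef\xbb\xbfhello\xef\xbb\xbfworld";

# Anchored with ^: only a sequence at the very start can match.
(my $leading_only = $text) =~ s/^\xef\xbb\xbf//;

# Unanchored with /g: every occurrence is removed.
(my $all_removed = $text) =~ s/\xef\xbb\xbf//g;

printf "%vX\n", $leading_only;   # 68.65.6C.6C.6F.EF.BB.BF.77.6F.72.6C.64
printf "%vX\n", $all_removed;    # 68.65.6C.6C.6F.77.6F.72.6C.64
```

Note that the /g is redundant on the anchored form, since ^ (without /m) can only match once.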

Since you repeatedly claim it doesn't, your data must be different from what you describe, and we can't help you until you provide a better description of your data (e.g. the output of sprintf "%vX", $string).
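
To show what such a dump can reveal, here is a small sketch. It assumes one possible cause, namely that the string was already decoded from UTF-8 somewhere along the way; that is a guess, not something established in this thread, and the sample data is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: undecoded UTF-8 bytes starting with EF BB BF.
my $bytes = "\xef\xbb\xbfabc";
my $bytes_dump = sprintf "%vX", $bytes;

# The same data after decoding: the three bytes become one character, U+FEFF.
my $chars = decode('UTF-8', $bytes);
my $chars_dump = sprintf "%vX", $chars;

print "$bytes_dump\n";   # EF.BB.BF.61.62.63
print "$chars_dump\n";   # FEFF.61.62.63

# The byte-oriented pattern cannot match the decoded string ...
my $byte_pattern_matches = ($chars =~ /\xef\xbb\xbf/) ? 1 : 0;

# ... but a pattern for the character itself does.
(my $cleaned = $chars) =~ s/^\x{FEFF}//;
```

If the dump starts with FEFF rather than EF.BB.BF, the substitution has to target \x{FEFF} instead of the three bytes.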

FYI, EF BB BF is the UTF-8 encoding of U+FEFF, which is the Byte Order Mark when it appears at the start of a file, and the Zero Width No-Break Space elsewhere.
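
Both points can be checked with a short sketch. The sample string is hypothetical, and the position check is only an illustration of the "start versus elsewhere" distinction.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+FEFF encodes to the three bytes EF BB BF in UTF-8.
my $encoded = encode('UTF-8', "\x{FEFF}");
my $encoded_dump = sprintf "%vX", $encoded;
print "$encoded_dump\n";   # EF.BB.BF

# Hypothetical decoded text with U+FEFF at offset 0 (a BOM)
# and again at offset 4 (a Zero Width No-Break Space).
my $chars = "\x{FEFF}abc\x{FEFF}def";

# Collect the character offset of every U+FEFF in the string.
my @offsets;
while ($chars =~ /\x{FEFF}/g) {
    push @offsets, $-[0];
}
print "U+FEFF at offsets: @offsets\n";   # U+FEFF at offsets: 0 4
```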

Re^2: getting rid of UTF-8
by Anonymous Monk on Nov 25, 2022 at 19:54 UTC
    the Zero Width No-Break Space elsewhere.

    deprecated, use U+2060 instead

      Good to know.

      I don't think it was used as a word joiner. I think the presence of U+FEFF is explained by the concatenation of a BOM-prefixed string to another (the very kind of error that led to U+2060 WORD JOINER being the new ZWNBSP).