Re: getting rid of UTF-8

$text =~ s/^\xef\xbb\xbf//g does remove the offending bytes if your string contains the characters you described. (Well, it deletes only the leading sequence; remove the ^ to have it delete the other occurrences too.)

Since you repeatedly claim it doesn't, your data must differ from what you describe, and we can't help you until you provide a better description of your data (e.g. the output of sprintf "%vX", $string).
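
As a minimal sketch of what such a dump looks like (the sample strings below are made up for illustration, not the poster's data), sprintf "%vX" makes it easy to tell raw UTF-8 bytes apart from an already-decoded character:

```perl
use strict;
use warnings;

# Hypothetical sample: the three BOM bytes followed by "abc", as raw bytes.
my $raw     = "\xEF\xBB\xBFabc";
# Hypothetical sample: the same text after decoding, one character U+FEFF.
my $decoded = "\x{FEFF}abc";

# %vX prints the ordinal of each character in hex, joined by ".".
my $dump_raw     = sprintf "%vX", $raw;
my $dump_decoded = sprintf "%vX", $decoded;

print "$dump_raw\n";        # EF.BB.BF.61.62.63
print "$dump_decoded\n";    # FEFF.61.62.63
```

If the dump starts with FEFF rather than EF.BB.BF, the string holds decoded characters, and a pattern written in terms of the three bytes has nothing to match.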

FYI, EF BB BF is the UTF-8 encoding of U+FEFF, which is the Byte Order Mark if at the start of the file, and the Zero Width No-Break Space elsewhere.
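
To illustrate the difference between the anchored and unanchored substitutions, and between bytes and decoded text, here is a small sketch. The variable names and sample strings are assumptions made for the example only:

```perl
use strict;
use warnings;

# Hypothetical raw bytes: one BOM sequence at the start, another mid-string.
my $bytes = "\xEF\xBB\xBFfoo\xEF\xBB\xBFbar";

(my $lead_only = $bytes) =~ s/^\xEF\xBB\xBF//;   # anchored: leading sequence only
(my $all_gone  = $bytes) =~ s/\xEF\xBB\xBF//g;   # unanchored: every occurrence

# The same text once decoded: each occurrence is the single character U+FEFF.
my $chars = "\x{FEFF}foo\x{FEFF}bar";

# The byte pattern cannot match decoded text, so test for that explicitly.
my $byte_pattern_matches = $chars =~ /\xEF\xBB\xBF/ ? 1 : 0;

# On decoded text, match the character itself.
(my $chars_clean = $chars) =~ s/\x{FEFF}//g;

printf "%vX\n", $lead_only;      # 66.6F.6F.EF.BB.BF.62.61.72
print  "$all_gone\n";            # foobar
print  "$byte_pattern_matches\n"; # 0
print  "$chars_clean\n";         # foobar
```

Which of the two patterns applies depends on whether the string was decoded when it was read, which is exactly what the requested dump would reveal.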

Re^2: getting rid of UTF-8 by Anonymous Monk on Nov 25, 2022 at 19:54 UTC
"the Zero Width No-Break Space elsewhere." Deprecated; use U+2060 instead.
Re^3: getting rid of UTF-8 by ikegami (Patriarch) on Nov 27, 2022 at 18:01 UTC
Good to know. I don't think it was used as a word joiner. I think the presence of U+FEFF is explained by the concatenation of a BOM-prefixed string to another (the very kind of error that led to U+2060 WORD JOINER being the new ZWNBSP).