Given that this string that I am retrieving is actually the contents of a binary file then I should be OK to ignore anything to do with UTF8, given that my source code has no eight-bit or more characters.
It depends on how the data is handed to you. In the two examples below, the internal byte sequence is \304\243 in both cases, but Perl interprets it differently depending on the string's internal UTF8 flag: with the flag on it is the single character U+0123, and with the flag off it is the two characters U+00C4 and U+00A3. If the module is handing you binary data that has been through an unwanted encoding or decoding step, or that has the UTF8 flag incorrectly enabled, you'll see this kind of strange behavior, which may explain the U+FFFD REPLACEMENT CHARACTER in your original hex dump. Could you show your data with Devel::Peek?
$ perl -CSD -MDevel::Peek -le 'my $x="\x{123}"; print $x; Dump($x)'
ģ
SV = PV(0x1337d70) at 0x1357518
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK,UTF8)
PV = 0x1359790 "\304\243"\0 [UTF8 "\x{123}"]
CUR = 2
LEN = 10
COW_REFCNT = 1
$ perl -CSD -MDevel::Peek -le 'my $x="\304\243"; print $x; Dump($x)'
Ä£
SV = PV(0x1e28d70) at 0x1e48518
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
PV = 0x1e4a790 "\304\243"\0
CUR = 2
LEN = 10
COW_REFCNT = 1
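If Devel::Peek output is awkward to collect, a rough equivalent can be put together with core functions only. The following is just a sketch: `describe` is a hypothetical helper name, not something from RT::Client or any other module, and it reports the UTF8 flag, the length in characters, and the code point of each character. It also shows that `utf8::upgrade` and `utf8::downgrade` change only the internal representation, not the string's value.

```perl
use strict;
use warnings;

# Hypothetical helper: summarize how Perl currently sees a string.
sub describe {
    my ($s) = @_;
    return sprintf "utf8=%d length=%d chars=%s",
        utf8::is_utf8($s) ? 1 : 0,
        length($s),
        join ' ', map { sprintf 'U+%04X', ord($_) } split //, $s;
}

my $chars = "\x{123}";     # one character, UTF8 flag on
my $bytes = "\304\243";    # two characters below 0x100, UTF8 flag off

print describe($chars), "\n";    # utf8=1 length=1 chars=U+0123
print describe($bytes), "\n";    # utf8=0 length=2 chars=U+00C4 U+00A3

# Upgrading changes the internal storage and sets the flag,
# but the string still holds the same two characters.
my $upgraded = $bytes;
utf8::upgrade($upgraded);
print describe($upgraded), "\n"; # utf8=1 length=2 chars=U+00C4 U+00A3

# Downgrading succeeds only if every character fits in one byte.
# The second argument (1) asks for a false return instead of an exception.
print utf8::downgrade($upgraded, 1) ? "downgraded ok\n" : "cannot downgrade\n";
print utf8::downgrade($chars, 1)    ? "downgraded ok\n" : "cannot downgrade\n";
```

If the string you get back from the module reports any character above U+00FF, it is no longer plain binary data as far as Perl is concerned, and something upstream has decoded it.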
In reply to Re^4: RT::Client turns occasional binary characters in to wide characters by haukex
in thread RT::Client turns occasional binary characters in to wide characters by wardmw