You're assigning meaning to the bytes without checking which encoding was used. In fact, you handle neither the character encoding nor the content encoding (the Content-Encoding header, e.g. gzip)! Using ->content() is practically always a bug, because it returns the body exactly as it arrived. Use ->decoded_content() to get decoded text, or ->decoded_content( charset => 'none' ) to get the bytes with only the content encoding undone.
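To make the difference concrete, here is a minimal sketch that builds an HTTP::Response by hand, so it needs no network access. The sample text, the headers, and the variable names are assumptions made up for illustration; in real code the response would come from something like LWP::UserAgent.

```perl
use strict;
use warnings;
use Encode qw(encode);
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical payload: 6 characters, 9 bytes once encoded as UTF-8.
my $text = "caf\x{E9} \x{263A}";
my $utf8 = encode('UTF-8', $text);

# Simulate a server that compresses the body.
gzip(\$utf8 => \my $gzipped) or die "gzip failed: $GzipError";

my $res = HTTP::Response->new(200, 'OK', [
    'Content-Type'     => 'text/plain; charset=UTF-8',
    'Content-Encoding' => 'gzip',
], $gzipped);

# Raw body: still gzip-compressed, not usable as text.
my $raw = $res->content;

# Content-Encoding undone, charset left alone: UTF-8 bytes.
my $bytes = $res->decoded_content( charset => 'none' );

# Content-Encoding undone and charset decoded: Perl characters.
my $chars = $res->decoded_content;

# decoded_content returns undef if decoding fails.
defined $_ or die "decoding failed" for $bytes, $chars;

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
```

Here $raw is the compressed stream, $bytes equals the UTF-8 encoded form, and only $chars is safe to treat as text. The charset => 'none' form is the one to reach for when the body is binary, or when a parser wants to do its own character decoding.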