in reply to Re^3: Encoding horridness
in thread Encoding horridness

What I'm wondering, though, is if there's ever a situation where
encode('utf8', decode('Latin-1', $_))
produces different output from
encode('utf8', $_)

Re^5: Encoding horridness
by choroba (Cardinal) on Jul 12, 2017 at 16:51 UTC
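    Yes, for example:
        use Encode qw(encode decode);
        use feature 'say';
        # The lone byte E1 is not valid UTF-8, so decode returns U+FFFD.
        $_ = decode('utf-8', "\N{LATIN SMALL LETTER A WITH ACUTE}");
        say encode('utf8', $_);                    # Replacement character EF BF BD.
        say encode('utf8', decode('Latin-1', $_)); # Dies.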
      Fine. If the decode doesn't die, does it ever produce different output? (One might argue that a call that dies doesn't produce any output, and therefore doesn't produce different output, but whatever.)
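
      One way to probe that question is a brute-force comparison over every character Latin-1 can represent. This is only a sketch, not a proof: it assumes that checking single-character strings for code points 0 to 255 is representative, since both calls work character by character, and the variable names are made up for illustration. Strings containing code points above 255 are left out because, as in the example above, the decode dies on them.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Count the code points for which the two expressions disagree.
my $differences = 0;

for my $code (0 .. 255) {
    my $str = chr $code;

    # Encode the string directly as UTF-8.
    my $direct = encode('utf8', $str);

    # Decode from Latin-1 first, then encode as UTF-8.
    my $via_latin1 = encode('utf8', decode('Latin-1', $str));

    if ($direct ne $via_latin1) {
        $differences++;
        printf "mismatch at code point 0x%02X\n", $code;
    }
}

print "differences: $differences\n";
```

      If the count stays at zero, the extra decode('Latin-1', ...) is a no-op for every input it accepts, at least for the single-character cases tried here.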