in reply to Re^3: Another utf-8 decoding problem
in thread Another utf-8 decoding problem

Ok. So I should encode the data from utf-8 and decode it to iso? Why do I then have to set binmode if I have correct encoding of my string? This is Greek to me, sorry :-)

Re^5: Another utf-8 decoding problem
by moritz (Cardinal) on Oct 11, 2010 at 12:48 UTC
    Ok. So I should encode the data from utf-8 and decode it to iso?

    No. Don't go mixing all the terms I've used.

    You should decode incoming data (from UTF-8 or whatever encoding it is) into perl's internal format.

    Then do your string operations with decoded strings.

    Then when you output it, encode it. A decoded string is not already in the right format for output: it's in Perl's internal format, which can be either latin1 or UTF-8, depending on some factors you shouldn't care about.
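The three steps above (decode, operate, encode) can be sketched roughly as follows. This is a minimal illustration, not the poster's code: the sample bytes, the variable names, and ISO-8859-1 as the output encoding are all assumptions made for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed input: raw bytes as they might arrive from a file or socket.
# These 8 bytes are the word "blåbær" encoded as UTF-8.
my $bytes = "bl\xC3\xA5b\xC3\xA6r";

# 1. Decode the incoming bytes into a Perl character string.
my $text = decode('UTF-8', $bytes);

# 2. Do string operations on the decoded string.
#    length() now counts characters, not bytes.
my $chars = length $text;
my $upper = uc $text;

# 3. Encode on output. ISO-8859-1 is only an example target here.
my $latin1 = encode('ISO-8859-1', $upper);

binmode STDOUT;    # plain bytes, because the string was encoded by hand
print $latin1, "\n";
```

Instead of calling encode by hand in step 3, an output layer such as binmode(STDOUT, ':encoding(ISO-8859-1)') can do the encoding on every print, in which case the decoded string is printed directly.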

    Please read this, it explains it all in sufficient detail (I hope).

    Perl 6 - links to (nearly) everything that is Perl 6.
      I meant "decode from" and "encode to" - sorry for the mixup

      The thing is that everything else that is printed (without setting the binmode) gets printed ok (does this mean that I already have the correct output mode?). So it feels like I'm trying to decode data that isn't UTF-8 to begin with? (Can I test the incoming data in a simple manner?) I will look into the link you provided.
        (Can I test the incoming data in a simple manner?)

        See the documentation for Encode::decode - you can tell it to die on invalid input.
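One way to use that for a quick check is sketched below. Passing Encode::FB_CROAK as the third argument makes decode die on malformed input, and eval catches the exception. The helper name is_valid_utf8 and the sample byte strings are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns 1 if $bytes is well-formed UTF-8, else 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # Work on a copy: with a CHECK value set, decode may modify its input.
    my $copy = $bytes;
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Sample inputs (assumed): the same word as UTF-8 bytes and as latin1 bytes.
print is_valid_utf8("bl\xC3\xA5b\xC3\xA6r") ? "valid\n" : "invalid\n";
print is_valid_utf8("bl\xE5b\xE6r")         ? "valid\n" : "invalid\n";
```

Note that passing this check only shows the bytes are well-formed UTF-8. Pure ASCII data, for instance, is valid in both UTF-8 and latin1, so the test cannot prove which encoding the sender intended.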
