Hi, there are a couple of things I see now that I'm looking at this on a full-size screen ...

First, in the second set of statements you encode $str to $secondaryOctets, but then print the result of decoding $str rather than $secondaryOctets.

Fixing that does not resolve the issue, though. When you tell Perl to use utf8; it treats the source code as UTF-8, so any non-ASCII characters in your string literals are read in as characters rather than as a sequence of bytes. This works well for your first case, where you encode to and decode from UTF-8. But ISO-8859-1 is a single-byte encoding that only covers code points 0 to 255, so it has no way to represent these characters, and you get the wide character error. You should not tell Perl that the source code is in UTF-8 *if* you plan to read the string in as bytes.
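To make the characters-versus-bytes distinction concrete, here is a small sketch. It is only an illustration under my own assumptions: the variable names are made up, and I write the first two characters of your string with \x{...} escapes so the result does not depend on how the file itself is saved. $chars stands in for what Perl sees under use utf8; and $bytes for what it sees without it.

```perl
use strict; use warnings;
use Encode qw(encode);

# What a literal looks like under "use utf8": two characters.
my $chars = "\x{9019}\x{662F}";

# What the same literal looks like without "use utf8": six UTF-8 bytes.
my $bytes = encode('UTF-8', $chars);

printf "characters: %d\n", length $chars;   # 2
printf "bytes:      %d\n", length $bytes;   # 6

# ISO-8859-1 cannot represent code points above 255, so by default
# Encode substitutes "?" for each character it cannot map.
my $from_chars = encode('ISO-8859-1', $chars);

# Every byte is in the range 0-255, so each one maps to itself and the
# data passes through unchanged.
my $from_bytes = encode('ISO-8859-1', $bytes);

print "character string survives: ", ($from_chars eq $chars ? "yes" : "no"), "\n";
print "byte string survives:      ", ($from_bytes eq $bytes ? "yes" : "no"), "\n";
```

So the ISO-8859-1 round trip only appears to work when the string is already a sequence of bytes, which is why the second block of the script further down gives the right-looking output.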

Similarly but separately, when you apply the ':utf8' IO layer to STDOUT, you are telling Perl that the output is going to be encoded in UTF-8. That's not the case when you've encoded to ISO-8859-1, so you shouldn't apply the layer.
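Here is a rough sketch of what goes wrong when the layer and the data disagree. It writes to in-memory filehandles instead of STDOUT so the bytes can be inspected. The sample string "café" and the variable names are my own, chosen only for illustration.

```perl
use strict; use warnings;
use Encode qw(encode);

# Four ISO-8859-1 octets; the last one is 0xE9.
my $latin1 = encode('ISO-8859-1', "caf\x{E9}");

# No encoding layer: the octets are written as they are.
my $raw_out = '';
open my $raw_fh, '>', \$raw_out or die "open failed: $!";
binmode $raw_fh;
print {$raw_fh} $latin1;
close $raw_fh;

# :utf8 layer: Perl treats each octet as a character and encodes it
# again, so 0xE9 becomes the two bytes 0xC3 0xA9.
my $utf8_out = '';
open my $utf8_fh, '>:utf8', \$utf8_out or die "open failed: $!";
print {$utf8_fh} $latin1;
close $utf8_fh;

printf "without layer: %d bytes\n", length $raw_out;    # 4
printf "with :utf8:    %d bytes\n", length $utf8_out;   # 5
```

With the layer applied, the output is no longer valid ISO-8859-1, because the data has been encoded twice.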

The following script attempts to demonstrate what I mean:

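use strict; use warnings; use feature 'say';
use Encode;

{
    say 'With UTF-8';
    use utf8;    # lexically scoped: literals in this block are read as characters
    my $str = '這是一個測試';
    my $perl = encode("utf8", $str);    # characters -> UTF-8 octets

    binmode STDOUT, ':utf8';    # encode characters as UTF-8 on output
    say decode("utf8", $perl);    # octets -> characters
}

{
    say 'With ISO-8859-1';
    no utf8;    # explicit, although 'use utf8' already ended with the block above
    my $str = '這是一個測試';    # read in as raw bytes here
    my $perl = encode("ISO-8859-1" , $str);    # each byte (0-255) maps to itself

    binmode STDOUT;    # drop the :utf8 layer and print the bytes as they are
    say decode("ISO-8859-1", $perl);
}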

__END__

Outputs:

$ perl 1203139.pl

With UTF-8
這是一個測試
With ISO-8859-1
這是一個測試

Disclaimer: Working with encodings is very complicated, as you know, and I am not an expert in the field. As this example shows, there can be multiple overlapping issues, and it's possible for a script to appear to be working right when that is just an accident. So while this is my best understanding, I don't guarantee that my explanation here is correct.

Hope this helps!


The way forward always starts with a minimal test.

In reply to Re: How to encode and decode chinese string to iso-8859-1 encoding format by 1nickt
in thread How to encode and decode chinese string to iso-8859-1 encoding format by thanos1983
