jeanluca has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am parsing ASCII files with unpack, like this:
open IN, "<$file";
binmode IN, ":utf8";
local $/;
$str = <IN>;
($a, $b, $c, $d) = unpack("x5, A2, A10, A1 A2", $str);
But now I get an error message:
Wide character in print at ./script.pl line 8.
After these characters, unpack returns the wrong characters.
How can I solve this problem with wide characters? Can I somehow detect them, or should I use substr()?

Thanks a lot
Luca

Replies are listed 'Best First'.
Re: Wide character error: unpack vs substr
by BrowserUk (Patriarch) on Mar 31, 2006 at 19:56 UTC

    You could try starting the template with 'U0' to force UTF-8 interpretation. See the pack entry in perlfunc for a description.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
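The question also asks whether substr() is an alternative. A minimal sketch of both routes follows, assuming Perl 5.10 or later, where unpack counts characters on a decoded string (on the Perl 5.8 releases current when this thread was written, unpack worked on the bytes of the internal encoding, which would explain the shifted fields). The sample record and the sub names fields_by_substr and fields_by_unpack are made up for illustration; only the field widths come from the question.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical fixed-width record, as raw UTF-8 bytes:
# 5 filler characters, then fields of 2, 10, 1 and 2 characters.
# The 10-character field holds a euro sign (U+20AC, three bytes in UTF-8).
my $bytes = "01234" . "AB" . "EUR\xE2\x82\xAC" . (' ' x 6) . "X" . "YZ";

# Decode once, so that every later offset is in characters, not bytes.
my $str = decode('UTF-8', $bytes);

# Field extraction with substr; offsets and widths count characters.
sub fields_by_substr {
    my ($s) = @_;
    my @f = (substr($s, 5, 2), substr($s, 7, 10), substr($s, 17, 1), substr($s, 18, 2));
    s/\s+\z// for @f;    # mimic the trailing-space stripping of unpack's "A"
    return @f;
}

# The same layout as an unpack template (whitespace-separated, no commas).
sub fields_by_unpack {
    my ($s) = @_;
    return unpack('x5 A2 A10 A1 A2', $s);
}

# Encode on output; without a layer, printing U+20AC triggers
# "Wide character in print".
binmode STDOUT, ':encoding(UTF-8)';
print join('|', fields_by_substr($str)), "\n";
print join('|', fields_by_unpack($str)), "\n";
```

The "Wide character in print" message is a warning about the output handle, not about unpack itself, so the binmode call on STDOUT is what addresses it in this sketch.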
      What does U0 mean?

      Luca

        If you follow the links above to the perldoc pack page, you'll find this description:

        If the pattern begins with a U , the resulting string will be treated as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a string with an initial U0 , and the bytes that follow will be interpreted as Unicode characters. If you don't want this to happen, you can begin your pattern with C0 (or anything else) to force Perl not to UTF-8 encode your string, and then follow this with a U* somewhere in your pattern.
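To see what the two mode switches do in practice, here is a small sketch. It assumes Perl 5.10 or later, where (as far as I understand the current perlfunc) C0 selects character mode and U0 selects UTF-8 byte mode for the rest of the template; the description quoted above is from an older release, so behavior on 5.8 may differ. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Two Greek letters as a decoded character string (alpha, omega).
my $chars = "\x{3B1}\x{3C9}";

# C0 selects character mode: A* sees two characters.
my ($as_chars) = unpack('C0 A*', $chars);

# U0 selects UTF-8 byte mode: A* sees the four encoded bytes.
my ($as_bytes) = unpack('U0 A*', $chars);

# %vX prints the ordinal of each element in hex, joined by dots.
printf "C0: %vX (length %d)\n", $as_chars, length $as_chars;
printf "U0: %vX (length %d)\n", $as_bytes, length $as_bytes;
```

On such a Perl, the C0 line should report two elements and the U0 line four, which suggests that a leading U0 makes a fixed-width template count encoded bytes rather than characters.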
