in reply to Malformed UTF-8 character

The utf8 pragma lets you use UTF-8 in your source code (e.g. variable names, subroutine names, strings, etc.); it is not aimed at converting incoming data. If you have UTF-8 identifiers, you don't need to convert them; they are already (presumably) in the right format.
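To illustrate the distinction, here is a minimal sketch, assuming the script file itself is saved as UTF-8. The variable names are hypothetical, and the "incoming" bytes are simulated with encode() rather than read from a real file or socket.

```perl
use strict;
use warnings;
use utf8;                      # string literals in this file are UTF-8
use Encode qw(decode encode);

# A literal in the source: already a character string thanks to "use utf8".
my $literal = "text – abcd";

# Simulated incoming data: raw UTF-8 bytes, as they would arrive from a
# file or socket opened without a decoding layer (illustration only).
my $bytes = encode('UTF-8', $literal);

# "use utf8" has no effect on $bytes; incoming data must be decoded explicitly.
my $decoded = decode('UTF-8', $bytes);

binmode STDOUT, ':encoding(UTF-8)';
printf "literal: %d characters\n", length $literal;
printf "bytes:   %d bytes\n",      length $bytes;
printf "decoded equals literal: %s\n", $decoded eq $literal ? 'yes' : 'no';
```

The en dash counts as one character in the literal but takes three bytes in the encoded form, which is why the two lengths differ.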

Or did I miss your point?

Re^2: Malformed UTF-8 character (unexpected continuation byte 0x96, with no preceding start byte)
by Yllar (Novice) on Aug 07, 2015 at 09:09 UTC

    Thanks for your input. The issue here is that the input is in UTF-8 format. In my code we have two statements, $data and $data2. If you observe them, there is a difference between the two statements.

    In the first statement, '-' is printed as is after decoding, whereas in the second statement '–' is not printed as is after decoding: '–' is replaced with another character such as ^. I need to print the second statement as is.

    Thank you.

      It seems to work for me:
      $ perl -E 'use utf8; my $data = "text - abcd"; say "$data";'
      text - abcd
      $
      $ perl -E 'use utf8; my $data = "text – abcd"; print "$data";'
      Wide character in print at -e line 1.
      text – abcd
      $
      $ perl -E 'use utf8; my $data = "text – abcd"; binmode STDOUT, ":utf8"; say "$data";'
      text – abcd
      It might be hard to see the difference on the screen, but I zoomed in on the output and I can confirm that I really printed two different species of dash.
      The code you posted does no printing -- did you remember to binmode?
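Rather than zooming in on the screen, the two dashes can be told apart by their code points. The following is only a sketch under the assumption that the script is saved as UTF-8; dash_code_point is a hypothetical helper, not something from the original post.

```perl
use strict;
use warnings;
use utf8;                             # string literals in this file are UTF-8

binmode STDOUT, ':encoding(UTF-8)';   # encode characters as UTF-8 on output

# Hypothetical helper: code point of the character at index 5,
# which is where the dash sits in both sample strings.
sub dash_code_point {
    my ($string) = @_;
    return ord substr $string, 5, 1;
}

for my $data ("text - abcd", "text – abcd") {
    printf "%s => U+%04X\n", $data, dash_code_point($data);
}
```

The plain hyphen-minus should report as U+002D and the en dash as U+2013. Without the binmode line, the second print would trigger the "Wide character in print" warning shown above.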

        I am sorry for that. Here is the output for the above statements.

        text - abcd

        text ΤΗτ abcd

        As you can see, the second statement prints weird characters. Instead of printing 'text – abcd' it prints 'text ΤΗτ abcd'.

        Any help would be appreciated greatly. Thank you