in reply to Read and write UTF-8

All

Thank you for the thoughts. It works now, but it took a strange sequence of issues to get there. First, I had to remove the hyphen in "UTF-8"; whatever version I am using simply doesn't like that hyphen. Second, I had tabs in the input file and kept mistaking them for runs of blanks, so it was counting characters correctly all along (silly me).

Thank you for the suggestion to remove the BOM. That was important, as the BOM was being counted as a character too.
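
For anyone hitting the same two surprises, here is a minimal sketch of how the BOM and the tabs can be made visible while counting. The helper name `inspect_line` and the in-memory sample data are made up for illustration; a real script would open the actual input file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: takes one already-decoded line, drops a leading
# BOM (U+FEFF), and returns the character count plus a copy of the line
# with each tab shown as a literal \t.
sub inspect_line {
    my ($line) = @_;
    chomp $line;
    $line =~ s/\A\x{FEFF}//;               # the BOM decodes to one character
    (my $visible = $line) =~ s/\t/\\t/g;   # make tabs stand out from blanks
    return (length $line, $visible);
}

# In-memory stand-in for the input file (an assumption for this sketch):
# the UTF-8 BOM, then "a", a tab, and "e with acute" as UTF-8 bytes.
my $bytes = "\xEF\xBB\xBFa\t\xC3\xA9\n";
open my $in, '<:encoding(UTF-8)', \$bytes or die "Cannot open: $!";
binmode STDOUT, ':encoding(UTF-8)';

while (my $line = <$in>) {
    my ($count, $visible) = inspect_line($line);
    print "$count characters: $visible\n";
}
close $in;
```

Without the substitution that strips U+FEFF, the count for the first line of a BOM-prefixed file would be one higher, which matches what I was seeing.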

Thanks again.

Re^2: Read and write UTF-8
by choroba (Cardinal) on Oct 18, 2016 at 10:08 UTC
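    Note that UTF8 and UTF-8 aren't equivalent. UTF8 is Perl's lax encoding, which accepts code points beyond the Unicode range, while UTF-8 is the strict one, which rejects them: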
     $ perl -lwE 'binmode STDOUT, ":encoding(UTF8)"; print chr 10240000;'
    Code point 0x9C4000 is not Unicode, may not be portable at -e line 1.
    �����
     $ perl -lwE 'binmode STDOUT, ":encoding(UTF-8)"; print chr 10240000;'
    Code point 0x9C4000 is not Unicode, may not be portable at -e line 1.
    "\x{9c4000}" does not map to utf8 at -e line 1.
    \x{9C4000}
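
The same difference can be checked without going through an output layer. The following is only a sketch using the Encode module directly; the helper name `can_encode` is made up for illustration, and `FB_CROAK` is used so that an unmappable character raises an error instead of being substituted.

```perl
use strict;
use warnings;
no warnings 'utf8';   # silence "Code point ... is not Unicode" in this demo
use Encode qw(encode find_encoding FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns 1 if the named encoding can encode $char
# without croaking, 0 otherwise.
sub can_encode {
    my ($encoding_name, $char) = @_;
    my $copy = $char;   # work on a copy; LEAVE_SRC also keeps it unmodified
    my $ok = eval { encode($encoding_name, $copy, FB_CROAK | LEAVE_SRC); 1 };
    return $ok ? 1 : 0;
}

my $beyond_unicode = chr 0x9C4000;   # same code point as chr 10240000 above

for my $name (qw(UTF8 UTF-8)) {
    printf "%-5s resolves to %-12s can encode: %s\n",
        $name,
        find_encoding($name)->name,
        can_encode($name, $beyond_unicode) ? 'yes' : 'no';
}
```

The two names resolve to different encoding objects (`utf8` and `utf-8-strict`), which is why only the hyphenated form refuses the out-of-range code point.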
    

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,