in reply to Different behaviour in characters in string vs. array?

You tell us the bytes are UTF-8 encoded characters, but you tell Perl they're iso-latin-1. Adding use utf8; will help. That tells Perl the source is encoded using UTF-8 rather than iso-latin-1.
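
A minimal sketch of the difference, assuming a two-letter Cyrillic sample (the hypothetical word "da", written with escapes so the snippet does not depend on how the file is saved). It uses the core Encode module to imitate what Perl sees with and without use utf8; it is an illustration, not the original poster's code.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample: two Cyrillic letters (U+0434, U+0430).
my $chars = "\x{434}\x{430}";

# Without "use utf8", a UTF-8 source literal reaches Perl as raw bytes,
# and each byte is treated as one iso-latin-1 character.
my $bytes = encode('UTF-8', $chars);

# With "use utf8", Perl decodes the literal, much like this explicit decode.
my $decoded = decode('UTF-8', $bytes);

printf "without use utf8: length %d\n", length($bytes);    # 4
printf "with use utf8:    length %d\n", length($decoded);  # 2
```

Each of the two letters takes two bytes in UTF-8, which is why the undecoded string reports twice the length.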

Re^2: Different behaviour in characters in string vs. array?
by Anonymous Monk on Dec 10, 2008 at 22:24 UTC

    Yes, I know about UTF-8 (and also about 'encoding') - I'm sorry I didn't say it like this. I would like to know why it is different between string and list. That is confusing to me.

      perldoc -f split
      A pattern matching the null string (not to be confused with a null pattern // , which is just one member of the set of patterns matching a null string) will split the value of EXPR into separate characters at each point it matches that way.

      Characters are one byte long unless you tell Perl the source is UTF-8, so split breaks every two-byte Russian character into a pair of single-byte characters.

      I would like to know why it is different between string and list

      qw() does split ' ' (separates words), not split // (separates characters). Had you used the former, you would have gotten the same result.

      You might think those two are the same in this case, but they're not because of the bug I identified.
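
      A rough sketch of why the two differ when use utf8 is missing. The input is a hypothetical pair of Cyrillic letters separated by a space, built with Encode so that the string holds the same UTF-8 bytes Perl would see in an undecoded source literal.

      ```perl
      use strict;
      use warnings;
      use Encode qw(encode);

      # Hypothetical input: U+0434, a space, U+0430, as raw UTF-8 bytes.
      my $bytes = encode('UTF-8', "\x{434} \x{430}");

      my @by_char = split //,  $bytes;   # one element per byte
      my @by_word = split ' ', $bytes;   # one element per word, like qw()

      printf "split //  gives %d elements\n", scalar @by_char;  # 5
      printf "split ' ' gives %d elements\n", scalar @by_word;  # 2
      ```

      split ' ' keeps the two bytes of each letter together, so each word still prints as the right character; split // separates them, and the halves no longer form valid UTF-8 on their own.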

        Ah! Now I understand. Thank you very much, ccn and ikegami!