in reply to Re^3: length() miscounting UTF8 characters?

Right, thanks again! I hadn't thought about codepoints vs. characters, but I'll keep it in mind; combining accents and other diacritics in particular are something I might well encounter.

Searching CPAN shows that there's a module for this, Unicode::Normalize, which I'll look into.
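
For anyone following along, here is a minimal sketch of the codepoints-vs-characters difference as I understand it. It assumes the string is already decoded into Perl characters (not raw UTF-8 bytes), and grapheme_count is just a helper name I made up for illustration. NFC comes from Unicode::Normalize, and \X is the regex escape for an extended grapheme cluster.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC);

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two codepoints, one visible character
my $decomposed = "e\x{301}";

# NFC composes the pair into the single codepoint U+00E9 where a precomposed form exists
my $composed = NFC($decomposed);

# Hypothetical helper: count user-perceived characters by matching grapheme clusters
sub grapheme_count {
    my ($str) = @_;
    my $count = () = $str =~ /\X/g;
    return $count;
}

printf "decomposed: length=%d graphemes=%d\n",
    length($decomposed), grapheme_count($decomposed);
printf "composed:   length=%d graphemes=%d\n",
    length($composed), grapheme_count($composed);
```

As far as I can tell, normalizing to NFC only helps length() when a precomposed codepoint exists for the combination; counting \X matches should be the more general approach when the number of visible characters is what matters.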
