PerlMonks
How to Use Pack to Convert UTF-16 Surrogate Pairs to UTF-8?
by WingedKnight (Novice)
on Jun 09, 2022 at 00:26 UTC ( [id://11144531]=perlquestion )
WingedKnight has asked for the wisdom of the Perl Monks concerning the following question:

I have input strings in which some characters are written as UTF-16 code units escaped with '\u'. I am trying to convert all the strings to UTF-8. For example, the string 'Alice & Bob & Carol' might be formatted in the input as:

    'Alice \u0026 Bob \u0026 Carol'

To do the conversion, I was using:

    $str =~ s/\\u([A-Fa-f0-9]{4})/pack("U", hex($1))/eg;

This worked fine until I got to input strings that contained UTF-16 surrogate pairs, like:

    'Alice \ud83d\ude06 Bob'

How do I modify the above code that uses pack so that it works with UTF-16 surrogate pairs? I would really like a solution that uses only pack, without any additional libraries (JSON::XS, Encode, etc.).
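One possible approach, offered as a sketch rather than a definitive answer: match a high surrogate (D800–DBFF) immediately followed by a low surrogate (DC00–DFFF) first, combine the two into a single code point with the standard UTF-16 formula, and fall back to the original single-escape substitution otherwise. The function name decode_u_escapes is hypothetical, and the sketch assumes that lone (unpaired) surrogates need no special handling.

```perl
use strict;
use warnings;

# Hypothetical helper: decode \uXXXX escapes, merging UTF-16 surrogate pairs.
# Assumption: unpaired surrogates are not treated specially; they fall through
# to the single-escape branch like any other \uXXXX.
sub decode_u_escapes {
    my ($str) = @_;
    $str =~ s{\\u([Dd][89ABab][0-9A-Fa-f]{2})\\u([Dd][C-Fc-f][0-9A-Fa-f]{2})|\\u([0-9A-Fa-f]{4})}{
        defined $3
            # Plain \uXXXX escape: same as the original one-liner.
            ? pack("U", hex($3))
            # High + low surrogate: 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
            : pack("U", 0x10000 + ((hex($1) - 0xD800) << 10) + (hex($2) - 0xDC00))
    }eg;
    return $str;
}

my $s = decode_u_escapes('Alice \ud83d\ude06 Bob');
printf "U+%X\n", ord(substr($s, 6, 1));   # prints U+1F606
```

Note that pack("U", ...) yields a Perl character string. If UTF-8 encoded bytes are what is actually needed, the built-in utf8::encode($s) converts the string in place and is available without loading any module.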