PerlMonks  

Re: How to Use Pack to Convert UTF-16 Surrogate Pairs to UTF-8?

by NERDVANA (Deacon)
on Jun 09, 2022 at 01:01 UTC ( [id://11144533]=note )


in reply to How to Use Pack to Convert UTF-16 Surrogate Pairs to UTF-8?

If you don't know the encoding of your input, a cheap hack to "fix it" is utf8::decode($string), which decodes the string in place. Call it multiple times if you think the input might be multiply UTF-8 encoded. Strictly speaking, this is wrong, and could damage real Unicode strings that happen to look like UTF-8 sequences. Practically speaking, it just "fixes things" and you can get on with the rest of your work.
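A rough sketch of that hack, for illustration only. The helper name guess_decode and the $max_rounds guard are made up for this example; the only real API used is utf8::decode, which returns false when the string is not well-formed UTF-8.

```perl
use strict;
use warnings;

# Hypothetical helper (not a core function): decode in place until the
# string stops looking like UTF-8 or stops changing.
sub guess_decode {
    my ($string, $max_rounds) = @_;
    $max_rounds //= 3;                 # arbitrary guard against looping
    for (1 .. $max_rounds) {
        my $before = $string;
        # utf8::decode returns false if the octets are not valid UTF-8.
        last unless utf8::decode($string);
        last if $string eq $before;    # plain ASCII, nothing left to do
    }
    return $string;
}

my $once  = "caf\xC3\xA9";             # "café" as UTF-8 bytes
my $twice = "caf\xC3\x83\xC2\xA9";     # those bytes UTF-8 encoded again

for my $input ($once, $twice) {
    my $fixed = guess_decode($input);
    print join(' ', map { sprintf 'U+%04X', ord } split //, $fixed), "\n";
}
```

The same loop also shows the danger: a string that really was meant to be the two characters U+00C3 U+00A9 comes back as the single character U+00E9.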

The only *correct* way to decode things is to know the encoding that was given to your program, then use the Encode module. BTW, the Encode module is a core Perl module, and not something you should try to avoid.
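For comparison, a minimal sketch of the Encode route. It assumes the input is known to be UTF-16BE octets; the sample bytes are invented for the example, and you would substitute whatever encoding your input actually uses.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed sample input: U+1F600 as UTF-16BE, i.e. the surrogate pair
# D83D DE00 stored as four octets.
my $utf16_bytes = "\xD8\x3D\xDE\x00";

# Octets -> characters. FB_CROAK makes malformed input die instead of
# being silently replaced; LEAVE_SRC keeps $utf16_bytes unmodified.
my $chars = decode('UTF-16BE', $utf16_bytes,
                   Encode::FB_CROAK | Encode::LEAVE_SRC);

# Characters -> UTF-8 octets for output.
my $utf8_bytes = encode('UTF-8', $chars);

printf "code point: U+%X\n", ord $chars;
printf "UTF-8 octets: %s\n", join ' ', map { sprintf '%02X', ord } split //, $utf8_bytes;
```

Encode combines the surrogate pair for you here, so no manual arithmetic is needed when the encoding is known.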

As a side note, I would use chr(hex $1) instead of pack("U", hex($1)).
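Something like the following shows chr(hex $1) in a substitution. This is a sketch that assumes the input carries literal \uXXXX escapes; the helper name unescape_u and the sample text are made up, and the surrogate-pair step is my own addition rather than anything from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: turn literal \uXXXX escapes into characters.
sub unescape_u {
    my ($text) = @_;

    # Combine each high/low surrogate pair into one code point first.
    $text =~ s{\\u([Dd][89ABab][0-9A-Fa-f]{2})\\u([Dd][C-Fc-f][0-9A-Fa-f]{2})}
              {chr(0x10000 + ((hex($1) - 0xD800) << 10) + (hex($2) - 0xDC00))}ge;

    # Remaining escapes map straight to a character with chr(hex $1).
    $text =~ s{\\u([0-9A-Fa-f]{4})}{chr(hex $1)}ge;

    return $text;
}

# Single quotes keep the backslashes literal in this sample.
my $out = unescape_u('snowman: \u2603, grin: \uD83D\uDE00');
print join(' ', map { sprintf 'U+%04X', ord } split //, $out), "\n";
```

The result is a character string; run it through Encode's encode('UTF-8', ...) before writing it out as bytes.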
