in reply to Re: How to Use Pack to Convert UTF-16 Surrogate Pairs to UTF-8?

I still think it's likely the OP is taking the wrong approach by trying to hand-roll a JSON decoder, so I did want to point out a few issues with your code - hopefully thereby also pointing out some of the pitfalls of hand-rolled approaches.

Update: *sigh* Just because it can be done in one regex, still doesn't mean it should! But just for fun, enough rope to shoot oneself in the foot...

use warnings;
use strict;
use open qw/:std :encoding(UTF-8)/;
use Test::More tests=>1;
use Data::Dump qw/pp/;
use Encode qw/decode/;

# Test input: valid \u escapes in mixed case, plus two sequences that
# must be left alone (an escaped backslash before "u2660", and "\U1234").
my $str = "Ren\\u00e9 \\ud83D\\uDe06\\uDb40\\udDeF Fran\\u00E7oise"
    ." \\\\u2660\\U1234";

$str =~ s{
    (?|
        (?&U)(d[89ab][0-9a-f]{2})
        (?&U)(d[c-f][0-9a-f]{2})
    |
        (?&U)([0-9a-f]{4})
    )
    (?(DEFINE) (?<U> (?-i) (?<!\\)(?:\\\\)*\\u ) )
}{
    decode("UTF-16BE",
        pack("n*", hex $1, defined $2 ? hex $2 : ()),
        Encode::FB_CROAK)
}iexg;

is $str, "Ren\xE9 \x{1F606}\x{E01EF} Fran\xE7oise \\\\u2660\\U1234", $str
    or diag pp $str;
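
For contrast, here is a minimal sketch of what I mean by not hand-rolling it. It assumes the input really is the body of a single JSON string literal. The helper name decode_json_string_body is made up for this example and is not from the thread. It wraps the text in double quotes and hands it to JSON::PP, which ships with Perl and combines surrogate pairs itself.

```perl
use warnings;
use strict;
use JSON::PP ();

# Hypothetical helper (an assumption for illustration, not from the thread):
# treat $body as the contents of one JSON string literal, wrap it in quotes,
# and let a real JSON parser resolve the escapes and surrogate pairs.
sub decode_json_string_body {
    my ($body) = @_;
    # allow_nonref lets the decoder accept a bare string as the whole document
    return JSON::PP->new->allow_nonref->decode(qq{"$body"});
}

# Same escapes as in the regex test, minus the invalid "\U1234"
my $in  = "Ren\\u00e9 \\ud83D\\uDe06\\uDb40\\udDeF Fran\\u00E7oise \\\\u2660";
my $out = decode_json_string_body($in);
# $out now holds the decoded character string; the escaped backslash
# before "u2660" comes out as a single literal backslash.
```

I left "\U1234" out of the input on purpose: it is not a valid JSON escape, so a conforming parser should reject it instead of passing it through the way the regex does. The same goes for an unescaped double quote in the body. Arguably that is the behavior you want from a decoder.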