Re: How to convert between Unicode codepoint and UTF8 character code on Perl?

TIMTOWTDI, but here's an illustrative test. See also "How to ask better questions using Test::More and sample data".

use strict;
use warnings;

use Encode 'encode';
use Unicode::Char;
use Test::More tests => 2;

is to_bytes ('0x03C0'), 'CF80', 'Code point to bytes (0x format)';
is to_bytes ('U+03C0'), 'CF80', 'Code point to bytes (U+ format)';

sub to_bytes {
    my $u = Unicode::Char->new;
    (my $in = shift) =~ s/^(?:0x|U\+)//;
    return uc unpack 'H*', encode 'UTF-8', $u->u ($in);
}

Translating bytes to code points is left as an exercise.
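For readers who want to check their answer, here is one possible sketch of the reverse direction. It is not from the original post: it assumes only the core Encode module, and the name `to_code_point` and the `U+XXXX` output format are made up for illustration.

```perl
use strict;
use warnings;

use Encode 'decode';

# Hypothetical helper: hex dump of UTF-8 bytes -> "U+XXXX" string.
# Assumes the input encodes exactly one character.
sub to_code_point {
    my $hex = shift;
    # pack 'H*' turns the hex dump back into raw bytes,
    # decode() turns those bytes into a character string.
    my $char = decode('UTF-8', pack('H*', $hex));
    # ord() gives the code point number of the first character.
    return sprintf 'U+%04X', ord $char;
}

print to_code_point('CF80'), "\n";   # prints U+03C0
```

Input holding more than one character would need a loop over the decoded string; this sketch only looks at the first character.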

🦛

Replies are listed 'Best First'.
Re^2: How to convert between Unicode codepoint and UTF8 character code on Perl? by wyt248er (Initiate) on Oct 25, 2021 at 04:34 UTC
@hippo My perl is newer than 5.8, so it can surely handle Unicode. However, it does not have the Unicode module installed: `use Unicode::Char;` fails with "Can't locate Unicode/Char.pm in @INC", and `use Unicode;` fails with "Can't locate Unicode.pm in @INC". Can you modify your subroutine `to_bytes` so that it will not use the Unicode module? Thank you.
Re^3: How to convert between Unicode codepoint and UTF8 character code on Perl? by hippo (Archbishop) on Oct 25, 2021 at 10:19 UTC
"Can you modify your subroutine `to_bytes` so that it will not use the Unicode module?"

Indeed I can, but there is no reason why I should, since Unicode::Char is publicly available for you to download and install. See Installing Modules if you do not know how to go about that. 🦛
Re^3: How to convert between Unicode codepoint and UTF8 character code on Perl? by Anonymous Monk on Oct 25, 2021 at 07:38 UTC
Yes, even you can use CPAN! Alternatively, use chr to transform code point numbers into wide characters.
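
A minimal sketch of the chr route, assuming only the core Encode module. The name `to_bytes_core` is hypothetical, chosen here so it does not clash with the `to_bytes` sub earlier in the thread; it accepts the same `0x` and `U+` input formats.

```perl
use strict;
use warnings;

use Encode 'encode';

# Hypothetical variant of to_bytes that avoids Unicode::Char.
sub to_bytes_core {
    # Strip a leading 0x or U+ prefix, leaving bare hex digits.
    (my $in = shift) =~ s/^(?:0x|U\+)//;
    # hex() gives the code point number, chr() the wide character,
    # encode() its UTF-8 bytes, unpack 'H*' a hex dump of those bytes.
    return uc unpack('H*', encode('UTF-8', chr(hex $in)));
}

print to_bytes_core('U+03C0'), "\n";   # prints CF80
```

The sketch does no input validation, so anything that is not a hex string will trigger a warning from hex().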