in reply to How to convert between Unicode codepoint and UTF8 character code on Perl?

TIMTOWTDI, but here's an illustrative test. See also: How to ask better questions using Test::More and sample data.

use strict;
use warnings;
use Encode 'encode';
use Unicode::Char;
use Test::More tests => 2;

is to_bytes ('0x03C0'), 'CF80', 'Code point to bytes (0x format)';
is to_bytes ('U+03C0'), 'CF80', 'Code point to bytes (U+ format)';

# Convert a hex code point (with optional 0x or U+ prefix)
# to the uppercase hex of its UTF-8 encoded bytes.
sub to_bytes {
    my $u = Unicode::Char->new;
    (my $in = shift) =~ s/^(?:0x|U\+)//;    # strip the prefix
    return uc unpack 'H*', encode 'UTF-8', $u->u ($in);
}

Translating bytes to code points is left as an exercise.


🦛
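
For readers who want to check their answer to the exercise, here is one possible sketch of the reverse direction. It assumes the input is the hex of exactly one UTF-8 encoded character; the name to_codepoint is made up for this illustration and is not part of any module. It uses only the core Encode module.

```perl
use strict;
use warnings;
use Encode 'decode';

# Hypothetical helper: hex string of UTF-8 bytes -> "U+XXXX" code point.
# Only the first decoded character is reported.
sub to_codepoint {
    my $hex   = shift;
    my $bytes = pack 'H*', $hex;    # hex digits -> raw bytes
    # FB_CROAK makes decode die on malformed UTF-8 rather than
    # silently substituting U+FFFD.
    my $char  = decode('UTF-8', $bytes, Encode::FB_CROAK);
    return sprintf 'U+%04X', ord $char;
}

print to_codepoint('CF80'), "\n";    # prints U+03C0
```

The decode step is what turns bytes back into a character; ord then yields its code point as a number, which sprintf formats in the usual U+ notation.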

Re^2: How to convert between Unicode codepoint and UTF8 character code on Perl?
by wyt248er (Initiate) on Oct 25, 2021 at 04:34 UTC

    @hippo

    My perl is newer than 5.8, so it can surely handle Unicode. However, it does not have the Unicode module, so `use Unicode::Char;` fails with the error "Can't locate Unicode/Char.pm in @INC". Likewise, `use Unicode;` fails with "Can't locate Unicode.pm in @INC".

    Can you modify your subroutine `to_bytes` so that it will not use the Unicode module?

    Thank you.

      Can you modify your subroutine `to_bytes` so that it will not use the Unicode module?

      Indeed I can but there is no reason why I should since Unicode::Char is publicly available for you to download and install. See Installing Modules if you do not know how to go about that.


      🦛
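
Where installing from CPAN really is not an option, the same conversion can be sketched with core Perl alone. This is an illustrative variant, not the code from the post above: it assumes that the built-ins hex and chr are an acceptable stand-in for Unicode::Char, and the name to_bytes_core is invented here.

```perl
use strict;
use warnings;
use Encode 'encode';

# Hypothetical core-only variant of to_bytes: hex() parses the code
# point and chr() builds the character, so no CPAN module is needed.
sub to_bytes_core {
    (my $in = shift) =~ s/^(?:0x|U\+)//;    # strip 0x or U+ prefix
    return uc unpack 'H*', encode('UTF-8', chr hex $in);
}

print to_bytes_core('U+03C0'), "\n";    # prints CF80
```

Encode has shipped with perl since 5.8, so this should run on the perl described in the question.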