in reply to Arabic to Hex and Hex to Arabic
G'day thanos1983,
You can get the code points, without firing up the regex engine, like this:
$ perl -Mutf8 -C -E 'my $x = "ﻟﻠﺒﻴﻊ"; say sprintf "%x", ord substr $x, $_, 1 for 0 .. length($x) - 1'
fedf
fee0
fe92
fef4
feca
I don't speak, read or write Arabic; however, checking against Unicode's (PDF) code chart "Arabic Presentation Forms-B", these certainly appear correct.
You asked about getting a "0x" prefix. You can do that with sprintf by changing "%x" to "%#x".
$ perl -Mutf8 -C -E 'my $x = "ﻟﻠﺒﻴﻊ"; say sprintf "%#x", ord substr $x, $_, 1 for 0 .. length($x) - 1'
0xfedf
0xfee0
0xfe92
0xfef4
0xfeca
I don't know anything about UCS, so I might be missing something here. The output you show under "UCS-2" is just the code points from my first one-liner, split into pairs of hex digits (which, obviously, you could get with substr, still without needing a regex).
I accidentally generated what you show as "UTF-8" output, when I initially wrote that first one-liner, because I forgot to add the utf8 pragma.
$ perl -C -E 'my $x = "ﻟﻠﺒﻴﻊ"; say sprintf "%x", ord substr $x, $_, 1 for 0 .. length($x) - 1'
ef
bb
9f
ef
bb
a0
ef
ba
92
ef
bb
b4
ef
bb
8a
Anyway, knowing neither Arabic nor UCS, I don't want to draw any inferences from that output. It might, however, provide you with some insights.
The second part of your title was "... Hex to Arabic". Simply printing the hex output I first got, written as \x{...} escapes, gives me the original Arabic string.
$ perl -C -E 'say "\x{fedf}\x{fee0}\x{fe92}\x{fef4}\x{feca}"'
ﻟﻠﺒﻴﻊ
P.S. I'm using 5.26.0.
— Ken
Re^2: Arabic to Hex and Hex to Arabic
by afoken (Chancellor) on Jul 29, 2017 at 10:58 UTC
Re^2: Arabic to Hex and Hex to Arabic
by thanos1983 (Parson) on Jul 29, 2017 at 16:02 UTC