in reply to uparse - Parse Unicode strings
The penguin is part of my prompt
Download uchar
tux 🐧 uchar --help
usage: uchar -v [-m base:count [ -m base:count ] ...
       uchar -v -f char ...
       perl 5.38.0 with Unicode 15.0.0
    -m  show maps
    -v  verbosity
    -l  list GBA characters
    -f  find
    -F  find (only chars supported in current font)
    -s  splash all characters found into a single string
    -k  show matching key combo(s)
    -d  apply random diacricals
    -e  show character encodings (uchar -e -f u_BREVE)
    -o  also show octal version of encoding
    -E  show character decodings (uchar -E fc)
    -b  strip to base
    -D  show codepoints in decimal
    -c  copy found string(s) to clipboard
    -h  also show html entity if available
tux 🐧 uchar -v X🩼X
X U00058 \N{LATIN CAPITAL LETTER X}
🩼 U1fa7c \N{CRUTCH}
X U00058 \N{LATIN CAPITAL LETTER X}
tux 🐧 uchar -v U+1f427
🐧 U1f427 \N{PENGUIN}
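For readers without uchar at hand, the per-character listing above can be approximated with core Perl. This is only a sketch of the idea, not how uchar itself is implemented: `describe_chars` is a hypothetical helper, and the output format is copied from the transcript.

```perl
use strict;
use warnings;
use charnames ();

# Hypothetical helper (not part of uchar): one line per character,
# shaped like the `uchar -v` output shown above.
sub describe_chars {
    my ($str) = @_;
    return map {
        my $cp   = ord $_;
        # viacode returns undef when this perl's Unicode tables lack the code point
        my $name = charnames::viacode($cp) // "<unnamed>";
        sprintf '%s U%05x \N{%s}', $_, $cp, $name;
    } split //, $str;
}

binmode STDOUT, ":encoding(UTF-8)";
print "$_\n" for describe_chars("X\x{1F427}X");
```

The name lookup depends on the Unicode version bundled with the running perl, so a recent character such as the crutch may fall back to `<unnamed>` on older installations.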
tux 🐧 uchar -e U+1f427
🐧 U1f427 \N{PENGUIN}
cp1026 6f
cp1047 6f
cp37 6f
cp424 6f
cp500 6f
cp875 6f
gb12345-raw 22
gb2312-raw 22
hz 22
iso-2022-kr 1b2429435c787b31663432377d
iso-ir-165 22
jis0208-raw 20
jis0212-raw 22
ksc5601-raw 22
posix-bc 6f
UCS-2BE fffd
UCS-2LE fdff
UTF-16 feffd83ddc27
UTF-16BE d83ddc27
UTF-16LE 3dd827dc
UTF-32 0000feff0001f427
UTF-32BE 0001f427
UTF-32LE 27f40100
UTF-7 2b324433634a772d
utf-8-strict f09f90a7
utf8 f09f90a7
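The UTF rows of that listing can be reproduced with the core Encode module. A minimal sketch, assuming the hex column is simply the encoded bytes dumped with `unpack "H*"`; `hex_encoding` is a hypothetical helper, not uchar's code. The short values for the legacy code pages (6f, 22) and the fffd for UCS-2 look like substitution characters, since the penguin has no mapping in those encodings, but that is my reading of the output rather than something uchar states.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: hex dump of a single code point in the given encoding.
sub hex_encoding {
    my ($encoding, $codepoint) = @_;
    return unpack "H*", encode($encoding, chr $codepoint);
}

# Print the penguin (U+1F427) in the BOM-less UTF encodings.
printf "%-10s %s\n", $_, hex_encoding($_, 0x1F427)
    for qw(UTF-8 UTF-16BE UTF-16LE UTF-32BE UTF-32LE);
```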
tux 🐧 uchar -E f09f90a7 | grep utf
utf-8-strict 🐧
utf8 🐧 (U+1F427)
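Going the other way is just as short in plain Perl. Again a sketch under the assumption that the argument is a hex string of raw bytes; `decode_hex` is a hypothetical name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn a hex string into bytes, then decode those bytes.
sub decode_hex {
    my ($encoding, $hex) = @_;
    return decode($encoding, pack "H*", $hex);
}

my $char = decode_hex("UTF-8", "f09f90a7");
printf "U+%04X\n", ord $char;    # prints U+1F427
```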
tux 🐧 uchar -Fk "L WITH STROKE"
Searching for (?^u:\bL WITH STROKE\b)
000141 Ł LSTROKE_IDX LATIN CAPITAL LETTER L WITH STROKE
#<Multi_key> <L> <minus>
#<Multi_key> <minus> <L>
<Multi_key> <L> <slash>
<Multi_key> <L> <underscore>
<Multi_key> <slash> <L>
<Multi_key> <underscore> <L>
000142 ł lSTROKE_IDX LATIN SMALL LETTER L WITH STROKE
#<Multi_key> <l> <minus>
#<Multi_key> <minus> <l>
<Multi_key> <l> <slash>
<Multi_key> <l> <underscore>
<Multi_key> <slash> <l>
<Multi_key> <underscore> <l>
tux $ perl -CEO -wE'say "\x{1F468}\x{1F3FD}\x{200D}\x{2708}\x{FE0F}"'
👨🏽✈️
tux $ raku -e'"\x[1F468]\x[1F3FD]\x[200D]\x[2708]\x[FE0F]".say'
👨🏽✈️
tux $ raku -e'"\x[1F468]\x[1F3FD]\x[200D]\x[2708]\x[FE0F]".say' | xarg +s uchar -v
👨 U1f468 \N{MAN}
🏽 U1f3fd \N{EMOJI MODIFIER FITZPATRICK TYPE-4}
U0200d \N{ZERO WIDTH JOINER}
✈ U02708 \N{AIRPLANE}
️ U0fe0f \N{VARIATION SELECTOR-16}
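The same string can be taken apart in Perl itself: `split //` yields the five code points listed above, while `\X` matches extended grapheme clusters. A small sketch with hypothetical helper names; on a perl whose Unicode tables include the emoji ZWJ rule (Unicode 11 or later) I would expect the whole pilot to come back as a single cluster, but the exact count depends on the perl version.

```perl
use strict;
use warnings;

# The pilot emoji from the one-liners above, as explicit code points.
my $pilot = "\x{1F468}\x{1F3FD}\x{200D}\x{2708}\x{FE0F}";

# Hypothetical helpers, not part of uchar.
sub codepoints { my ($s) = @_; return map { sprintf("U+%04X", ord $_) } split //, $s }
sub graphemes  { my ($s) = @_; return $s =~ /\X/g }

my @cps      = codepoints($pilot);
my @clusters = graphemes($pilot);

printf "%d code points: %s\n", scalar @cps, "@cps";
printf "%d grapheme cluster(s)\n", scalar @clusters;
```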