In reply to Re^2: Japanese: detect hiragana/katakana/full-width eisuuji
http://unicode.org/charts/PDF/UFF00.pdf
To detect:
| Character class | Matches |
|---|---|
| [\x{FF01}-\x{FF60}\x{FFE0}-\x{FFE6}] | Fullwidth ASCII variants, brackets and symbols |
| [\x{FF01}-\x{FF5E}] | Fullwidth ASCII variants |
| [\x{FF21}-\x{FF3A}] | Fullwidth ASCII uppercase letters |
| [\x{FF41}-\x{FF5A}] | Fullwidth ASCII lowercase letters |
| [\x{FF10}-\x{FF19}] | Fullwidth ASCII digits |
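
As a rough illustration of how these classes might be used, here is a minimal sketch. The helper name `describe_fullwidth` and the sample strings are made up for this example; only the character classes come from the list above. The samples are written with `\x{...}` escapes so the script does not depend on the source file's encoding.

```perl
use strict;
use warnings;

# Character classes taken from the list above.
my $fw_any   = qr/[\x{FF01}-\x{FF60}\x{FFE0}-\x{FFE6}]/;
my $fw_digit = qr/[\x{FF10}-\x{FF19}]/;

# Hypothetical helper: say which kind of fullwidth characters a string holds.
sub describe_fullwidth {
    my ($text) = @_;
    return 'no fullwidth characters' unless $text =~ $fw_any;
    return $text =~ $fw_digit
        ? 'fullwidth, including digits'
        : 'fullwidth, no digits';
}

# Fullwidth "A1", fullwidth "A", and plain ASCII "ABC123".
print describe_fullwidth($_), "\n"
    for "\x{FF21}\x{FF11}", "\x{FF21}", "ABC123";
```

Input read from a file or socket would need to be decoded to Perl characters first (for example with an `:encoding(UTF-8)` layer), otherwise the classes are compared against raw bytes and will not match.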
To convert:
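    # Map each fullwidth character to its narrow counterpart.
    my %fullwidth_to_narrow = map chr, (
        # U+FF01..U+FF5E sit at a fixed offset from ASCII 0x21..0x7E.
        ( map { $_ => $_ - 0xFF01 + 0x21 } 0xFF01 .. 0xFF5E ),
        # Brackets and symbols have no fixed offset; list them one by one.
        0xFF5F => 0x2985, 0xFF60 => 0x2986,
        0xFFE0 => 0x00A2, 0xFFE1 => 0x00A3, 0xFFE2 => 0x00AC, 0xFFE3 => 0x00AF,
        0xFFE4 => 0x00A6, 0xFFE5 => 0x00A5, 0xFFE6 => 0x20A9,
    );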
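
The hash only defines the mapping, so here is one way it could be applied, as a sketch rather than the only approach. The helper name `to_narrow` is hypothetical, and the table is repeated so the snippet runs on its own. The substitution uses the widest character class from the detection list, so every character it matches has an entry in the hash.

```perl
use strict;
use warnings;

# Same table as above, repeated so this sketch is self-contained.
my %fullwidth_to_narrow = map chr, (
    ( map { $_ => $_ - 0xFF01 + 0x21 } 0xFF01 .. 0xFF5E ),
    0xFF5F => 0x2985, 0xFF60 => 0x2986,
    0xFFE0 => 0x00A2, 0xFFE1 => 0x00A3, 0xFFE2 => 0x00AC, 0xFFE3 => 0x00AF,
    0xFFE4 => 0x00A6, 0xFFE5 => 0x00A5, 0xFFE6 => 0x20A9,
);

# Hypothetical helper: swap each fullwidth character for its narrow form
# and leave everything else untouched.
sub to_narrow {
    my ($text) = @_;
    $text =~ s/([\x{FF01}-\x{FF60}\x{FFE0}-\x{FFE6}])/$fullwidth_to_narrow{$1}/g;
    return $text;
}

# Fullwidth "Hello123", written with escapes to avoid source-encoding issues.
my $narrow = to_narrow(
    "\x{FF28}\x{FF45}\x{FF4C}\x{FF4C}\x{FF4F}\x{FF11}\x{FF12}\x{FF13}"
);
print "$narrow\n";    # prints "Hello123"
```

Note that some targets (for example U+2985 and U+20A9) are outside ASCII, so printing a converted string that contains them needs an output encoding layer such as `binmode STDOUT, ':encoding(UTF-8)'`.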