in reply to Re: How to split, join and trim leading / trailing white space
in thread How to split, join and trim leading / trailing white space
On Perl v5.22 and higher, splitting on \b{gcb} (extended grapheme cluster boundary) instead of the empty pattern might be better, because it keeps combining marks attached to their base characters:
$ perl -CSD -le 'print map "-$_- ", split //, "u\x{0308}ber"'
-u- -̈- -b- -e- -r-
$ perl -CSD -le 'print map "-$_- ", split /\b{gcb}/, "u\x{0308}ber"'
-ü- -b- -e- -r-
$ perl -CSD -le 'print map "-$_- ", split //,
"k\x{0301}u\x{032D}o\x{0304}\x{0301}n"'
-k- -́- -u- -̭- -o- -̄- -́- -n-
$ perl -CSD -le 'print map "-$_- ", split /\b{gcb}/,
"k\x{0301}u\x{032D}o\x{0304}\x{0301}n"'
-ḱ- -ṷ- -ṓ- -n-
(If the 2nd and 4th outputs above aren't displaying correctly, as happens in my browser, they should be "-ü- -b- -e- -r-" and "-ḱ- -ṷ- -ṓ- -n-".)
As an alternative in Perl v5.12 and above, \X can be used. Update 2: E.g. split /\X\K(?=\X)/, ...
Update: Made last sentence more clear.
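For anyone who prefers a script to one-liners, here is a minimal sketch of the \X approach. The variable names are only illustrative, and the explicit binmode stands in for the -CSD switch used above. It shows two ways to get the clusters: matching \X in list context, and the split with \K mentioned above.

```perl
use strict;
use warnings;

# Encode output as UTF-8 (stands in for -CSD on the command line).
binmode STDOUT, ':encoding(UTF-8)';

# "uber" with a COMBINING DIAERESIS after the "u" (decomposed form).
my $str = "u\x{0308}ber";

# Option 1: collect every extended grapheme cluster with a global match.
my @by_match = $str =~ /\X/g;

# Option 2: split between clusters. \X consumes one cluster, \K drops it
# from the matched text, and the lookahead requires another cluster to follow.
my @by_split = split /\X\K(?=\X)/, $str;

print join(' ', map { "-$_-" } @by_match), "\n";
print join(' ', map { "-$_-" } @by_split), "\n";

# Both lists hold 4 elements; the first one is "u" plus its combining mark.
print scalar(@by_match), " ", scalar(@by_split), "\n";
```

The global match is probably the easier of the two to read; the split form is mainly useful when the rest of the code is already written around split.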