in reply to Re^2: Unicode regular expressions
in thread Unicode regular expressions
As for matching Unicode letters, we have:
"ญᴥ一ךى" =~ /^\p{L}+$/
which is a sequence of (Unicode) letters, but from 5 different scripts. Do you want to match that?
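If you do want to reject such mixtures, one possible approach is to look up the script of each character and require that only one turns up. What follows is a minimal sketch, not a definitive answer: the helper names scripts_in and single_script_letters are made up for illustration, and it assumes that charscript from the core Unicode::UCD module is good enough for your purposes. The test string is the same one as above, written with \x{...} escapes so the source stays plain ASCII.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charscript);

# Hypothetical helper: sorted list of the distinct scripts used in $str.
sub scripts_in {
    my ($str) = @_;
    my %seen;
    for my $char (split //, $str) {
        my $script = charscript(ord $char);
        # Older versions may return undef for unassigned code points.
        $seen{ defined $script ? $script : 'Unknown' } = 1;
    }
    my @scripts = sort keys %seen;
    return @scripts;
}

# Hypothetical helper: true if $str consists only of letters,
# and all of those letters belong to one script.
sub single_script_letters {
    my ($str) = @_;
    return 0 unless $str =~ /^\p{L}+$/;
    my @scripts = scripts_in($str);
    return @scripts == 1 ? 1 : 0;
}

# The five-letter string from above: Thai, Latin, Han, Hebrew, Arabic.
my $mixed = "\x{0E0D}\x{1D25}\x{4E00}\x{05DA}\x{0649}";
my @found = scripts_in($mixed);
print "letters only: ", ($mixed =~ /^\p{L}+$/ ? "yes" : "no"), "\n";
print "scripts: @found\n";
print "single script: ", (single_script_letters($mixed) ? "yes" : "no"), "\n";
```

If you already know which script you expect, a property such as \p{Latin} or \p{Hebrew} in the regex itself is probably simpler than a lookup per character.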
And then I haven't touched the can of worms called 'combining sequences'. Many (all?) of the accented Unicode characters can also be formed by taking the base character and adding the various decorations to it. Not to mention that most combinations of a base character and decorations don't have a Unicode code point of their own, and have to be written as combining sequences.
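To make the combining sequence problem concrete, here is a small sketch using the core Unicode::Normalize module. The choice of characters is mine: U+0107 (c with acute) has a precomposed form, while q with acute, as far as I know, does not. The variable names and the letters-plus-marks pattern are only an illustration of one way to deal with this, not the one true regex.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $precomposed = "\x{107}";     # LATIN SMALL LETTER C WITH ACUTE
my $combining   = "c\x{301}";    # 'c' followed by COMBINING ACUTE ACCENT
my $no_single   = "q\x{301}";    # 'q' with acute: assumed to have no precomposed form

# Letters only; a combining mark is \p{M}, not \p{L}.
my $letters_only = qr/^\p{L}+$/;
# Letters, each optionally followed by combining marks.
my $letters_marks = qr/^(?:\p{L}\p{M}*)+$/;

for my $str ($precomposed, $combining, $no_single) {
    my $graphemes = () = $str =~ /\X/g;    # count extended grapheme clusters
    printf "code points: %d, graphemes: %d, letters only: %s, letters+marks: %s\n",
        length($str), $graphemes,
        ($str =~ $letters_only  ? "yes" : "no"),
        ($str =~ $letters_marks ? "yes" : "no");
}

# Normalizing to NFC composes what can be composed, and leaves the rest alone.
print "NFC composes c + acute: ",
    (NFC($combining) eq $precomposed ? "yes" : "no"), "\n";
print "NFC length of q + acute: ", length(NFC($no_single)), "\n";
```

Normalizing to NFC before matching helps for the first kind of sequence, but not for the second, so a pattern that only knows about \p{L} will still reject some things a reader would call a letter.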