in reply to \b in Unicode regex
G'day Arik123,
Two pieces of information, from perlrebackslash, to note.
From the "Character classes" section:
"\w is a character class that matches any single word character (letters, digits, Unicode marks, and connector punctuation (like the underscore))." [my emphasis]
From the "Assertions" section:
"\b ... matches at any place between a word (something matched by \w) and a non-word character" [my emphasis again]
In your reply with actual data, you're effectively trying to match "XXXXX", which occurs in your string as "_XXXXX.". Both '_' and 'X' match "\w", so there is no word/non-word transition between them: "\b" does not match between '_' and 'X'.
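Here's a minimal sketch to show the effect. The "_XXXXX." fragment is the one from your data; the ".XXXXX." string is a contrast case I've made up for illustration, and the variable names are mine:

```perl
use strict;
use warnings;

# "_XXXXX." is the fragment from the thread;
# ".XXXXX." is a hypothetical contrast case (an assumption, not from the original data).
my $underscore = '_XXXXX.' =~ /\bXXXXX\b/ ? 1 : 0;
my $dot        = '.XXXXX.' =~ /\bXXXXX\b/ ? 1 : 0;

# Confirm that the underscore itself is a word character.
my $is_word    = '_' =~ /\A\w\z/ ? 1 : 0;

print "'_' before XXXXX, \\b matches: $underscore\n";   # 0
print "'.' before XXXXX, \\b matches: $dot\n";          # 1
print "'_' matches \\w: $is_word\n";                    # 1
```

The only difference between the two strings is the character in front of the first 'X', and only the non-word '.' gives "\b" a boundary to match at.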
As already demonstrated twice[1,2], there is no Unicode issue here.
— Ken