So there's nothing wrong with your version of perl: it correctly matches the UTF-8 accented characters with \p{Word}, and presumably also with \w if you change the value of $re thus: my $re = qr/^([\/\w]+)/;
Are you definitely decoding the contents of these files when you read them in your perl script?
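To illustrate why decoding matters, here is a minimal sketch (not your actual script) that runs the \w version of $re against the same text twice: once as raw UTF-8 bytes and once after decoding. The sample string, the helper name first_match and the variable names are made up for the demonstration. It assumes the script does not enable the unicode_strings feature (so no "use v5.12" or later).

```perl
use strict;
use warnings;
use Encode qw(decode);

# The regex from above, using \w instead of \p{Word}.
my $re = qr/^([\/\w]+)/;

# Raw UTF-8 bytes for "/café/menu", as an undecoded read would deliver them.
my $bytes = "/caf\xC3\xA9/menu";

# The same text as a decoded character string, which is what a filehandle
# opened with '<:encoding(UTF-8)' would hand back.
my $chars = decode('UTF-8', $bytes);

# Hypothetical helper: return the first capture of $re, or undef if no match.
sub first_match {
    my ($string) = @_;
    return $string =~ $re ? $1 : undef;
}

my $from_bytes = first_match($bytes);   # stops before the accented character
my $from_chars = first_match($chars);   # covers the whole path

binmode STDOUT, ':encoding(UTF-8)';
print "undecoded: $from_bytes\n";
print "decoded:   $from_chars\n";
```

On undecoded bytes the match stops at "/caf" because the two bytes of the accented character are not word characters under perl's default rules for byte strings. After decoding, the same pattern captures the whole path. If your script shows the first behaviour, the input is most likely not being decoded.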
It might also be worth checking the actual bytes in the data files with e.g. hexdump.
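If hexdump isn't to hand, the same check can be sketched in perl itself. The helper name hex_bytes and the two sample strings below are purely illustrative. The point is only to show what to look for: an accented character stored as UTF-8 appears as two bytes, whereas in a single-byte encoding such as ISO-8859-1 it appears as one.

```perl
use strict;
use warnings;

# Hypothetical helper: render each byte of a raw string as two hex digits.
sub hex_bytes {
    my ($raw) = @_;
    return join ' ', map { sprintf '%02x', $_ } unpack 'C*', $raw;
}

my $utf8_sample   = "caf\xC3\xA9";   # "café" encoded as UTF-8
my $latin1_sample = "caf\xE9";       # "café" encoded as ISO-8859-1

print hex_bytes($utf8_sample),   "\n";   # 63 61 66 c3 a9
print hex_bytes($latin1_sample), "\n";   # 63 61 66 e9
```

For a real file you would read it through a ':raw' layer before passing the contents to the helper, so that no decoding happens on the way in.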
🦛
In reply to Re^5: UTF8 versus \w in pattern matching
by hippo
in thread UTF8 versus \w in pattern matching
by mldvx4