in reply to Re^6: UTF8 versus \w in pattern matching
in thread UTF8 versus \w in pattern matching
Using the formula my $re = qr/^([\/\w]+)/; as the pattern has the same problems.
To be clear, the test script I provided works just as well with this regex. It demonstrates that there is nothing wrong with the Perl code that does the regex matching, so the only logical conclusion is that your data is not what you think it is.
Are you decoding your UTF-8 data when you read it from the data files in your script? If not, that is the problem.
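Here is a minimal sketch of the difference decoding makes. The sample string is hypothetical (I don't have your data), and in-memory filehandles stand in for your data files. It assumes the default regex semantics, i.e. no /u or /a modifier and no `use feature 'unicode_strings'`.

```perl
use strict;
use warnings;

my $re = qr/^([\/\w]+)/;

# Hypothetical sample data: the UTF-8 bytes for "/página"
# (á is U+00E1, encoded as the two bytes 0xC3 0xA1).
my $bytes = "/p\xC3\xA1gina";

# Read the data without decoding, as a plain open() would.
open my $raw, '<:raw', \$bytes or die "open: $!";
my $undecoded = <$raw>;
close $raw;

# Read the same bytes through a decoding layer.
open my $utf8, '<:encoding(UTF-8)', \$bytes or die "open: $!";
my $decoded = <$utf8>;
close $utf8;

my ($raw_match)     = $undecoded =~ $re;
my ($decoded_match) = $decoded   =~ $re;

# Print lengths only, so no output encoding layer is needed.
printf "raw:     matched %d character(s)\n", length $raw_match;      # 2
printf "decoded: matched %d character(s)\n", length $decoded_match;  # 7
```

On the undecoded string \w stops at the first byte of the multi-byte sequence, so only "/p" is captured. On the decoded string á is a single word character and the whole path matches. With real files the same layer goes in the open call: `open my $fh, '<:encoding(UTF-8)', $file`.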
If you can provide a real SSCCE then I'm sure all will become clear.
🦛