in thread Any good ways to handle NARROW NO-BREAK SPACE characters in regex in newer versions of Perl?
my @files = glob("*");
Here, the filenames you read in are byte strings, exactly as the filesystem stores them. Dump them to see the raw bytes:
use feature 'say';          # say() is not enabled by default
use Data::Dumper;
$Data::Dumper::Useqq = 1;   # show non-ASCII bytes as escapes

my @files = glob("*");
for my $file (@files) {
    say Dumper $file;
}
Now, if you want Unicode matching semantics, you need to decode your filenames from the filesystem's byte representation into Unicode character strings:
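use feature 'say';          # say() is not enabled by default
use Encode 'decode';
use Data::Dumper;
$Data::Dumper::Useqq = 1;   # show non-ASCII characters as escapes

# Assumes the filesystem stores names as UTF-8
my @files = map { decode 'UTF-8', $_ } glob("*");
for my $file (@files) {
    say Dumper $file;
}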
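The filesystem operations take raw byte strings, but your regular expression needs a decoded Unicode string. Keep both forms around and use the correct one in each situation: the bytes for open, rename and friends, the decoded string for matching.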
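To make the split concrete, here is a minimal sketch that keeps both forms side by side. The filename is made up for illustration and hard-coded as the bytes glob() would presumably return on a UTF-8 filesystem, so the snippet runs without touching any real files. The variable names are mine, not from your code.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical filename as raw bytes: E2 80 AF is the UTF-8 encoding
# of U+202F NARROW NO-BREAK SPACE.
my $raw_name = "10\xE2\x80\xAFPM.txt";

# Decoded copy for matching; the raw bytes stay untouched for file operations.
my $name = decode('UTF-8', $raw_name);

# The byte string contains three separate bytes, not the character U+202F,
# so only the decoded string can match it.
my $raw_has_nnbsp     = $raw_name =~ /\x{202F}/ ? 1 : 0;
my $decoded_has_nnbsp = $name     =~ /\x{202F}/ ? 1 : 0;

# Build the cleaned name as characters, then encode it back to bytes
# before handing it to the filesystem.
(my $clean = $name) =~ s/\x{202F}/ /g;
my $raw_clean = encode('UTF-8', $clean);

printf "raw matches: %d, decoded matches: %d\n", $raw_has_nnbsp, $decoded_has_nnbsp;
printf "lengths: %d bytes vs %d characters\n", length $raw_name, length $name;

# With real files you would then pass the byte strings to rename:
# rename $raw_name, $raw_clean or die "rename failed: $!";
```

The rename is left commented out on purpose. If your filesystem is not UTF-8, swap in the encoding it actually uses in both the decode and encode calls.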