
That's roughly how I would have done it, except that :decoding(UTF-8) is wrong. The layer is called :encoding(UTF-8) even when you open a file for reading.
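To illustrate that the same layer name works in both directions, here is a minimal sketch that round-trips a string through an in-memory file. The variable names ($buffer, $text and so on) are made up for this illustration, and I write the character as \x{02C0}, its code point, instead of by name.

```perl
use strict;
use warnings;

# "a" followed by U+02C0 MODIFIER LETTER GLOTTAL STOP
my $text = "a\x{02C0}";

# Writing: the :encoding(UTF-8) layer turns characters into bytes.
my $buffer = '';
open my $out, '>:encoding(UTF-8)', \$buffer
    or die "Can't open in-memory file for writing: $!";
print {$out} $text;
close $out or die "Can't close in-memory file: $!";

# One byte for "a", two bytes for U+02C0
my $byte_count = length $buffer;

# Reading: the very same layer name turns bytes back into characters.
open my $in, '<:encoding(UTF-8)', \$buffer
    or die "Can't open in-memory file for reading: $!";
my $decoded = <$in>;
close $in;

my $char_count = length $decoded;
my $found      = $decoded =~ /\x{02C0}/ ? 1 : 0;

print "$byte_count bytes, $char_count characters, match: $found\n";
```

With the layer in place on the reading side the regex sees characters, not raw bytes, which is why the match succeeds.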

Here is a working example of how to search for that character:

use strict;
use warnings;
use charnames qw(:full);

# Encode everything printed to STDOUT as UTF-8
binmode STDOUT, ':encoding(UTF-8)';

my $filename = 'test.txt';

if (@ARGV) {
    # Called with arguments: write the test file
    open my $handle, '>:encoding(UTF-8)', $filename
        or die "Can't write to file '$filename': $!";
    print $handle <<"OUT";
The next line
contains a\N{MODIFIER LETTER GLOTTAL STOP}
Really!
OUT
    close $handle or warn $!;
} else {
    # Called without arguments: read the test file and print matching lines
    open my $handle, '<:encoding(UTF-8)', $filename
        or die "Can't open file '$filename' for reading: $!";
    for (<$handle>) {
        print if /\N{MODIFIER LETTER GLOTTAL STOP}/;
    }
    close $handle;
}

When you call it with command line arguments, it writes a test file. When you call it without any, it reads that test file back and prints the matching line:

$ perl sample.pl gen
$ perl sample.pl 
contains aˀ

I hope this helps. You can gradually morph it into the program you want. If you change something and it breaks, you know which change caused the problem.