shamat has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks! I'm having trouble building a simple script that matches "special characters" in a case-insensitive regex. Here is the code (based on this: http://www.perlmonks.org/?node_id=929400):
    use feature 'unicode_strings';
    my $string = 'ŠIN';
    print $string =~ /šin/i ? 'matched' : 'no match';
I get 'no match'. What can I do? I'm using Strawberry Perl 5.14.2 on a Windows 7 machine. Thanks for your help!

Replies are listed 'Best First'.
Re: perl 5.14 regex: case insensitive match on international characters
by Eliya (Vicar) on Mar 15, 2012 at 15:52 UTC

    If your source code (the string literals) is UTF-8 encoded, you need to add use utf8; with that, it works fine for me (with the same version of Perl).
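    A minimal sketch of the suggestion above, assuming the script file itself is saved as UTF-8. The helper name matches_sin is hypothetical and only there to make the check reusable; the binmode line is an addition that is not in the original post, included so that printing non-ASCII text goes out as UTF-8.

```perl
use strict;
use warnings;
use utf8;                         # the string literals below are UTF-8 encoded
use feature 'unicode_strings';

# Addition, not in the original post: encode anything printed as UTF-8.
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: case-insensitive match against the pattern from the question.
sub matches_sin {
    my ($string) = @_;
    return $string =~ /šin/i ? 'matched' : 'no match';
}

print matches_sin('ŠIN'), "\n";
```

    Without use utf8, Perl treats each byte of the UTF-8 encoded literal as a separate character, so the upper- and lowercase forms never fold to the same thing.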

      Thanks for the quick reply! If I add use utf8; the regex matches, but I get:

      Malformed UTF-8 character (unexpected continuation byte 0x9a, with no preceding start byte) at utf temp8.pl line 4.

      Any suggestion on how to deal with this?

        As well as adding use utf8, make sure your script is actually saved as UTF-8. (Your editor will probably offer a choice of encodings when saving the file.)

        perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
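
One way to check whether a script really is saved as UTF-8 is to read it as raw bytes and try to decode them strictly. The byte 0x9a in the warning is how š is encoded in Windows code pages such as CP1250 and CP1252, which suggests the file was saved in one of those. The sketch below uses the core Encode module; the helper name is_valid_utf8 is hypothetical, and the file to check is assumed to be passed on the command line.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns 1 if the byte string is well-formed UTF-8, else 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode() die on malformed input;
    # LEAVE_SRC keeps decode() from modifying $bytes.
    my $ok = eval {
        decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    return $ok ? 1 : 0;
}

# If a file name is given, read it as raw bytes and report the result.
if (@ARGV) {
    my $path = shift @ARGV;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $bytes = do { local $/; <$fh> };
    close $fh;
    print is_valid_utf8($bytes) ? "valid UTF-8\n" : "not valid UTF-8\n";
}
```

If the check fails, re-saving the file as UTF-8 in the editor should make the "Malformed UTF-8 character" warning go away.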