in reply to Re: variables in regex character classes
in thread variables in regex character classes

Thanks for the suggestions. I'll try them.

I had a hunch that I was being too clever with qr//.

I prefer using UTF-8 as the encoding. If I use Unicode, do I still need to set a locale? I didn't set one, but I saved all the relevant files as UTF-8 and said

    use encoding 'utf8';
    ...
    open my $FILE_HANDLE, "<:utf8", $FILE_NAME;

... And it seems that \b works as intended, even if the rest of the pattern is not so good :)

A character range is problematic for Belarusian, because the order of the letters in Unicode follows the Russian alphabet, and the Belarusian alphabet is slightly different. So I think it is safest simply to write out all the possible letters.

Any thoughts?...

Re^3: variables in regex character classes
by Ieronim (Friar) on Jul 23, 2006 at 17:32 UTC
    If you use Unicode, you don't need to set a locale, and using Unicode is much better than setting one.
    But it's better to specify
    use utf8;
    instead of use encoding 'utf8';.
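
    A minimal sketch of that setup, in case it helps. Note that use utf8; only tells perl that the source file itself is UTF-8, so the I/O layers are set explicitly here. The sample text and the in-memory handle (standing in for your real file) are my own assumptions, not something from your script.

```perl
use strict;
use warnings;
use utf8;                       # the source code itself is saved as UTF-8
use Encode qw(encode);

# Hypothetical stand-in for a UTF-8 file on disk: raw bytes read
# through an in-memory filehandle.
my $bytes = encode('UTF-8', "Жыве Беларусь\n");

# Decode on input, so the program works with characters, not bytes.
open my $fh, '<:encoding(UTF-8)', \$bytes or die "Cannot open: $!";
my $line = <$fh>;
close $fh;
chomp $line;

# Encode on output, so wide characters are written as UTF-8.
binmode STDOUT, ':encoding(UTF-8)';
print length($line), " characters: $line\n";
```

    With a real file you would pass the file name instead of \$bytes; the layer argument stays the same.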

    On Unicode data, \b and the /i switch will both work as expected. And if you are not sure about the character ranges, it is of course better to type out the whole alphabet.
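
    Here is a rough sketch of the "type the alphabet" approach, with the letters kept in a variable and interpolated into the character class. The names $letters and $word_re and the sample sentence are made up for illustration. It also shows why the а-я range falls short: і and ў lie outside it in Unicode.

```perl
use strict;
use warnings;
use utf8;

# Belarusian lowercase alphabet, typed out explicitly (32 letters).
# The string holds no regex metacharacters, so it can be interpolated
# into a character class as-is.
my $letters = 'абвгдеёжзійклмнопрстуўфхцчшыьэюя';
my $word_re = qr/\b[$letters]+\b/i;

my $text = 'Ўчора я чытаў кнігу';

# Explicit alphabet: every word matches, including the capital Ў via /i.
my @words = $text =~ /$word_re/g;

# The а-я range misses і and ў, so only the word "я" matches in full.
my @range_words = $text =~ /\b[а-я]+\b/ig;

binmode STDOUT, ':encoding(UTF-8)';
print "explicit alphabet: ", scalar(@words), " words\n";        # 4
print "range а-я:         ", scalar(@range_words), " words\n";  # 1
```

    If the apostrophe should count as part of a word, it would have to be added to $letters as well; I left it out here.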

    Good luck!

         s;;Just-me-not-h-Ni-m-P-Ni-lm-I-ar-O-Ni;;tr?IerONim-?HAcker ?d;print