I had a couple of questions:
How do I single out one character by its Unicode code point for some operation (say replacement or removal)? In the regex, do I use \x or \X, and what is the difference between the two?
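To make the first question concrete, here is a minimal sketch of what I am trying to do. The sample strings and variable names are made up for illustration. It assumes that \x{...} is the escape that names a single code point, and that \X matches a whole grapheme cluster (a base character plus any combining marks), which is how I read perlre:

```perl
use strict;
use warnings;

# Hypothetical sample: "cafe" with U+00E9, a space, then U+263A (a smiley).
my $text = "caf\x{E9} \x{263A}";

# \x{...} names one specific code point, so this removes only U+263A.
(my $stripped = $text) =~ s/\x{263A}//g;

# \X does not take a code point; it matches one grapheme cluster.
# "e" followed by U+0301 (combining acute) should count as one cluster.
my @clusters = "e\x{301}x" =~ /\X/g;

printf "%d chars before, %d after\n", length($text), length($stripped);
printf "%d grapheme clusters\n", scalar @clusters;
```

If that reading is right, \x{...} is what I want for targeting a code point, and \X only matters when combining characters are involved.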
I also have a question about the "eq" operator. Say $var1 is a byte sequence with the internal UTF-8 flag on, and $var2 is the exact same byte sequence with the UTF-8 flag off. What is the return value of "$var1 eq $var2"? I tested this by reading in a string, calling Encode::_utf8_on($string) on a copy, and then comparing the two. The return value is true, but could someone explain the behaviour? I would have expected that one variable having the flag on and the other off would give a FALSE result regardless of the byte sequence they hold.
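Roughly, the test looks like the sketch below. The literal "hello" is a placeholder rather than my real input, and the sketch assumes the data is ASCII-only; I have not shown what happens with bytes above 0x7F here.

```perl
use strict;
use warnings;
use Encode ();

# Placeholder input: an ASCII-only string, internal UTF-8 flag off.
my $var2 = "hello";

# Same bytes, but with the internal UTF-8 flag forced on.
my $var1 = $var2;
Encode::_utf8_on($var1);

# utf8::is_utf8 reports the state of the internal flag.
my $flag1 = utf8::is_utf8($var1) ? 1 : 0;
my $flag2 = utf8::is_utf8($var2) ? 1 : 0;

# The comparison in question.
my $equal = ($var1 eq $var2) ? 1 : 0;

print "flag on var1: $flag1, flag on var2: $flag2, eq: $equal\n";
```

With this input the flags differ (1 and 0) and eq still reports the strings as equal, which is the behaviour I would like explained.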
Thanks