That is, \xE3 followed by \x80 and \x87 are individual bytes of a UTF-8 encoded string, so they won't match a real \x{3007} character in the string:

$Ideographic = '(?:\xE3\x80[\x87\xA1-\xA9]|\xE4(?:[\xB8-\xBF][\x80-\xBF])|\xE5(?:[\x80-\xBF][\x80-\xBF])|\xE6(?:[\x80-\xBF][\x80-\xBF])|\xE7(?:[\x80-\xBF][\x80-\xBF])|\xE8(?:[\x80-\xBF][\x80-\xBF])|\xE9(?:[\x80-\xBD][\x80-\xBF]|\xBE[\x80-\xA5]))';
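The byte-versus-character mismatch can be seen in isolation with a small sketch. It assumes only the core Encode module; the variable names are illustrative and nothing here comes from XML::RegExp itself. U+3007 is the code point that the octets E3 80 87 encode in UTF-8.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A pattern written as three separate code points: U+00E3, U+0080, U+0087.
my $pattern = qr/\xE3\x80\x87/;

my $chars = "\x{3007}";               # one character (decoded string)
my $bytes = encode('UTF-8', $chars);  # three octets: E3 80 87

my $matches_chars = $chars =~ $pattern ? 1 : 0;
my $matches_bytes = $bytes =~ $pattern ? 1 : 0;

print "character string matches: $matches_chars\n";  # prints 0
print "byte string matches:      $matches_bytes\n";  # prints 1
```

The pattern only matches the encoded octets, so a string that has already been decoded to characters can never satisfy it.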
However, my test data consists of plain ASCII characters, and that doesn't match either, even though the pattern starts with $BaseChar = '(?:[a-zA-Z]|\xC3[\x80-\x9.... So I don't know everything that's wrong with it, but it doesn't work at all if "use utf8" is in effect (see below).
use strict;
use warnings;
use utf8;  # comment this line out and it matches
use XML::RegExp;

my $name = 'timestamp';  # contains plain ASCII letters only!
my $result = $name =~ /^$XML::RegExp::Name$/o;
print "result is $result\n";
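One possible way to sidestep byte-sequence patterns altogether is to test decoded character strings against code point ranges. The sketch below is an assumption-laden alternative, not a fix for XML::RegExp: it uses the NameStartChar and NameChar productions from XML 1.0 (Fifth Edition), which are broader than the BaseChar/Ideographic productions that XML::RegExp encodes, and `is_xml_name` is a hypothetical helper name.

```perl
use strict;
use warnings;

# NameStartChar ranges from XML 1.0 (Fifth Edition), as code points.
# Single quotes keep the backslashes so the regex engine sees \x{...}.
my $start = join '',
    'A-Za-z_:',
    '\x{C0}-\x{D6}',     '\x{D8}-\x{F6}',     '\x{F8}-\x{2FF}',
    '\x{370}-\x{37D}',   '\x{37F}-\x{1FFF}',  '\x{200C}-\x{200D}',
    '\x{2070}-\x{218F}', '\x{2C00}-\x{2FEF}', '\x{3001}-\x{D7FF}',
    '\x{F900}-\x{FDCF}', '\x{FDF0}-\x{FFFD}', '\x{10000}-\x{EFFFF}';

# Additional characters allowed after the first one (NameChar).
my $extra = '\-.0-9\x{B7}\x{300}-\x{36F}\x{203F}-\x{2040}';

my $name_re = qr/\A[$start][$start$extra]*\z/;

# Hypothetical helper: expects a decoded character string, not UTF-8 octets.
sub is_xml_name {
    my ($s) = @_;
    return $s =~ $name_re ? 1 : 0;
}

print "timestamp: ", is_xml_name('timestamp'), "\n";  # prints 1
print "1st:       ", is_xml_name('1st'), "\n";        # prints 0
```

Because the ranges are code points rather than octets, the result should not depend on whether "use utf8" is in effect, as long as input read from files or sockets is decoded first.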
In reply to XML::RegExp doesn't work (Re: Is some string a legal XML Name?)
by John M. Dlugosz
in thread Is some string a legal XML Name?
by John M. Dlugosz