in reply to \x in RE character class (deleted)
I want to know whether a numeric \x escape can be used to represent a character with a codepoint from U+0080 to U+00ff within a regular-expression character class.
Yes, you can. \x80, \x{0080}, \200, \N{U+0080}, etc. all work. You could easily have ascertained that yourself.
$ perl -E'say "\x80" =~ /^\x80\z/ ?"match":"no match"'
match
Elsewhere, you've asked the same question but under use encoding 'UTF-8'. The UTF-8 encoding of U+0080 is C2 80, so...
...well, I can't find a way of matching U+0080 with \x or \0, but \N works.
$ perl -E'
   use encoding "UTF-8";
   say "\xC2\x80" =~ /^\N{U+0080}\z/ ?"match":"no match"
'
match
Of course, placing the literal character in the regex pattern works too. You can insert it into the source code, or interpolate it in.
$ perl -E'
   use encoding "UTF-8";
   $x = "\xC2\x80";
   say $x =~ /^\Q$x\E\z/ ?"match":"no match"
'
match
This is evidently a known limitation, since the documentation shows a workaround for the similar tr/// operator. tr/// needs its own workaround because it doesn't allow interpolation.