in reply to Re^6: Unescaped left brace in regex is passed through in regex
in thread Unescaped left brace in regex is passed through in regex
Some random responses...
do you dispute that there is a left-curly (and a right-curly) in the \x{A3f4} string? What else would you call it/them?
I do not dispute that, so this string itself never represents a left curly brace, rather it has a left curly brace in it.
Oh, so you were thinking that "\x{A3f4}" when compiled double-quotishly into a string and then printed should print a left-curly! I follow you a little better now. My terminal is not configured for Unicode (as I assume this character to be) right now, so I cannot confirm what it will print, and I'm reluctant to launch myself into Unicode-land on-line to find out. However, I agree that the escape sequence \x{A3f4} when compiled double-quotishly (e.g., "ab\x{A3f4}cd") will compile to some character. But the single-quote-compiled string '\x{A3f4}' will always be literally \x{A3f4} and nothing else.
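A minimal sketch of that difference, which avoids printing the character itself so no Unicode-capable terminal is needed. The variable names $dq and $sq are just illustrative, not from the thread:

```perl
use strict;
use warnings;

my $dq = "\x{A3f4}";   # double-quoted: the escape compiles to a single character
my $sq = '\x{A3f4}';   # single-quoted: eight literal characters, backslash first

# Report lengths and code points rather than the characters themselves.
printf "double-quoted: length %d, code point U+%04X\n", length($dq), ord($dq);
printf "single-quoted: length %d, first character '%s'\n", length($sq), substr($sq, 0, 1);
```

On a Perl with Unicode string support this should report a length of 1 and code point U+A3F4 for the double-quoted string, and a length of 8 starting with a backslash for the single-quoted one.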
It's important to understand how backslashes (escapes) are compiled in single- and double-quoted strings. Consider the following:
Win8 Strawberry 5.8.9.5 (32) Tue 06/07/2022 12:17:53
C:\@Work\Perl\monks
>perl
use strict;
use warnings;

print '-\-\\-\\\-\\\\-\\\\\-\\\\\\-\\\\\\\-\\\\\\\\-';
^Z
-\-\-\\-\\-\\\-\\\-\\\\-\\\\-

Why do '\\\\\\\' and '\\\\\\\\' (7 and 8 backslashes, respectively) both compile to and print as four backslashes? How would this be different if compiled as a double-quoted string?
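One way to explore the double-quoted half of that question is to count the backslashes that survive compilation in each quoting style. This is only a sketch: count_backslashes is a hypothetical helper, and each test string ends in a hyphen, as in the example above, so the last backslash never escapes the closing quote.

```perl
use strict;
use warnings;

# Hypothetical helper: count the backslashes that survive in a compiled string.
sub count_backslashes {
    my ($s) = @_;
    return ($s =~ tr/\\//);   # tr/// with an empty replacement only counts
}

# One to four backslashes typed before a hyphen, in each quoting style.
my @single = ('\-', '\\-', '\\\-', '\\\\-');
my @double = ("\-", "\\-", "\\\-", "\\\\-");

for my $i (0 .. $#single) {
    printf "%d typed: single-quoted keeps %d, double-quoted keeps %d\n",
        $i + 1,
        count_backslashes($single[$i]),
        count_backslashes($double[$i]);
}
```

In a single-quoted string only \\ and \' are escapes, so a lone backslash before any other character is kept. In a double-quoted string a backslash before a character like the hyphen is dropped, which is why the two counts differ for odd numbers of backslashes.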
DB<1> $str2='\\x{263}'
This compiles to (and prints) the literal string \x{263} or literal-backslash, literal-lowercase-x, literal-left-curly, literal-2, literal-6, literal-3, literal-right-curly. The hex built-in cannot interpret a string in this format (and so returns zero (update: and a warning)), but can in "proper" format:
Win8 Strawberry 5.8.9.5 (32) Tue 06/07/2022 22:09:02
C:\@Work\Perl\monks
>perl
use strict;
use warnings;

my $h1 = 'A3f4';
my $h2 = 'xA3f4';

print hex 'A3f4', "\n";
print hex $h1, "\n";

print hex 'xA3f4', "\n";
print hex $h2, "\n";

print hex '\xA3f4', "\n";
print hex '\x{A3f4}', "\n";
^Z
41972
41972
41972
41972
Illegal hexadecimal digit '\' ignored at - line 13.
0
Illegal hexadecimal digit '\' ignored at - line 14.
0
DB<10> $str4='\x{aF}'
...
How do I tease 175 out of $str4?
We know that \x{aF} will not be interpreted by hex as a hex number. One way to extract the hex substring:
Win8 Strawberry 5.8.9.5 (32) Tue 06/07/2022 22:25:09
C:\@Work\Perl\monks
>perl
use strict;
use warnings;

my $str = '\x{aF}';

$str =~ m{ \A \\ x \{ ([[:xdigit:]]+) \} \z }xms;
my $hex_digits = $1;
print ">$hex_digits< \n";

my $hex_number_in_decimal = hex $hex_digits;
print "$hex_number_in_decimal \n";
^Z
>aF<
175
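The extraction above uses $1 without checking that the match succeeded. A slightly more defensive sketch wraps the same regex in a sub that returns undef when the string is not of the literal \x{...} form. The name hex_from_escape is hypothetical, not something from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: return the numeric value of a literal '\x{...}' string,
# or undef when the string does not have exactly that form.
sub hex_from_escape {
    my ($str) = @_;
    return unless defined $str;
    if ($str =~ m{ \A \\ x \{ ([[:xdigit:]]+) \} \z }xms) {
        return hex $1;   # $1 is only trusted after a successful match
    }
    return;
}

for my $candidate ('\x{aF}', '\x{263}', 'x{aF}') {
    my $value = hex_from_escape($candidate);
    printf "%-8s => %s\n", $candidate, defined $value ? $value : 'no match';
}
```

The first two candidates should yield 175 and 611; the third lacks the leading backslash and so should report no match.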
Update: Another approach:
Win8 Strawberry 5.8.9.5 (32) Sat 06/11/2022 15:18:47
C:\@Work\Perl\monks
>perl
use strict;
use warnings;

my $str = '\x{aF}';

my ($hex_digits) = $str =~ m{ [[:xdigit:]]+ }xmsg;

my $hex_number_in_decimal = hex $hex_digits;
print "'$hex_digits' == $hex_number_in_decimal decimal \n";
^Z
'aF' == 175 decimal

This approach can be useful when a string or record has been "validated" as to its structure and you know that certain substrings or fields are unambiguously present: these substrings/fields can then be easily and quickly extracted. Note the /g modifier on the m// match.
Give a man a fish: <%-{-{-{-<