Re^6: Unescaped left brace in regex is passed through in regex

Due to the way the single-quote string constructor handles backslashes (escapes), the \\ will in this case compile to a single literal backslash. See Quote and Quote-like Operators and the discussion of q/STRING/ in Quote-Like Operators.

Thanks for your comment, AM. Your link is a good read and worth reposting. I thought that the collapsing of backslashes was done by the OS in resolving paths. I was unaware that perl did it.

do you dispute that there is a left-curly (and a right-curly) in the \x{A3f4} string? What else would you call it/them?

I do not dispute that, so this string itself never represents a left curly brace, rather it has a left curly brace in it.

I would call it (or in this case \x{A3f4}) "the string compiled from '\\x{A3f4}'"

Ok. From the above source we have:

\x{263A}     [1,8]  hex char          (example shown: SMILEY)
\x{ 263A }          Same, but shows optional blanks inside and
                    adjoining the braces
\x1b         [2,8]  restricted range hex char (example: ESC)

So, I think "aha, it's a hex representation", but then I can't get there with the REPL:

  DB<1> $str2='\\x{263}'

  DB<2> p $str2
\x{263}
  DB<3> p hex $str2
0
  DB<4> print hex $str2
0

I would expect to see a smiley face rather than zero. This is a head-scratcher:
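A minimal sketch of why the zero shows up, assuming I read perlop and the `hex` docs correctly: `\x{...}` is an escape that only double-quotish strings interpret, whereas `hex` wants bare hex digits (optionally prefixed with `0x` or `x`). The variable names are made up for illustration. Note also that the smiley in the table is `263A`; `263` is a different code point.

```perl
use strict;
use warnings;

# Needed only so the wide character prints cleanly at the end.
binmode STDOUT, ':encoding(UTF-8)';

# Double quotes: the escape is interpreted, giving ONE character.
my $smiley = "\x{263A}";

# Single quotes: no escape processing here, giving eight literal characters.
my $literal = '\x{263A}';

printf "double-quoted: length %d, code point U+%04X\n", length($smiley), ord($smiley);
printf "single-quoted: length %d, text %s\n", length($literal), $literal;

# hex() converts bare hex digits to a number; chr() turns that number into a character.
my $code_point = hex '263A';
my $same_char  = chr $code_point;
print "chr(hex '263A') equals the double-quoted string\n" if $same_char eq $smiley;

print "$smiley\n";
```

So the route from digits to a character is `chr hex '263A'`, not `hex` applied to the whole escape sequence.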

  DB<6> $str3='\\\\\\\x{aF}'

  DB<7> p $str3
\\\\x{aF}
  DB<8> p hex $str3
0
  DB<9> print hex $str3
0

$str3 goes from 7 to 4 backslashes when compiled(?). But I get zero for a hex value no matter what I try:
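The 7-to-4 count can be reproduced with a small emulation of the single-quote rule as perlop describes it: `\\` becomes one backslash, `\'` becomes a quote, and every other backslash is kept. This is only a sketch; `collapse_single_quoted` is a hypothetical helper written for this post, not anything Perl provides.

```perl
use strict;
use warnings;

# Emulate how a single-quoted literal treats backslashes:
#   \\ -> one backslash, \' -> a quote, any other backslash is left alone.
sub collapse_single_quoted {
    my ($source) = @_;
    (my $compiled = $source) =~ s/\\([\\'])/$1/g;
    return $compiled;
}

for my $n (1 .. 8) {
    my $source   = ('\\' x $n) . 'x{aF}';      # $n backslashes, then x{aF}
    my $compiled = collapse_single_quoted($source);
    my $count    = () = $compiled =~ /\\/g;    # count the backslashes that remain
    printf "%d in source -> %d compiled: %s\n", $n, $count, $compiled;
}
```

Backslashes pair off from the left, and an odd one left over before the `x` survives as itself, which is why both 7 and 8 come out as 4.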

  DB<10> $str4='\x{aF}'

  DB<11> p $str4
\x{aF}
  DB<12> print hex $str4
0
  DB<13> print hex 'aF'
175

How do I tease 175 out of $str4?

The \x part has nothing to do with the /x or /xx regex modifiers.

That part is clearer now. I have that backslash/forward-slash dysphoria going on now where I can hardly see the difference and it looks like a toothpick war. I get the occasional billiken that I read or write the wrong way.


Re^7: Unescaped left brace in regex is passed through in regex by LanX (Saint) on Jun 08, 2022 at 09:28 UTC
> So, I think "aha, it's a hex representation", but then I can't get there with the REPL:

You are still confusing interpolation (double quotes) with literal strings (single quotes):

  DB<28> p $str1 = "\x{41}"    # interpolation
A
  DB<29> p $str2 = '\x{41}'    # literal
\x{41}
  DB<30> p $str2 = '\\x{41}'   # literal, but escaping the escaping \
\x{41}

Now, the double escape in line 30 is playing safe, because there is a difference between `\\'` and `\'`.

BUT this

\x{ 263A }          Same, but shows optional blanks inside and
                    adjoining the braces

doesn't work for me! (oO ???)

  DB<31> p " \x{ 41 } "
^@

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
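To make the `\\'` versus `\'` point concrete, here is a small sketch with made-up variable names. The last part is an assumption on my side: as far as I know, blanks inside `\x{ }` are only accepted from Perl 5.34 on, which would explain the `^@` on an older perl. The string `eval` is there so the script still compiles on older versions; what it prints depends on the perl running it.

```perl
use strict;
use warnings;

my $ends_in_backslash = 'C:\dir\\';   # \\ before the closing quote: one backslash
my $contains_quote    = 'it\'s';      # \' is an escaped quote, not the end of the string
my $lone_backslash    = 'a\b';        # any other backslash is kept literally
my $doubled_backslash = 'a\\b';       # compiles to the same string as the line above

print "$ends_in_backslash\n";
print "$contains_quote\n";
print "a\\b and a\\\\b compile to the same string\n"
    if $lone_backslash eq $doubled_backslash;

# Assumption: blanks inside \x{ } need perl 5.34 or later.
my $with_blanks = eval q{ no warnings; "\x{ 41 }" };
printf "\\x{ 41 } gives code point %d on perl %vd\n", ord($with_blanks // ''), $^V;
```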
Re^7: Unescaped left brace in regex is passed through in regex by AnomalousMonk (Archbishop) on Jun 08, 2022 at 02:38 UTC
Some random responses...

> > do you dispute that there is a left-curly (and a right-curly) in the \x{A3f4} string? What else would you call it/them?
>
> I do not dispute that, so this string itself never represents a left curly brace, rather it has a left curly brace in it.

Oh, so you were thinking that `"\x{A3f4}"` when compiled double-quotishly into a string and then printed should print a left-curly! I follow you a little better now. My terminal is not configured for Unicode (as I assume this character to be) right now, so I cannot confirm what it will print, and I'm reluctant to launch myself into Unicode-land on-line to find out. However, I agree that the escape sequence `\x{A3f4}` when compiled double-quotishly (e.g., `"ab\x{A3f4}cd"`) will compile to some character. But the single-quote-compiled string `'\x{A3f4}'` will always be literally `\x{A3f4}` and nothing else.

It's important to understand how backslashes (escapes) are compiled in single- and double-quoted strings. Consider the following:

    Win8 Strawberry 5.8.9.5 (32) Tue 06/07/2022 12:17:53
    C:\@Work\Perl\monks
    >perl
    use strict;
    use warnings;

    print '-\-\\-\\\-\\\\-\\\\\-\\\\\\-\\\\\\\-\\\\\\\\-';
    ^Z
    -\-\-\\-\\-\\\-\\\-\\\\-\\\\-

Why do `'\\\\\\\'` and `'\\\\\\\\'` (7 and 8 backslashes, respectively) both compile to and print as four backslashes? How would this be different if compiled as a double-quoted string?

> `DB<1> $str2='\\x{263}'`

This compiles to (and prints) the literal string `\x{263}` or literal-backslash, literal-lowercase-x, literal-left-curly, literal-2, literal-6, literal-3, literal-right-curly.
The hex built-in cannot interpret a string in this format (and so returns zero (update: and a warning)), but can in "proper" format:

    Win8 Strawberry 5.8.9.5 (32) Tue 06/07/2022 22:09:02
    C:\@Work\Perl\monks
    >perl
    use strict;
    use warnings;

    my $h1 = 'A3f4';
    my $h2 = 'xA3f4';

    print hex 'A3f4', "\n";
    print hex $h1, "\n";

    print hex 'xA3f4', "\n";
    print hex $h2, "\n";

    print hex '\xA3f4', "\n";
    print hex '\x{A3f4}', "\n";
    ^Z
    41972
    41972
    41972
    41972
    Illegal hexadecimal digit '\' ignored at - line 13.
    0
    Illegal hexadecimal digit '\' ignored at - line 14.
    0

> `DB<10> $str4='\x{aF}'`
> ...
> How do I tease 175 out of $str4?

We know that `\x{aF}` will not be interpreted by hex as a hex number. One way to extract the hex substring:

    Win8 Strawberry 5.8.9.5 (32) Tue 06/07/2022 22:25:09
    C:\@Work\Perl\monks
    >perl
    use strict;
    use warnings;

    my $str = '\x{aF}';

    $str =~ m{ \A \\ x \{ ([[:xdigit:]]+) \} \z }xms;
    my $hex_digits = $1;
    print ">$hex_digits< \n";

    my $hex_number_in_decimal = hex $hex_digits;
    print "$hex_number_in_decimal \n";
    ^Z
    >aF<
    175

Update: Another approach:

    Win8 Strawberry 5.8.9.5 (32) Sat 06/11/2022 15:18:47
    C:\@Work\Perl\monks
    >perl
    use strict;
    use warnings;

    my $str = '\x{aF}';

    my ($hex_digits) = $str =~ m{ [[:xdigit:]]+ }xmsg;

    my $hex_number_in_decimal = hex $hex_digits;
    print "'$hex_digits' == $hex_number_in_decimal decimal \n";
    ^Z
    'aF' == 175 decimal

This approach can be useful when a string or record has been "validated" as to its structure and you know that certain substrings or fields are unambiguously present: these substrings/fields can then be easily and quickly extracted. Note the `/g` modifier on the `m//` match.

Give a man a fish: `<%-{-{-{-<`
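Building on the first extraction regex above, the same pattern can probably be turned around to decode every literal `\x{...}` inside a longer single-quoted string. This is a sketch only: `decode_hex_escapes` and the sample text are invented for illustration, and it assumes the escapes are well formed.

```perl
use strict;
use warnings;

# decode_hex_escapes() is a hypothetical helper, not a core function.
# It replaces each literal \x{HEX} with the character that code point names
# and leaves the rest of the text untouched.
sub decode_hex_escapes {
    my ($text) = @_;
    $text =~ s{ \\ x \{ ([[:xdigit:]]+) \} }{ chr hex $1 }xmsge;
    return $text;
}

my $raw     = 'caf\x{e9} costs \x{a3}3';   # single-quoted: the escapes are still literal text
my $decoded = decode_hex_escapes($raw);

binmode STDOUT, ':encoding(UTF-8)';        # so non-ASCII characters print cleanly
print "$raw\n";
print "$decoded\n";
```

The `/e` modifier evaluates the replacement as code, so `chr hex $1` does for each match what `hex $hex_digits` does for the single value in the examples above.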