in reply to Interpolation differences between Strings and Regular Expressions

You've run into what is probably the vaguest area of Perl's parsing DWIMery. It is the only place in the Perl source code that mentions "weigh". How to interpret such things in a regex is determined by weighing several different criteria, so there is no easy explanation of what Perl will choose. For example, I'm disappointed that in a regex Perl chooses to interpret [$] as the start of a character class followed by the contents of the $] variable. I think it got that DWIM aspect just wrong.

But in your case, I think Perl got the DWIM correct. The string case is fairly straightforward and mostly just greedy parsing, so it always pulls in the trailing {...} to make a hash deref unless you do something to tell it not to.

For the regex case, {5} looks more like a quantifier than like a hash key because a hard-coded number as a hash key is a bit unlikely, though this isn't a slam-dunk winner.
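A minimal sketch of the two parses, assuming a scalar $foo and a hash %foo (both names made up for illustration). It also shows the brace forms that, as far as I know, force one reading or the other:

```perl
use strict;
use warnings;

# Hypothetical variables, chosen only to illustrate the two parses.
my $foo = 'ab';
my %foo = (5 => 'hash value');

# In a string, the trailing {5} is taken as a hash subscript.
my $in_string = "$foo{5}";

# Braces around the name end the variable early: $foo, then literal "{5}".
my $scalar_then_text = "${foo}{5}";

# In a regex, {5} is read as a quantifier, so the pattern is ab{5}.
my $quantifier_match = ('abbbbb' =~ /^$foo{5}\z/) ? 1 : 0;

# Braces around the subscript force the hash element inside a regex.
my $hash_match = ('hash value' =~ /^${foo{5}}\z/) ? 1 : 0;

print "$in_string | $scalar_then_text | $quantifier_match | $hash_match\n";
```

Writing ${foo}{5} or ${foo{5}} explicitly takes the guesswork out of both the string and the regex case.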

- tye

Re^2: Interpolation differences between Strings and Regular Expressions (weight)
by grinder (Bishop) on Jun 13, 2007 at 06:28 UTC
    mostly just greedy parsing so it always pulls in the trailing {...} to make a hash deref

    Similarly, if the lexer in the regexp engine doesn't find a closing curly, the opening curly automatically loses its meta aspect...

    print "a{5" =~ /a{5/

    ... prints 1. This could be a source of annoying errors if you're not careful. The explanation I received was that, in terms of costs and benefits, keeping enough context around to be able to report the error would add too much overhead to the parse. Or something like that; I'm a little hazy on the details by now.

    Nor can I recall having been bitten by this behaviour, so the decision as it stands was probably correct.
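
    If you would rather not rely on that fallback, here is a small sketch of explicit ways to ask for a literal left curly. The variable names are invented for the example, and it only covers the simple one-brace case discussed above:

```perl
use strict;
use warnings;

my $text = 'a{5';

# Three explicit ways to ask for a literal left curly.
my $backslashed = ($text =~ /a\{5/)    ? 1 : 0;  # escaped brace
my $bracketed   = ($text =~ /a[{]5/)   ? 1 : 0;  # one-character class
my $quoted      = ($text =~ /\Qa{5\E/) ? 1 : 0;  # \Q...\E quotes metacharacters

# A closed {5} is still a quantifier: five a's match, the literal text does not.
my $five_as      = ('aaaaa' =~ /^a{5}\z/) ? 1 : 0;
my $literal_text = ('a{5}'  =~ /^a{5}\z/) ? 1 : 0;

print "$backslashed $bracketed $quoted $five_as $literal_text\n";
```

    None of these depend on whether the closing curly happens to be present.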

    • another intruder with the mooring in the heart of the Perl

      Actually, it was purely a decision to maintain some backward compatibility with earlier regex implementations that didn't treat curlies as metacharacters. In retrospect, it was probably a mistake, but traditional regex syntax is full of such mistakes: rather than having a consistent backslashing rule, you have to put metacharacters where they don't make sense in order to match them literally. The infamous []x-] character class is an example. It cannot be written in any other order, because the literal hyphen must come either first or last, and the right bracket may only come first, so the hyphen must come last.
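
      A quick sketch to check the []x-] class against a few characters, next to a backslashed spelling that Perl also accepts (the hash names and the test characters are my own picks for illustration):

```perl
use strict;
use warnings;

my (%traditional, %escaped);
for my $char (']', 'x', '-', 'a') {
    # Traditional ordering: "]" first, "-" last, no backslashes.
    $traditional{$char} = ($char =~ /^[]x-]\z/) ? 1 : 0;

    # Backslashed form, where the order no longer matters.
    $escaped{$char} = ($char =~ /^[\-\]x]\z/) ? 1 : 0;
}

printf "%s %d %d\n", $_, $traditional{$_}, $escaped{$_}
    for sort keys %traditional;
```

      Both spellings should accept the right bracket, the x and the hyphen, and reject anything else.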