syphilis has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I'm running perl 5.12.2 on Windows:
use warnings;

%h = ( '512_x64' => '24',
       'world'   => 'hello');
print $h{world}, "\n";   # line 5
print $h{512_x64}, "\n"; # line 6
The output:
Misplaced _ in number at try.pl line 6.
hello
Use of uninitialized value in print at try.pl line 6.
Given that I don't have to put the string world in quotes at line 5, I'm a bit peeved that I *do* have to put the string 512_x64 in quotes at line 6.

Is that the way it's supposed to be ?

Cheers,
Rob

Replies are listed 'Best First'.
Re: Hash keys not DWIMming
by kcott (Archbishop) on Oct 07, 2010 at 05:19 UTC

    Underscores in numbers are quite valid: see perldata: Scalar value constructors.
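    A minimal sketch of that point (the variable names are made up for illustration): inside a numeric literal, underscores are only visual separators, so a key that begins with digits starts out as a number as far as the parser is concerned.

```perl
use strict;
use warnings;

# Underscores between digits are ignored by the parser.
my $with_underscores = 1_000_000;
my $plain            = 1000000;

print "same number\n" if $with_underscores == $plain;
print $with_underscores + 1, "\n";   # 1000001
```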

    If you put a non-digit character in front of 512_x64, the problem goes away.

    As a side issue relating to your code, there's no my in front of %h. If you add use 5.12.2; (or indeed any version >= 5.11), strictures are turned on automatically. They are not turned on simply by having that version running.

    Here's my test:

    #!perl
    use 5.12.0;
    use warnings;
    my %h = ( 'a512_x64' => '24', 'world' => 'hello');
    print $h{world}, "\n"; # line 5
    print $h{a512_x64}, "\n"; # line 6

    which outputs:

    C:\_\tmp>num_underscore_problem.pl
    hello
    24

    That's 5.12.0 running under Windows.

Re: Hash keys not DWIMming
by davido (Cardinal) on Oct 07, 2010 at 05:14 UTC

    Perl is treating '512_x64' as an expression, where 'x' is the x operator. See the output from the following:

    perl -MO=Deparse -e "$h{512_x64} = 10; print $h{512_x64}, qq/\n/;"

    Though the syntax checks out 'ok', deparse makes it clear that Perl is seeing "512_x64" as the expression, 512 x 64.
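    A small sketch of what that means at run time (not from the original post; the variable names are illustrative). The `no warnings 'syntax'` line is only there to silence the "Misplaced _ in number" warning that appears in the OP's output.

```perl
use strict;
use warnings;

my %h = ( '512_x64' => '24' );

# What the unquoted subscript evaluates to: the string "512" repeated 64 times.
my $expr_key = 512 x 64;

my ($found_unquoted, $found_quoted);
{
    no warnings 'syntax';    # silence "Misplaced _ in number"
    $found_unquoted = exists $h{512_x64}   ? 1 : 0;   # looks up $h{ 512 x 64 }
    $found_quoted   = exists $h{'512_x64'} ? 1 : 0;
}

print length($expr_key), "\n";   # 192, i.e. 3 characters repeated 64 times
print "unquoted: $found_unquoted, quoted: $found_quoted\n";   # unquoted: 0, quoted: 1
```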

    As a refresher, 512 x 64 is the same as 512512512512512512512512...... where the sequence '512' is repeated 64 times. This is an unusual case where 512_ is treated as a number. Consequently the '_' is silently dropped, and then the x operator stringifies it, so that the number 512 is stringified and repeated 64 times.

    In your definition (ie, %h = ( 512_x64 => 'value' );), the => operator has the effect of wrapping "512_x64" in single quotes, so on that line you are properly populating the hash. But later on when you recall the hash value, the interpretation of $h{ ...... } is not protected by single quotes, explicitly or implicitly.


    Dave

      the => operator has the effect of wrapping "512_x64" in single quotes,
      No it doesn't. The fat comma doesn't follow different rules of autoquoting than hash keys do. 512_x64 isn't a valid identifier name, so it's not autoquoted. Not as a hash key, and not with a fat comma.

      Note also that autoquoting doesn't mean putting something in single quotes. Or double quotes. It just means the bare word is to be taken as a string, instead of something else.
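      To illustrate (a sketch with made-up variable names; the `no warnings 'syntax'` only silences "Misplaced _ in number"): on the left of a fat comma, an unquoted 512_x64 should likewise be evaluated as an expression rather than autoquoted.

```perl
use strict;
use warnings;

my %h;
{
    no warnings 'syntax';
    %h = ( 512_x64 => 'value' );    # no quotes on the left of =>
}

my ($key) = keys %h;
print $key eq '512_x64'  ? "autoquoted\n"      : "not autoquoted\n";
print $key eq (512 x 64) ? "key is 512 x 64\n" : "key is something else\n";
```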

        I'm not arguing against your statement: "The fat comma doesn't follow different rules of autoquoting than hash keys do.".

        However, the target of autoquoting may be handled differently prior to autoquoting.

        Here's an example with constants:

        $ perl -wE 'use constant XXX => q{a}; my %x = (XXX() => 123); say $x{XXX()};'
        123
        $ perl -wE 'use constant XXX => q{a}; my %x = (XXX() => 123); say $x{+XXX};'
        123
        $ perl -wE 'use constant XXX => q{a}; my %x = (+XXX => 123); say $x{XXX()};'
        Use of uninitialized value in say at -e line 1.
        $ perl -wE 'use constant XXX => q{a}; my %x = (+XXX => 123); say $x{+XXX};'
        Use of uninitialized value in say at -e line 1.

        Note how both XXX() and +XXX can be used as the hash key but only XXX() works on the LHS of the fat comma.
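        The same observation as a standalone script, in case it is easier to poke at than the one-liners (a sketch; the hash names are made up for illustration). It prints the keys that actually end up in each hash.

```perl
use strict;
use warnings;
use constant XXX => 'a';

my %called = ( XXX() => 123 );   # parens force a call: the key is 'a'
my %quoted = ( +XXX  => 123 );   # XXX is still autoquoted: the key is 'XXX'

my ($called_key) = keys %called;
my ($quoted_key) = keys %quoted;

print "$called_key $quoted_key\n";   # a XXX
print $called{+XXX}, "\n";           # 123: inside a subscript, +XXX calls the constant
```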

        This example is run under 5.12.0 but I recall the same behaviour under 5.8 (definitely) and 5.6 (probably).

        I have a vague recollection that there's another instance of this type of behaviour but I can't think what it is right now.

        -- Ken

      Consequently the '_' is silently dropped
      Really? Then what does the
      Misplaced _ in number at try.pl line 6.
      in the OP's output mean?
Re: Hash keys not DWIMming
by ig (Vicar) on Oct 07, 2010 at 08:14 UTC

    I guess from your post that when you write $h{512_x64} what you mean is $h{'512_x64'}. But, as you have noticed, perl doesn't do this for you.

    In perldata Scalar value constructors, about the fifth paragraph (depending how you count), just before 'Version Strings', it says:

    In fact, an identifier within such curlies is forced to be a string, as is any simple identifier within a hash subscript. Neither need quoting. Our earlier example, $days{'Feb'} can be written as $days{Feb} and the quotes will be assumed automatically. But anything more complicated in the subscript will be interpreted as an expression. This means for example that "$version{2.0}++" is equivalent to "$version{2}++", not to "$version{'2.0'}++".
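    The $version{2.0} case from that paragraph can be checked with a short sketch (the hash contents are made up for illustration):

```perl
use strict;
use warnings;

my %version = ( '2.0' => 'string key', '2' => 'numeric key' );

# 2.0 is a numeric expression; it stringifies to "2" before the lookup.
my $unquoted = $version{2.0};
my $quoted   = $version{'2.0'};

print "$unquoted\n";   # numeric key
print "$quoted\n";     # string key
```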

    And, in perldata Variable names it says:

    Values are usually referred to by name, or through a named reference. The first character of the name tells you to what sort of data structure it refers. The rest of the name tells you the particular value to which it refers. Usually this name is a single identifier, that is, a string beginning with a letter or underscore, and containing letters, underscores, and digits.

    Note the definition of identifier and remember that anything more complicated than an identifier within the subscript of a hash does not get implicit quotes. Knowing this, maybe you will not write $h{512_x64} when you mean $h{'512_x64'}, and perl can, once again, do what you mean.

    Note also that there are many times when it is very convenient that what is inside the curlies of a hash subscript is evaluated as an expression. Hash subscripts would be much less useful if everything inside the curlies was implicitly quoted. Imagine if $key in the following were quoted. You wouldn't see the values of your environment variables or, if you happened to have an environment variable named '$key', you would see its value over and over.

    foreach my $key (sort keys %ENV) {
        print "$key: $ENV{$key}\n";
    }
Re: Hash keys not DWIMming
by sflitman (Hermit) on Oct 07, 2010 at 05:21 UTC
    I agree with Dave that line 6 has a problem, the solution being to put the 512_x64 in single quotes like you did for the hash definition. This code works fine:
    #!/usr/bin/perl
    use warnings;
    use strict;
    my %h = ( '512_x64' => '24', 'world' => 'hello');
    print $h{world}, "\n"; # line 5
    print $h{'512_x64'}, "\n"; # line 6
    So you can use a key that starts with a digit!
    HTH,
    SSF
      the solution being to put the 512_x64 in single quotes

      Yes, I know that works. It's just that I thought steps had been taken to avoid the need for quoting hash keys .... that's obviously not so and we're left with what looks like a dog's breakfast when it comes to deciding if a key needs to be quoted or not.

      After spending a number of minutes looking at this, the best rule of practice I can come up with is "if the name of the key matches /^[0-9_\.]/ then quote it" - otherwise there's too many "ifs and buts" involved in trying to determine whether it will work without quotes.

      Cheers,
      Rob
        It's just that I thought steps had been taken to avoid the need for quoting hash keys

        From "Learning Perl (5th)", p 95:

        ...the keys are always converted to strings. So, if you used the numeric expression 50/20 as the key, it would be turned into the three-character string "2.5"...
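        A quick sketch of the 50/20 example (variable names are illustrative): the subscript is evaluated as an expression first, and the result is then stringified to become the key.

```perl
use strict;
use warnings;

my %h;
$h{50/20} = 'ratio';       # 50/20 is evaluated, then stringified

my ($key) = keys %h;
print "$key\n";            # 2.5
print length($key), "\n";  # 3
```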

        And on p. 254, in a section titled "Unquoted Hash Keys":

        Perl offers many shortcuts that can help programmers. Here's a handy one: you may omit the quote marks on some hash keys.

        Of course, you can't omit the quote marks on just any key, since a hash key may be any arbitrary string. But keys are often simple. If the hash key is made up of nothing but letters, digits, and underscores without starting with a digit, you may be able to omit the quote marks. This kind of simple string without quote marks is called a bareword, since it stands alone without quotes.

        ...But beware: if there's anything inside the curly braces besides a bareword, Perl will interpret it as an expression.

        It's just that I thought steps had been taken to avoid the need for quoting hash keys .... that's obviously not so and we're left with what looks like a dog's breakfast when it comes to deciding if a key needs to be quoted or not.

        The rule seems pretty straightforward to me. If the hash key is a valid identifier, it doesn't need to be quoted. Otherwise it does. Has always worked fine for my usage and always does what I want.
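        As a rough sketch, that rule could be written as a check like the one below. needs_quotes is a hypothetical helper, not anything Perl provides, and the regex is an ASCII-only approximation of "valid identifier" that errs on the side of quoting.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $key should be quoted in $h{...} or before =>.
# Conservative on purpose: a plain number such as 42 also works unquoted,
# but is reported here as needing quotes.
sub needs_quotes {
    my ($key) = @_;
    return $key =~ /\A[A-Za-z_][A-Za-z0-9_]*\z/ ? 0 : 1;
}

for my $key (qw(world a512_x64 512_x64 2.0)) {
    printf "%-10s %s\n", $key, needs_quotes($key) ? 'quote it' : 'bareword is fine';
}
```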

Re: Hash keys not DWIMming
by JavaFan (Canon) on Oct 07, 2010 at 07:51 UTC
    Yes, that's the way it's supposed to be. Perl only autoquotes bare words that are valid identifier names. Identifiers cannot start with a number and then have non numbers in them.
Re: Hash keys not DWIMming
by ruzam (Curate) on Oct 08, 2010 at 00:52 UTC

    You may also find it interesting that what's good for the goose is also good for the gander.

    Just as you can use an unquoted key while using your hash:

    print $h{world}, "\n";

    You can also use an unquoted key while creating your hash:

    my %h = ( '512_x64' => '24', world => 'hello');