According to the perlunicode manpage, use utf8 allows Unicode (UTF-8 encoded) characters not only in string literals but also in identifier names.

So I had the urge to try the following program:
    use utf8;
    my $€ = 1;
    print $€;
And it failed with this error message:
    Malformed UTF-8 character (unexpected end of string) at utf8test.pl line 3.
    Unrecognized character \x82 in column 5 at utf8test.pl line 3.
This version, however, ran correctly:
    use utf8;
    my $á = 1;
    print $á;
So it seems that while certain Unicode characters can be in variable names, others cause an error.

I wrote a little script to test which characters are supported.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8;
    use charnames ':full';

    my ($fh, $rfh);
    my ($vname, $errorcode, $message, $col, $byte);

    open($rfh, '>>:encoding(UTF-8)', 'utf8report.txt') || die 'Error opening file';

    for (0x100..0x9fff) { # You may want to change these numbers if the script runs for too long
        open($fh, '>:encoding(UTF-8)', 'utf8test.pl') || die 'Error opening file';
        print $fh "use utf8;\n";
        $vname = pack "U", $_;
        print $fh "my \$$vname = $_;\nprint \$$vname;";
        close $fh;
        #system "perl", "-c", "utf8test.pl";
        $message = `perl -c utf8test.pl 2>&1`;
        ($byte, $col) = $message =~ /character \\x(..).*column (\d)/;
        $errorcode = ($? >> 8) ? "FAIL at byte ".($col-4)."($byte)" : "PASS";
        print $rfh "$_\t$errorcode, character ".(charnames::viacode($_))."\n";
        print $_-$_%100,"\r";
    }
    print "\n";
    close $rfh;
The results (omitted here) show that indeed, certain characters don't seem to be eligible as variable names.
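One way to look for a pattern in such results is to compare each character's Unicode general category with whether it passed. The following is only a minimal sketch: the helper names is_letter and is_currency_symbol are made up for illustration, and the idea that the category is what decides eligibility is an assumption, not something the tests above establish. It checks the two characters tried earlier, á (U+00E1) and the euro sign (U+20AC).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use charnames ':full';

# Hypothetical helpers, not part of the original test script.

# True if the code point is in Unicode general category L (letter).
sub is_letter {
    my ($codepoint) = @_;
    # pack "U" yields a character string, so \p{L} sees the code point itself.
    return pack("U", $codepoint) =~ /\A\p{L}\z/ ? 1 : 0;
}

# True if the code point is in general category Sc (currency symbol).
sub is_currency_symbol {
    my ($codepoint) = @_;
    return pack("U", $codepoint) =~ /\A\p{Sc}\z/ ? 1 : 0;
}

# The two characters tried above: a-acute (U+00E1) and the euro sign (U+20AC).
for my $codepoint (0xE1, 0x20AC) {
    printf "U+%04X %s: letter=%d currency=%d\n",
        $codepoint,
        charnames::viacode($codepoint),
        is_letter($codepoint),
        is_currency_symbol($codepoint);
}
```

Adding a column like this to utf8report.txt would make it easy to see whether every PASS is a letter and every FAIL is something else, or whether the split follows some other rule.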

The question, then, is why?

Of course, there is not much point in asking this, since many consider using Unicode in variable names a bad idea anyway. Yet a Perl hacker should be able to use the euro sign (for example) as a variable name if he so chooses.

(The test was run on Windows XP with Camelbox Perl 5.10.0.)

In reply to UTF-8 characters in variable names: some characters are not allowed by kikuchiyo
