Improve readability of Perl code. Naming reference variables.

Hello Monks!

I've been learning Perl for some years now. At the same time, moving from writing awk scripts to writing Perl scripts, I have found Perl to be an amazing resource for getting things done.

Still, I have some minor issues with the language design that I have not yet been able to understand/resolve. This is what I want to discuss here.

Background

It sometimes bugs me that it is so difficult to write Perl code that is readable (easy to follow) when working with references. For example, if I see a variable $var in the middle of some code, it can be a scalar variable, a scalar reference, an array reference, a hash reference, and so on. Hence, I often end up guessing or having to scan source code nearby in order to determine the type of the variable. I find this workflow less than optimal. Would it not be better if the variable could (optionally) be made self-documenting with respect to reference type?
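To make the ambiguity concrete, here is a minimal sketch using the built-in ref function, which is how the type of a value can be inspected at run time today. The helper name kind_of and the sample values are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any proposal): report what a scalar holds.
# ref() returns an empty string for non-references, so fall back to a label.
sub kind_of {
    my ($var) = @_;
    return ref($var) || 'plain scalar';
}

# Every one of these is stored in a variable with the same $ sigil.
my @samples = ( 42, \"text", [ 1, 2, 3 ], { a => 1 }, sub { 1 } );
print kind_of($_), "\n" for @samples;
```

The point is only that the name $var carries none of this information; it has to be recovered by running code or by reading the surrounding source.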

In the book Perl Best Practices, the problem is mentioned in another setting, and the suggested solution is to add the suffix _ref to the variable name. Adapting that idea with type-specific suffixes, one could write

  $var_href = { a => 1 };

to create a hash ref, and

  $var_aref = [ 1, 2, 3 ];

to create an array reference.

However, a problem with this convention is that the suffix is not optional: once it is part of the name, it has to be written everywhere the variable is used. You should not be forced to use the more verbose form of the variable name. I think the programmer should be free to decide whether including the suffix is advantageous at a given place. For example, when declaring the variable as

$var = [ 1, 2, 3 ];

it is rather obvious that it is an array reference, and there is no need to write:

$var_aref = [ 1, 2, 3 ];

The latter is, in my opinion, too verbose. However, if the reference is just declared as

my $var;

it would often be better to include the suffix. If nothing in the next few lines indicates whether $var will be used as an array reference, it would be more readable to declare it as

my $var_aref;

A new idea for reference variable naming syntax

So this led me to an idea: Could the postfix dereferencing syntax be extended for this use case?

The Postfix Dereferencing Syntax (PDS) was introduced as an experimental feature in Perl 5.20. Starting with Perl 5.24, it is part of the language by default.

Currently PDS is used for dereferencing:

my @array = $var->@*;
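For comparison, here is a short sketch placing the traditional circumfix form next to the postfix forms. The variable names $conf, @circumfix, @postfix, @keys and $last_idx are chosen only for this illustration.

```perl
use v5.24;      # postfix dereferencing is on by default from 5.24; also enables say
use warnings;

my $var  = [ 1, 2, 3 ];
my $conf = { a => 1, b => 2 };

my @circumfix = @{ $var };              # traditional circumfix dereference
my @postfix   = $var->@*;               # postfix dereference, same list
my @keys      = sort keys $conf->%*;    # postfix dereference of a hash reference
my $last_idx  = $var->$#*;              # index of the last element

say "@postfix";
say "@keys";
say $last_idx;
```

In each postfix form the sigil after the arrow names the container type being dereferenced, which is the property the idea in this post tries to reuse.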

Notice that the PDS includes a star after the sigil. It is a syntax error to omit the star. But let's say, for the moment, that if the star were omitted, the dereference would simply be ignored instead. So

my $var->@;

would mean the same as

my $var;

and produce no syntax error.

Let's call this new syntax Optional Postfix Reference Declaration Syntax (OPRDS). When using OPRDS, should it be entirely up to the user to ensure that the correct sigil is used? For example, if I write

$var->@ = 12;

when I really meant

$var->@ = [ 12 ];

should it produce a compile-time error? I think it would be very helpful if the compiler could use OPRDS to check for consistency. But would that be difficult to implement? I do not know. If it is, could some alternative be used instead? I don't know much about Perl internals, so this is a point where I need help.
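As one possible alternative that works in current Perl, the same consistency check can be done at run time rather than at compile time. This is only a sketch of that idea, not the proposed syntax; the helper name aref is hypothetical.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: returns its argument unchanged,
# but dies unless the argument is an array reference.
sub aref {
    my ($value) = @_;
    croak "expected an ARRAY reference, got " . ( ref($value) || 'a plain scalar' )
        unless ref($value) eq 'ARRAY';
    return $value;
}

my $ok = aref( [ 12 ] );        # passes the check
print scalar @{$ok}, "\n";      # number of elements

my $bad = eval { aref(12) };    # fails when executed, not when compiled
print "caught: $@" unless defined $bad;
```

The obvious limitation is that the mistake is only caught when the line actually runs, whereas the question above is about catching it before the program starts.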

When I started out with this idea, compile-time type checking was not on my mind at all. But I see now that OPRDS would offer an opportunity for stricter type checking.

But type checking was not the main issue I wanted to discuss. What I would like to discuss is how to deal with reference variable names. Reading and understanding written Perl code can be difficult since the $ sigil can be used for many data types. How could this situation be improved?
