Constructing Context

Important headnote: The following post contains a lot of poorly written things. It contains several misleading things. Most importantly, it uses "list" in a strict sense, but does not define this strict sense. It should be completely rewritten after a lot of meditation on my part. It makes interesting reading, but should be taken with at least a grain of salt.

This post was originally written as a reply to a SoPW asking why print(reverse("foo")) does not (appear to) work, to give some context (pun intended) to the following. It wasn't posted there, because I decided it was worthy of making a tutorial, or at least a meditation, and then proceeded to forget about it... until now, when I realized that if I waited until I got around to divorcing it from its context correctly it would never get posted.

Your fundamental misunderstanding about context is simple. Context does not work from the inside out, it works from the outside in. (Where, occasionally, I'm going to define the @a in @a = $b to be outside the $b, just because it lets me use a more convenient word.)

I'm also going to use «this is a list» to show what's a list and what isn't, because it's the most easily-typed set of brackets that doesn't have a meaning in perl's syntax. This is because most Americans would be hard-pressed to type it. (They're on AltGr-y and AltGr-x on a German keyboard. (Those two keys are next to each other.)) Oh, they should show as «=« and »=». If not, your browser isn't interpreting the output from PM as latin-1, which it should be. Make sure your browser is set up correctly. (It may need to be set to autodetect, latin-1, iso-8859-1, or simply "western", and should not be set to utf8 or unicode.)

So, it's really fairly simple. If something can take multiple arguments, it usually imposes list context on its arguments. If not, it usually imposes scalar context on them.

In this example, print can take multiple arguments. Thus, in print(reverse($foo)), the reverse is in list context: print(«reverse(«$foo»)»). In list context, reverse reverses the order of its arguments rather than the characters of a string, and a one-element list reversed is the same one-element list, so the string comes out unchanged. Note that the $foo is in list context there too. It doesn't much matter, in this case -- a scalar in list context is the same as that scalar in scalar context (ignoring the possibility of ties and overloads. They get ignored a lot, and I'm going to continue to ignore them.).

So, what can we do about it? Well, the first thing we can do is put a scalar in there: print(scalar(reverse($foo))). This ends up being print(«scalar(reverse(«$foo»))»). Note that the reverse here is in scalar context, not list context, though the arguments to reverse are still in list context. reverse is nice enough to concatenate the arguments (all one of them, in this case), and reverse that string.
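To make the difference concrete, here is a small sketch comparing reverse in the two contexts. The variable names are made up for illustration, and the two-element case is added only because it makes the list/string distinction easier to see.

```perl
use strict;
use warnings;

my $foo = "foo";

# List context (what print imposes): reverse flips the order of the
# list elements, and a one-element list reversed is the same list.
my @in_list = reverse($foo);                # ("foo")

# Scalar context: reverse concatenates its arguments and reverses
# the resulting string.
my $in_scalar = scalar(reverse($foo));      # "oof"

# With two elements the two behaviours are easy to tell apart.
my @flipped = reverse("ab", "cd");          # ("cd", "ab")
my $joined  = scalar(reverse("ab", "cd"));  # "dcba"

print "list:   @in_list\n";
print "scalar: $in_scalar\n";
print "list:   @flipped\n";
print "scalar: $joined\n";
```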

We can show what I've been saying so far by creating something that clearly indicates what context we're in, like so:

sub showcontext {
   return 'LIST' if wantarray;
   return 'SCALAR' if defined wantarray;
   return 'VOID';
}

The (misnamed) wantarray operator tells us what context we're called in. Note, BTW, that the return 'VOID'; is useless here, as by definition, if we're in void context, nobody is paying any attention to what we return. (I actually wrote this function much earlier in this discussion, to check what I was writing against reality.)

So what does print showcontext, scalar(reverse(showcontext, "foo")) tell us? LISToofTSIL. reverse's arguments are in list context, so showcontext returned LIST. "foo", of course, returned "foo". reverse concatenated them, got LISTfoo, reversed that into oofTSIL. print's arguments are "LIST" and "oofTSIL", which get concatenated into LISToofTSIL. Everywhere we check, we're in list context. This just shows that we aren't checking the right place.
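Here is a self-contained sketch of that experiment. It restates showcontext so the snippet stands alone, calls it with explicit parentheses, and collects what print would receive in an array (@print_args, a name invented here) so the pieces can be inspected.

```perl
use strict;
use warnings;

# Report the context this sub was called in.
sub showcontext {
    return 'LIST'   if wantarray;
    return 'SCALAR' if defined wantarray;
    return 'VOID';
}

# The right hand side of an array assignment is in list context,
# just like the arguments to print.
my @print_args = (showcontext(), scalar(reverse(showcontext(), "foo")));

# For comparison: a plain scalar assignment, and a parenthesized one.
my $plain    = showcontext();   # scalar context
my ($parens) = showcontext();   # list context

print join('', @print_args), "\n";   # LISToofTSIL
print "$plain $parens\n";            # SCALAR LIST
```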

Let's try something a bit more interesting, shall we? How about print showcontext, scalar(showcontext, reverse(showcontext, "foo")). Well, that isn't as interesting as we'd hoped. In fact, it prints out exactly what it did before we added that extra showcontext. Modifying our showcontext into

sub showcontext{
  if (wantarray) {
    warn "LIST"; return "LIST"
  };
  if(defined wantarray) {
    warn "SCALAR"; return "SCALAR"
  };
  warn "VOID"; return "VOID"
}

lets us see what's going on a bit better. It turns out that «scalar(showcontext, reverse(«...»))» runs that first showcontext in void context! What's going on here?

Well, it turns out that comma has nothing to do with lists at all. It's being run in scalar context, which means that it doesn't separate elements of a list. Instead, it's the "comma operator", in scalar context. perlop says that this "evaluates its left argument, throws that value away, then evaluates its right argument and returns that value." Since the left hand side is being thrown away, it's in void context. The right hand side is run in scalar context, as we can see with print showcontext, scalar(showcontext, showcontext).
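A sketch of the comma operator at work. Instead of warn, this variant (logcontext and @seen are names made up for this example) records each context in an array, so the order of evaluation can be checked afterwards.

```perl
use strict;
use warnings;

my @seen;   # contexts, in the order the calls happened

sub logcontext {
    my $ctx = wantarray ? 'LIST' : defined(wantarray) ? 'SCALAR' : 'VOID';
    push @seen, $ctx;
    return $ctx;
}

# scalar() is a unary operator, so the comma inside it is the comma
# operator: left side in void context, right side in scalar context.
my $via_scalar = scalar(logcontext(), logcontext());

# The same thing happens on the right of a scalar assignment.
my $via_assign = (logcontext(), logcontext());

# In list context the comma really does separate list elements.
my @both = (logcontext(), logcontext());

print "@seen\n";          # VOID SCALAR VOID SCALAR LIST LIST
print "$via_scalar\n";    # SCALAR
```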

(I've often thought that we should get rid of the scalar comma operator completely, because it causes much more confusion than usefulness. This isn't to say it's unused -- in fact, a version of showcontext I ended up not giving here used it -- it's to say that most uses of it could be more cleanly written as a full-fledged block, instead of a bunch of expressions chained together with an operator that looks too much like a piece of syntax. Unfortunately, too many people expect it to be there to change now, and perl6 changes enough things that my arguments don't really hold water.)

So, what else have we got? There's assignment. Normally, assignment is simple, if you remember the caveat I mentioned in my first paragraph, about how I was defining "outside". In an assignment, if you have an array on the left hand side that you're assigning to, you've got list context on the right hand side. This comes up many places, some of them less obvious than others. @a = «$b» is the obvious one. But this rule also crops up in some unexpected places. You expect list context on my($a, $b)=«foo»;. But you may not expect it on my($a)=«foo»;. The rule is that on the right hand side of an assignment, () only serve to group arguments to a function, and to control precedence. (The same places they're used in normal algebra.) However, on the left hand side, they are also used to create lists. (Remember that this holds only on the LHS of an assignment operator!)
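A quick way to check these cases is to assign from a sub that reports its context. The helper ctx and the variable names below are invented for this sketch.

```perl
use strict;
use warnings;

# Returns the name of the context it was called in
# (void context is not interesting for assignments).
sub ctx { return wantarray ? 'LIST' : 'SCALAR' }

my @array       = ctx();   # array on the left: list context
my ($one, $two) = ctx();   # parenthesized list on the left: list context
my ($only)      = ctx();   # still a list, even with one element
my $plain       = ctx();   # no parentheses: scalar context

print "array:  $array[0]\n";   # LIST
print "pair:   $one\n";        # LIST ($two is left undefined)
print "parens: $only\n";       # LIST
print "plain:  $plain\n";      # SCALAR
```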

This is what lets us use the $count = () = m/foo/ idiom. If we just said $count = m/foo/, we'd only find out whether the pattern matched: $count would get a true or false value (1 or the empty string), which isn't what we were after. This is because the m// operator (it is an operator, it just looks strange) is in scalar context, and that's what regex matches in scalar context do.

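On the other hand, m// in list context returns a list of matches. (Strictly speaking, that's with the /g modifier; without it you get the captured groups, or just (1) if the pattern has none.) @a = «m/foo/» does the regex match in list context, then assigns that list to @a.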
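The following sketch shows what the match operator returns in each context. The sample strings and variable names are assumptions made for the example; note the /g modifier where every match is wanted.

```perl
use strict;
use warnings;

my $text = "foo bar foo baz foo";

# Scalar context: just a true/false answer.
my $matched = $text =~ m/foo/;           # 1

# List context with /g: every match.
my @all = $text =~ m/foo/g;              # ("foo", "foo", "foo")

# List context without /g: the captured groups...
my ($key, $value) = "name=perl" =~ m/(\w+)=(\w+)/;

# ...or simply (1) when the pattern has no captures.
my @no_captures = $text =~ m/foo/;       # (1)

print "matched:  $matched\n";
print "all:      @all\n";
print "captures: $key $value\n";
print "plain:    @no_captures\n";
```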

$count = @a evaluates @a in scalar context. Now, note that we haven't mentioned what that does. We talked about what $a does in list context, but that isn't the same thing. We talked about what , does in scalar context, but it turns out that isn't the same thing either. @a is an array; ($foo, $bar) looks like a list in scalar context, but isn't, because, repeat after me, "there is no such thing as a list in scalar context". That's a pair of parentheses, and a scalar comma operator. It turns out that an array in scalar context returns something quite useful -- its length, or its content joined with the punctuation variable $", depending on whether it is being used as a number or as a string. This is what lets if (!@a) {print "\@a is empty!"} work.
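As the replies further down this thread point out, the two cases are better described as scalar context versus string interpolation. A small sketch, with variable names invented for the example, shows both side by side:

```perl
use strict;
use warnings;

my @a = ('x', 'y', 'z');

# An array in scalar context yields its length.
my $count = @a;                  # 3

# Interpolating the array into a string joins its elements with $"
# (a single space by default).
my $interpolated = "@a";         # "x y z"

# Changing $" changes the separator used by interpolation only.
my $dashed = do { local $" = '-'; "@a" };   # "x-y-z"

# Boolean tests put the array in scalar context too, so an empty
# array is false.
my @empty;
my $is_empty = !@empty;

print "count:        $count\n";
print "interpolated: $interpolated\n";
print "dashed:       $dashed\n";
print "\@empty is empty!\n" if $is_empty;
```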

So, that explains what @a = «m/foo/»; $count = @a; does. From there, we've almost got it. Perl's precedence and context rules make $count = @a = m/foo/; be equivalent to $count = (@a = «m/foo/»);. That is, the match operator is in list context, and assigns to @a. That array is then taken in scalar context, yielding its count, and that is assigned to $count.

So, now, what changes when that @a is replaced by ()? The surprising answer is "nothing". (Well, beyond the obvious that @a no longer gets the list of matches, but if I said that it wouldn't have nearly as much force.) $count still gets the count of matches.

What happens, as far as I can explain it, is this. The assignment to () still forces the match operator to be in list context. () is still a list, even if it happens to have no elements. That list of matches is assigned to (), or at least as much of it as fits. (None of it fits.) The value of an assignment operation is the value from the RHS (right hand side) of the assignment, regardless of how much of it fit in the thing being assigned to. So we now have the array of matches. That array gets assigned to $count. But $count is a scalar, so we've got an array in scalar context, which is its length. (perlop states the rule directly: a list assignment in scalar context returns the number of elements produced by the expression on the right hand side.)
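A short sketch of the counting idiom, using /g so that every match is counted. The sample string and variable names are assumptions for the example; the last case shows that the count comes from the right hand side, not from how much of it fit on the left.

```perl
use strict;
use warnings;

my $text = "foo bar foo baz foo";

# Going through a real array first.
my @matches;
my $count_via_array = @matches = $text =~ m/foo/g;   # 3

# The same thing with an empty list in place of the array.
my $count = () = $text =~ m/foo/g;                   # 3

# Only one match fits into ($first), but the count is still 3.
my $first;
my $count_one_slot = (($first) = $text =~ m/foo/g);  # 3

print "via array: $count_via_array\n";
print "via ():    $count\n";
print "one slot:  $count_one_slot (kept '$first')\n";
```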

Update: Added headnote, and missing first line of text.

Update: Fixed another typo or three, thanks, liz, ysth, diotalevi.

Warning: Unless otherwise stated, code is untested. Do not use without understanding. Code is posted in the hopes it is useful, but without warranty. All copyrights are relinquished into the public domain unless otherwise stated. I am not an angel. I am capable of error, and err on a fairly regular basis. If I made a mistake, please let me know (such as by replying to this node).

20031218 Edit by Corion: Added readmore tag


Replies are listed 'Best First'.
Re: Constructing Context by diotalevi (Canon) on Dec 17, 2003 at 03:55 UTC
"It turns out that an array in scalar context returns something quite useful -- it's length, or it's content joined with the punctuation variable $,, depending if it is being used as a number or as a string. This is what lets if (!@a) {print "\@a is empty!"} work."
No. You said the two cases are numeric @a and string @a. That's actually @a interpolated and scalar @a. You might even call interpolated @a a single-case-only type of context, except that if you or I did it we'd be wrong. It's also scalar context; it just creates a different optree.
Re: Re: Constructing Context by theorbtwo (Prior) on Dec 17, 2003 at 04:41 UTC
You're right, I was wrong. It turns out that `"@foo"` is not best considered context at all, or, at best, it's a special sort of context that only matters here. When an array is used in scalar context, it returns its length. Period. However, when an array is interpolated into a string, it magically does not get evaluated in scalar context, then inserted into the string. Instead, it's magically equivalent to `join($", @foo)`. Indeed, the internal operation tree generated by perl for `"@foo"` looks almost exactly like that for `join($", @foo)`, the difference being a "stringify" operation, which seems to mostly shuffle things around a little. Much thanks to diotalevi for pointing this out, pointing me at `-MO=Concise`'s take on what's happening, and for his other criticisms on the node.
Update: $", not $,, thanks again, demerphq.
Re: Constructing Context by Ovid (Cardinal) on Dec 17, 2003 at 17:09 UTC
Nice writeup. People keep getting confused by context and it's a fairly simple idea, even if there are a few edge cases that throw people off.

And while on the topic of context, if I keep responding to posts like this, people will think I'm planting shills in the audience :) Naturally, that means that what follows is a shameless plug for a module that makes dealing with context much easier (for subroutines).

Context is a great thing, but it trips up many people. It trips up people using others' code:

my $foo   = some_func();
my ($foo) = some_func(); # may or may not behave the same!

It also trips up people writing code for others to use:

sub some_func {
  # do stuff
  return wantarray ? @results : \@results; # is void context OK?
}

To solve this problem from the perspective of those writing code for others to use, I posted an RFC for Sub::Attributes. After some great feedback, I uploaded a module to the CPAN named Attribute::Context. It's easy to use. If you want to return an array, but return a reference in scalar context and warn about using it in void context, it used to be that you would have to write something like this:

sub some_func {
  # do stuff
  return wantarray         ? @results
       : defined wantarray ? \@results
       :                     warn "Useless void context";
}

Not only is that ugly, it's also confusing and easy to write incorrectly. The if-else chain is hardly better:

sub some_func {
  # do stuff
  if (wantarray) {
    return @results;
  }
  elsif (defined wantarray) {
    return \@results;
  }
  else {
    warn "this is tedious";
  }
}

If you want all of your subroutines to have this behavior, it's going to get awfully tedious to code that every time. Now, you can just write this:

use Attribute::Context;
sub some_func : Arrayref(WARNVOID) {
  # do stuff
  return @results;
}

That's much easier to write, easier on the eyes, and is less likely to be buggy!

I also got rid of the nasty `Iterator` in exchange for a `Custom` attribute that returns an object in scalar context:

sub some_func : Custom(Some::Class) {
  # do stuff
  return @results;
}
# scalar context returns Some::Class->new(\@results)

I'll update the `Custom` attribute to be more flexible if people find it useful. There are also attributes for `First` (like `CGI::param`), `Last` and `Count`. More attributes may be added if requested.

Cheers,
Ovid
New address of my CGI Course.