On how to recognize Rvalue or Lvalue reference and if-it-has-a-name rule

This is one of the most common “rules of thumb” used to explain the difference between lvalues and rvalues.

The situation in C++ is much more complex than that, so this can’t be anything but a rule of thumb. I’ll try to summarize a couple of concepts and make it clear why this issue is so complex in the C++ world. First, let’s recap a bit of what happened once upon a time.

At the beginning there was C

First, what did “lvalue” and “rvalue” originally mean in the world of programming languages in general?

In a simpler language like C or Pascal, the terms referred to what could be placed on the Left or on the Right of an assignment operator.

In a language like Pascal, where the assignment is not an expression but only a statement, the difference is pretty clear and it’s defined in grammatical terms. An lvalue is essentially the name of a variable or a subscript of an array (leaving aside record fields and dereferenced pointers, which follow the same idea).

That’s because only things of this kind could stand on the left of an assignment:

i := 42; (* ok *)
a[i] := 42; (* ok *)
42 := 42; (* makes no sense *)

In C, the same difference applies, and it is still pretty much grammatical in the sense that you could look at a line of code and tell if an expression would produce an lvalue or an rvalue.

i = 42; // ok, a variable
*p = 42; // ok, a pointer dereference
a[i] = 42; // ok, a subscript (which is a pointer dereference anyway)
s->var = 42; // ok, a struct member access

So what changed in C++?

Little languages grow up

In C++ things become much more complex and the difference is not grammatical anymore but involves the type checking process, for two reasons:

  • Anything can appear on the left of an assignment, as long as its type has a suitable overload of operator=
  • References

So this means that in C++ you can’t say if an expression will produce an lvalue only by looking at its grammatical structure. For example:

f() = g();

is a statement that would make no sense in C but can be perfectly legal in C++ if, for example, f() returns a reference. That’s how expressions like v[i] = j work for std::vector: operator[] returns a reference to the element, so you can assign to it.
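To make this concrete, here is a minimal sketch. The names storage, f, g and the two helper functions are made up for illustration; only std::vector comes from the standard library.

```cpp
#include <cassert>
#include <vector>

// All names here are hypothetical, chosen only for illustration.
int storage = 0;

int& f() { return storage; }  // returns a reference: the call f() is an lvalue
int  g() { return 42; }       // returns by value: the call g() is an rvalue

// Assigns through the reference returned by f() and reports the new value.
int assign_through_call() {
    f() = g();                // writes 42 into `storage`
    assert(storage == 42);
    return storage;
}

// The same mechanism makes element assignment work for std::vector.
int assign_through_subscript() {
    std::vector<int> v(3, 0);
    v[1] = 7;                 // operator[] returns int&, so this assigns to the element
    return v[1];
}
```

Nothing in the shape of f() = g() tells you whether it compiles; that depends entirely on the declared return type of f.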

So what’s the point of having a distinction between lvalues and rvalues anymore? The distinction is still relevant for basic types of course, but also to decide what can be bound to a non-const reference.

That’s because you don’t want to have legal code like:

int &x = 42;
x = 0; // Have we changed the meaning of a natural number??

So the language carefully specifies what is an lvalue and what isn’t, and then says that only lvalues can be bound to non-const lvalue references. The code above is therefore illegal: an integer literal is not an lvalue, so a non-const reference cannot be bound to it.

Note that const references are different, since they can bind to literals and temporaries (and local references even extend the lifetime of those temporaries):

int const& x = 42; // ok
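A slightly larger sketch of the same rule, including the lifetime extension. The helper make_greeting and the function const_reference_demo are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper that returns a temporary by value.
std::string make_greeting() { return "hello"; }

std::size_t const_reference_demo() {
    // int& bad = 42;       // error: a non-const lvalue reference cannot bind to an rvalue
    const int& x = 42;      // ok: binds to a temporary int initialized with 42
    assert(x == 42);

    // The temporary returned by make_greeting() lives as long as `s` does,
    // so using `s` on the following lines is safe.
    const std::string& s = make_greeting();
    return s.size();
}
```

The extension only applies when the reference is bound directly to the temporary, as with the local reference here; a reference returned from a function does not keep anything alive.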

And so far we’ve only touched on what already happened in C++98. The rules were already more complex than “if it has a name it’s an lvalue”, since you have to consider references: a call to a function returning a reference is an lvalue even though it has no name.

Also, other rules of thumb mentioned here already don’t work in all cases. For example, “if you can take its address, it’s an lvalue”. If by “taking the address” you mean “applying operator&”, then it might work, but don’t trick yourself into thinking that you can never end up with the address of a temporary: the this pointer inside a temporary’s member function, for example, will point to it.
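Here is a small sketch of that last case. Probe and read_through_this are hypothetical names invented for the example.

```cpp
#include <cassert>

// Hypothetical type, for illustration only.
struct Probe {
    int value;
    // Returns the address of the object the member function was called on.
    const Probe* self() const { return this; }
};

int read_through_this() {
    // const Probe* p = &Probe{7};   // error: built-in & needs an lvalue operand

    // Probe{7} is a temporary, yet self() hands back its address.
    // Within this one full expression the temporary is still alive, so reading
    // through the pointer is fine; storing it for later would leave it dangling.
    return Probe{7}.self()->value;
}
```

So having an address and being an lvalue are not the same thing; the rule of thumb is really about what the built-in & operator accepts.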

What changed in C++11

C++11 adds more complexity to the mix by introducing the concept of an rvalue reference, that is, a reference that can be bound to an rvalue even if it is non-const. The fact that it can only be bound to an rvalue makes it both safe and useful. I don’t think it’s necessary to explain why rvalue references are useful, so let’s move on.
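As a quick sketch of the binding rule and of how overloads are picked, assuming a made-up overload set which_overload and a made-up helper make_temp:

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set; the returned enum reports which overload ran.
enum class Bound { lvalue_ref, rvalue_ref };

Bound which_overload(const std::string&) { return Bound::lvalue_ref; }
Bound which_overload(std::string&&)      { return Bound::rvalue_ref; }

// Hypothetical helper that returns a temporary by value.
std::string make_temp() { return "temp"; }

Bound with_named_object() {
    std::string s = "named";
    // std::string&& r = s;            // error: an rvalue reference cannot bind to an lvalue
    return which_overload(s);          // picks the const std::string& overload
}

Bound with_temporary() {
    std::string&& r = make_temp();     // ok: non-const, yet bound to a temporary
    r += "!";                          // the temporary can be modified through it
    assert(r == "temp!");
    return which_overload(make_temp()); // picks the std::string&& overload
}
```

Modifying the temporary is harmless here because nobody else can observe it, which is what makes the non-const binding safe.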

The point here is that now we have a lot more cases to consider. So what is an rvalue now? The Standard actually distinguishes between different kinds of rvalues so that it can correctly state the behavior of rvalue references, overload resolution and template argument deduction in the presence of rvalue references. So we have terms like glvalue, xvalue and prvalue, which make things more complex.
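One way to ask the compiler which category an expression has is decltype with a doubly parenthesized operand: it yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. The sketch below relies on that; Category, classify, CATEGORY_OF and the declared functions are hypothetical names of mine, not anything from the Standard.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

enum class Category { lvalue, xvalue, prvalue };

// Maps the type produced by decltype((e)) back to the category of e.
template <class T>
constexpr Category classify() {
    return std::is_lvalue_reference<T>::value ? Category::lvalue
         : std::is_rvalue_reference<T>::value ? Category::xvalue
                                              : Category::prvalue;
}

// Hypothetical helper macro; the double parentheses inside decltype are essential.
#define CATEGORY_OF(expr) (classify<decltype((expr))>())

int   by_value();        // declared only: used in unevaluated operands
int&  by_lvalue_ref();
int&& by_rvalue_ref();
int   named = 0;

static_assert(CATEGORY_OF(named)            == Category::lvalue,  "named variable");
static_assert(CATEGORY_OF(by_lvalue_ref())  == Category::lvalue,  "call returning T&");
static_assert(CATEGORY_OF(by_rvalue_ref())  == Category::xvalue,  "call returning T&&");
static_assert(CATEGORY_OF(std::move(named)) == Category::xvalue,  "std::move(x)");
static_assert(CATEGORY_OF(by_value())       == Category::prvalue, "call returning T");
static_assert(CATEGORY_OF(42)               == Category::prvalue, "integer literal");
```

Since everything is checked with static_assert, the fact that the file compiles is the result.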

What about our rules of thumb?

So “everything that has a name is an lvalue” can still be true, but it certainly isn’t true that every lvalue has a name. A call to a function returning an lvalue reference is an lvalue. A call to a function returning something by value creates a temporary and is an rvalue, and so is a call to a function returning an rvalue reference.
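The “has a name” half of the rule has one consequence that surprises many people: a parameter declared as an rvalue reference is itself an lvalue inside the function, because it has a name. A minimal sketch, with Seen, sink, pass_by_name and pass_with_move all being made-up names:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload set; the returned enum reports which overload ran.
enum class Seen { lvalue, rvalue };

Seen sink(const std::string&) { return Seen::lvalue; }
Seen sink(std::string&&)      { return Seen::rvalue; }

// `s` is declared as an rvalue reference, but the expression `s` has a name
// and is therefore an lvalue inside the function body.
Seen pass_by_name(std::string&& s) { return sink(s); }

// std::move(s) is an unnamed expression of type std::string&&: an rvalue again.
Seen pass_with_move(std::string&& s) { return sink(std::move(s)); }
```

Both functions can only be called with an rvalue argument, yet only the second one forwards it as an rvalue.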

What about “temporaries are rvalues”? It’s true, but non-temporaries can also be turned into rvalues simply by casting them to an rvalue reference type (as std::move does).
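A short sketch of the “it’s only a cast” point; move_demo is a hypothetical function name, everything else is standard library.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// std::move applied to an lvalue std::string yields an expression of type std::string&&.
static_assert(std::is_same<decltype(std::move(std::declval<std::string&>())),
                           std::string&&>::value,
              "std::move on an lvalue yields std::string&&");

std::string move_demo() {
    std::string s = "payload";

    // Casting alone moves nothing: `r` is just another way to refer to `s`.
    std::string&& r = static_cast<std::string&&>(s);
    assert(&r == &s);
    assert(s == "payload");

    // The actual transfer happens when a move constructor consumes the rvalue.
    std::string t = std::move(s);   // `s` is now valid but unspecified
    return t;
}
```

The example deliberately makes no claim about the contents of s after the move, since the Standard only promises a valid but unspecified state.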

So I think that all these rules are useful if we keep in mind what they are: rules of thumb.
They’ll always have some corner case where they don’t apply, because to specify exactly what is an rvalue and what isn’t, we can’t avoid the exact terms and rules used in the Standard. That’s what they were written for!
