Why isn’t a for-loop a compile-time expression?

πάντα ῥεῖ gave a good and useful answer, but I would like to mention another issue with constexpr for.

In C++, at the most fundamental level, all expressions have a type which can be determined statically (at compile-time). There are things like RTTI and boost::any of course, but they are built on top of this framework, and the static type of an expression is an important concept for understanding some of the rules in the standard.

Suppose you could iterate over a heterogeneous container using a fancy for syntax, perhaps like this:

std::tuple<int, float, std::string> my_tuple;
for (const auto & x : my_tuple) {
  f(x);
}

Here, f is some overloaded function. Clearly, the intended meaning of this is to call different overloads of f for each of the types in the tuple. What this really means is that in the expression f(x), overload resolution has to run three different times. If we play by the current rules of C++, the only way this can make sense is if we basically unroll the loop into three different loop bodies, before we try to figure out what the types of the expressions are.
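
To make the unrolling concrete, here is a rough sketch of what the loop would have to expand to. The overload set f and the strings it returns are made up for illustration; the point is only that each of the three bodies runs its own overload resolution.

```cpp
#include <string>
#include <tuple>

// Hypothetical overload set; each overload reports which one was chosen.
inline std::string f(int)                 { return "int"; }
inline std::string f(float)               { return "float"; }
inline std::string f(const std::string &) { return "string"; }

// What the loop over my_tuple would have to mean: three separate bodies,
// each with its own x and its own overload resolution for f(x).
inline std::string unrolled(const std::tuple<int, float, std::string> & t) {
  std::string log;
  { const auto & x = std::get<0>(t); log += f(x); log += ' '; } // f(int)
  { const auto & x = std::get<1>(t); log += f(x); log += ' '; } // f(float)
  { const auto & x = std::get<2>(t); log += f(x); }             // f(const std::string &)
  return log;
}
```

Calling unrolled on a tuple holding 1, 2.0f and "s" yields "int float string", one overload per element.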

What if the code is actually

for (const auto & x : my_tuple) {
  auto y = f(x);
}

auto is not magic: it doesn’t mean “no type info”, it means “deduce the type, please, compiler”. But clearly, y would in general need three different types.
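
As a small sketch of that point, assume f has three overloads with three different return types (these overloads and the alias y_type are hypothetical, chosen only for the demonstration). The type that auto would deduce for y can then be checked per tuple element:

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical overloads with three different return types.
inline long        f(int x)                 { return x; }
inline double      f(float x)               { return x; }
inline std::size_t f(const std::string & s) { return s.size(); }

using Tuple = std::tuple<int, float, std::string>;

// The type that `auto y = f(x);` would deduce when x is tuple element I.
template <std::size_t I>
using y_type = decltype(f(std::get<I>(std::declval<const Tuple &>())));

// One loop body in the source, three different static types for y.
static_assert(std::is_same_v<y_type<0>, long>);
static_assert(std::is_same_v<y_type<1>, double>);
static_assert(std::is_same_v<y_type<2>, std::size_t>);
```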

On the other hand, there are tricky issues with this kind of thing — in C++ the parser needs to be able to know what names are types and what names are templates in order to correctly parse the language. Can the parser be modified to do some loop unrolling of constexpr for loops before all the types are resolved? I don’t know but I think it might be nontrivial. Maybe there is a better way…

To avoid this issue, in current versions of C++, people use the visitor pattern. The idea is that you have an overloaded function or function object, and it is applied to each element of the sequence. Each overload then has its own “body”, so there’s no ambiguity as to the types or meanings of the variables in it. There are libraries like boost::fusion or boost::hana that let you iterate over heterogeneous sequences using a given visitor, and you would use their mechanism instead of a for-loop.
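
Here is a minimal sketch of the visitor idea using only the standard library (C++17 std::apply and a fold expression) rather than the Boost libraries. The names Printer, for_each_element and describe are my own for this example and are not taken from any library.

```cpp
#include <sstream>
#include <string>
#include <tuple>

// Hypothetical visitor: one overload per element type, each with its own body.
struct Printer {
  std::ostringstream out;
  void operator()(int i)                 { out << "int:" << i << ' '; }
  void operator()(float f)               { out << "float:" << f << ' '; }
  void operator()(const std::string & s) { out << "string:" << s << ' '; }
};

// Apply the visitor to every element of the tuple, in order.
template <typename Tuple, typename Visitor>
void for_each_element(const Tuple & t, Visitor & v) {
  // The fold over the comma operator calls v once per element, left to right.
  std::apply([&v](const auto &... elems) { (v(elems), ...); }, t);
}

inline std::string describe(const std::tuple<int, float, std::string> & t) {
  Printer p;
  for_each_element(t, p);
  return p.out.str();
}
```

Every variable here has exactly one static type: the "loop body" lives in the overloads of Printer::operator(), and overload resolution happens once per element inside the expanded fold.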

If you could do constexpr for with just ints, e.g.

for (constexpr int i = 0; i < 10; ++i) { ... }

this raises the same difficulty as the heterogeneous for loop. If you can use i as a template parameter inside the body, then you can make variables that refer to different types in different runs of the loop body, and then it’s not clear what the static types of the expressions should be.
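
To illustrate, here is a sketch of what such a loop can already be spelled as today with std::index_sequence. The function names body, run and sizes_for_three_iterations are invented for this example. Because I is a compile-time constant, the local buf has a different type in every "iteration":

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// One "iteration": I is a compile-time constant, so it can appear in a type.
template <std::size_t I>
std::size_t body() {
  char buf[I + 1] = {};   // char[1], char[2], char[3], ...: a new type per I
  return sizeof(buf);     // sizeof(char[N]) is exactly N
}

// Expand the pack: one call to body<I>() per index, in order.
template <std::size_t... Is>
std::vector<std::size_t> run(std::index_sequence<Is...>) {
  return { body<Is>()... };
}

inline std::vector<std::size_t> sizes_for_three_iterations() {
  return run(std::make_index_sequence<3>{});
}
```

Here each body<I> is a separate function with its own locals, so the question of what type buf has never comes up. A constexpr for would have to answer it inside a single function.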

So, I’m not sure, but I think there may be some nontrivial technical issues associated with actually adding a constexpr for feature to the language. The visitor pattern / the planned reflection features may end up being less of a headache IMO… who knows.


Let me give another example I just thought of that shows the difficulty involved.

In normal C++, the compiler knows the static type of every variable on the stack, and so it can compute the layout of the stack frame for that function.

You can be sure that the address of a local variable won’t change while the function is executing. For instance,

std::array<int, 3> a{{1,2,3}};
for (int i = 0; i < 3; ++i) {
    auto x = a[i];
    int y = 15;
    std::cout << &y << std::endl;
}

In this code, y is a local variable in the body of a for loop. It has a well-defined address throughout this function, and the address printed by the program will be the same each time.

What should be the behavior of similar code with constexpr for?

std::tuple<int, long double, std::string> a{};
for (constexpr int i = 0; i < 3; ++i) {
    auto x = std::get<i>(a);
    int y = 15;
    std::cout << &y << std::endl;
}

The point is that the type of x is deduced differently in each pass through the loop — since it has a different type, it may have different size and alignment on the stack. Since y comes after it on the stack, that means that y might change its address on different runs of the loop — right?

What should be the behavior if a pointer to y is taken in one pass through the loop, and then dereferenced in a later pass? Should it be undefined behavior, even though it would probably be legal in the similar “no-constexpr for” code with std::array shown above?

Should the address of y not be allowed to change? Should the compiler have to pad the address of y so that the largest of the types in the tuple can be accommodated before y? Does that mean that the compiler can’t simply unroll the loops and start generating code, but must unroll every instance of the loop beforehand, then collect all of the type information from each of the N instantiations and then find a satisfactory layout?
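
To give a feel for that last option, here is a rough sketch of the bookkeeping involved: look at every type x could have and reserve room for the worst case. The worst_case helper is hypothetical and only illustrates the idea. It is not a description of how any compiler actually lays out a stack frame.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical helper: the largest size and strictest alignment among
// all element types of a tuple.
template <typename Tuple>
struct worst_case;

template <typename... Ts>
struct worst_case<std::tuple<Ts...>> {
  static constexpr std::size_t size  = std::max({sizeof(Ts)...});
  static constexpr std::size_t align = std::max({alignof(Ts)...});
};

// The slot reserved for x would have to satisfy these bounds so that
// y could keep a single address across all passes.
using W = worst_case<std::tuple<int, long double, std::string>>;
static_assert(W::size >= sizeof(int) && W::size >= sizeof(long double) &&
              W::size >= sizeof(std::string));
```

Note that this needs the complete list of types up front, which is exactly the "unroll everything first, then lay out" ordering problem described above.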

I think you are better off just using a pack expansion: it’s much clearer how the compiler is supposed to implement it, and how efficient it’s going to be at compile time and run time.
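
For example, here is a minimal sketch of the tuple loop above written as a pack expansion over std::index_sequence, assuming C++17 fold expressions. The names print_all and print_all_impl are mine, and the output format is arbitrary.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <tuple>
#include <utility>

template <typename Tuple, std::size_t... Is>
std::string print_all_impl(const Tuple & t, std::index_sequence<Is...>) {
  std::ostringstream out;
  // Fold over the comma operator: one std::get<I> per index, left to right.
  // Each std::get<Is>(t) is its own expression with its own static type.
  ((out << Is << '=' << std::get<Is>(t) << ';'), ...);
  return out.str();
}

template <typename... Ts>
std::string print_all(const std::tuple<Ts...> & t) {
  return print_all_impl(t, std::index_sequence_for<Ts...>{});
}
```

With a std::tuple<int, long double, std::string> holding 1, 2.5L and "x", print_all returns "0=1;1=2.5;2=x;". No local variable changes type between "iterations", so the stack layout questions above never arise.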
