Generic methods in .NET cannot have their return types inferred. Why?

The general principle here is that type information flows only “one way”, from the inside to the outside of an expression. The example you give is extremely simple. Suppose we wanted to have type information flow “both ways” when doing type inference on a method R G<A, R>(A a), and consider some of the crazy scenarios that creates:

N(G(5))

Suppose there are ten different overloads of N, each with a different argument type. Should we make ten different inferences for R? If we did, should we somehow pick the “best” one?

double x = b ? G(5) : 123;

What should the return type of G be inferred to be? Int, because the other half of the conditional expression is int? Or double, because ultimately this thing is going to be assigned to double? Now perhaps you begin to see how this goes; if you’re going to say that you reason from outside to inside, how far out do you go? There could be many steps along the way. See what happens when we start to combine these:

N(b ? G(5) : 123)

Now what do we do? We have ten overloads of N to choose from. Do we say that R is int? It could be int or any type that int is implicitly convertible to. But of those types, which ones are implicitly convertible to an argument type of N? Do we write ourselves a little prolog program and ask the prolog engine to solve what are all the possible return types that R could be in order to satisfy each of the possible overloads on N, and then somehow pick the best one?

(I’m not kidding; there are languages that essentially do write a little prolog program and then use a logic engine to work out what the types of everything are. F# for example, does way more complex type inference than C# does. Haskell’s type system is actually Turing Complete; you can encode arbitrarily complex problems in the type system and ask the compiler to solve them. As we’ll see later, the same is true of overload resolution in C# – you cannot encode the Halting Problem in the C# type system like you can in Haskell but you can encode NP-HARD problems into overload resolution problems.) (See below)

This is still a very simple expression. Suppose you had something like

N(N(b ? G(5) * G("hello") : 123));

Now we have to solve this problem multiple times for G, and possibly for N as well, and we have to solve them in combination. We have five overload resolution problems to solve and all of them, to be fair, should be considering both their arguments and their context type. If there are ten possibilities for N then there are potentially a hundred possibilities to consider for N(N(…)) and a thousand for N(N(N(…))) and very quickly you would have us solving problems that easily had billions of possible combinations and made the compiler very slow.

This is why we have the rule that type information only flows one way. It prevents these sorts of chicken and egg problems, where you are trying to both determine the outer type from the inner type, and determine the inner type from the outer type and cause a combinatorial explosion of possibilities.

Notice that type information does flow both ways for lambdas! If you say N(x=>x.Length) then sure enough, we consider all the possible overloads of N that have function or expression types in their arguments and try out all the possible types for x. And sure enough, there are situations in which you can easily make the compiler try out billions of possible combinations to find the unique combination that works. The type inference rules that make it possible to do that for generic methods are exceedingly complex and make even Jon Skeet nervous. This feature makes overload resolution NP-HARD.

Getting type information to flow both ways for lambdas so that generic overload resolution works correctly and efficiently took me about a year. It is such a complex feature that we only wanted to take it on if we absolutely positively would have an amazing return on that investment. Making LINQ work was worth it. But there is no corresponding feature like LINQ that justifies the immense expense of making this work in general.

UPDATE: It turns out that you can encode arbitrarily difficult problems in the C# type system. C# has nominal generic subtyping with generic contravariance, and it has been shown that you can build a Turing Machine out of generic type definitions and force the compiler to execute the machine, possibly going into infinite loops. At the time I wrote this answer the undecidability of such type systems was an open question. See https://stackoverflow.com/a/23968075/88656 for details.

More Related Contents:

Leave a Comment Cancel reply