Why does OCaml sometimes require eta expansion?

What you got here is the value polymorphism restriction of ML language family.

The aim of the restriction is to settle down let-polymorphism and side effects together. For example, in the following definition:

let r = ref None

r cannot have a polymorphic type 'a option ref. Otherwise:

let () =
  r := Some 1;       (* use r as int option ref *)
  match !r with
  | Some s -> print_string s  (* this time, use r as a different type, string option ref *)
  | None -> ()

is wrongly type-checked as valid, but it crashes, since the reference cell r is used for these two incompatible types.

To fix this issue many researches were done in 80’s, and the value polymoprhism is one of them. It restricts polymorphism only to let bindings whose definition form is “non-expansive”. Eta expanded form is non expansive therefore your eta expanded version of myFun has a polymorphic type, but not for eta reduced one. (More precisely speaking, OCaml uses a relaxed version of this value polymorphism, but the story is basically the same.)

When the definition of let binding is expansive there is no polymorphism introduced therefore the type variables are left non-generalized. These types are printed as '_a in the toplevel, and their intuitive meaning is: they must be instantiated to some concrete type later:

# let r = ref None                           (* expansive *)
val r : '_a option ref = {contents = None}   (* no polymorphism is allowed *)
                                             (* type checker does not reject this,
                                                hoping '_a is instantiated later. *)

We can fix the type '_a after the definition:

# r := Some 1;;                              (* fixing '_a to int *)
- : unit = ()
# r;;
- : int option ref = {contents = Some 1}     (* Now '_a is unified with int *)

Once fixed, you cannot change the type, which prevents the crash above.

This typing delay is permitted until the end of the typing of the compilation unit. The toplevel is a unit which never ends and therefore you can have values with '_a type variables anywhere of the session. But in the separated compilation, '_a variables must be instantiated to some type without type variables till the end of ml file:

(* test.ml *)
let r = ref None (* r : '_a option ref *)
(* end of test.ml. Typing fails due to the non generalizable type variable remains. *)

This is what is happening with your myFun function with the compiler.

AFAIK, there is no perfect solution to the problem of polymorphism and side effects. Like other solutions, the value polymorphism restriction has its own drawback: if you want to have a polymorphic value, you must make the definition in non-expansive: you must eta-expand myFun. This is a bit lousy but is considered acceptable.

You can read some other answers:

Leave a Comment