What is name mangling, and how does it work?

In the programming language of your choice, if an identifier is exported from a separately compiled unit, it needs a name by which it is known at link time. Name mangling solves the problem of overloaded identifiers in programming languages. (An identifier is “overloaded” if the same name is used in more than one context or with more than one meaning.)

Some examples:

  • In C++, function or method get may be overloaded at multiple types.

  • In Ada or Modula-3, function get may appear in multiple modules.

Multiple types and multiple modules cover the usual contexts.

Typical strategies:

  • Map each type to a string and use the combined high-level identifier and “type string” as the link-time name. Common in C++ (especially easy since overloading is permitted only for functions/methods and only on argument types) and Ada (where you can overload result types as well).

  • If an identifier is used in more than one module or namespace, join the name of the module with the name of the identifier, e.g., List_get instead of List.get.

Depending on what characters are legal in link-time names, you may have to do additional mangling; for example, it may be necessary to use the underscore as an ‘escape’ character, so you can distinguish

  • List_my.get -> List__my_get

from

  • List.my_get -> List_my__get

(Admittedly this example is reaching, but as a compiler writer, I have to guarantee that distinct identifiers in the source code map to distinct link-time names. That’s the whole reason and purpose for name mangling.)

Leave a Comment