Reflection support in C

Reflection in general is a means for a program to analyze the structure of some code.
This analysis is used to change the effective behavior of the code.

Reflection as analysis is generally very weak; usually it can only provide access to function and field names. This weakness comes from the language implementers essentially not wanting to make the full source code available at runtime, along with the appropriate analysis routines to extract what one wants from the source code.

Another approach is to tackle program analysis head-on, using a strong program analysis tool, e.g., one that can parse the source text exactly the way the compiler does.
(People often propose abusing the compiler itself to do this, but that usually doesn’t work; the compiler machinery wants to be a compiler, and it is darn hard to bend it to other purposes.)

What is needed is a tool that:

  • Parses language source text
  • Builds abstract syntax trees representing every detail of the program.
    (It is helpful if the ASTs retain comments and other details of the source
    code layout such as column numbers, literal radix values, etc.)
  • Builds symbol tables showing the scope and meaning of every identifier
  • Can extract control flows from functions
  • Can extract data flow from the code
  • Can construct a call graph for the system
  • Can determine what each pointer points to
  • Enables the construction of custom analyzers using the above facts
  • Can transform the code according to such custom analyses
    (usually by revising the ASTs that represent the parsed code)
  • Can regenerate source text (including layout and comments) from
    the revised ASTs.

Using such machinery, one implements analysis at whatever level of detail is needed, and then transforms the code to achieve the effect that runtime reflection would accomplish.
There are several major benefits:

  • The level of detail and amount of analysis are a matter of ambition (i.e., they
    aren’t limited to what runtime reflection happens to provide)
  • There isn’t any runtime overhead to achieve the reflected change in behavior
  • The machinery involved can be general and applied across many languages, rather
    than being limited to what a specific language implementation provides.
  • This is compatible with the C/C++ idea that you don’t pay for what you don’t use.
    If you don’t need reflection, you don’t need this machinery. And your language
    doesn’t need to have the intellectual baggage of weak reflection built in.

See our DMS Software Reengineering Toolkit for a system that can do all of the above for C, Java, and COBOL, and most of it for C++.

[EDIT August 2017: Now handles C11 and C++17]
