Equivalent of Class Loaders in .NET

The answer is yes, but the solution is a little tricky.

The System.Reflection.Emit namespace defines types that allows assemblies to be generated dynamically. They also allow the generated assemblies to be defined incrementally. In other words it is possible to add types to the dynamic assembly, execute the generated code, and then latter add more types to the assembly.

The System.AppDomain class also defines an AssemblyResolve event that fires whenever the framework fails to load an assembly. By adding a handler for that event, it is possible to define a single “runtime” assembly into which all “constructed” types are placed. The code generated by the compiler that uses a constructed type would refer to a type in the runtime assembly. Because the runtime assembly doesn’t actually exist on disk, the AssemblyResolve event would be fired the first time the compiled code tried to access a constructed type. The handle for the event would then generate the dynamic assembly and return it to the CLR.

Unfortunately, there are a few tricky points to getting this to work. The first problem is ensuring that the event handler will always be installed before the compiled code is run. With a console application this is easy. The code to hookup the event handler can just be added to the Main method before the other code runs. For class libraries, however, there is no main method. A dll may be loaded as part of an application written in another language, so it’s not really possible to assume there is always a main method available to hookup the event handler code.

The second problem is ensuring that the referenced types all get inserted into the dynamic assembly before any code that references them is used. The System.AppDomain class also defines a TypeResolve event that is executed whenever the CLR is unable to resolve a type in a dynamic assembly. It gives the event handler the opportunity to define the type inside the dynamic assembly before the code that uses it runs. However, that event will not work in this case. The CLR will not fire the event for assemblies that are “statically referenced” by other assemblies, even if the referenced assembly is defined dynamically. This means that we need a way to run code before any other code in the compiled assembly runs and have it dynamically inject the types it needs into the runtime assembly if they have not already been defined. Otherwise when the CLR tried to load those types it will notice that the dynamic assembly does not contain the types they need and will throw a type load exception.

Fortunately, the CLR offers a solution to both problems: Module Initializers. A module initializer is the equivalent of a “static class constructor”, except that it initializes an entire module, not just a single class. Baiscally, the CLR will:

  1. Run the module constructor before any types inside the module are accessed.
  2. Guarantee that only those types directly accessed by the module constructor will be loaded while it is executing
  3. Not allow code outside the module to access any of it’s members until after the constructor has finished.

It does this for all assemblies, including both class libraries and executables, and for EXEs will run the module constructor before executing the Main method.

See this blog post for more information about constructors.

In any case, a complete solution to my problem requires several pieces:

  1. The following class definition, defined inside a “language runtime dll”, that is referenced by all assemblies produced by the compiler (this is C# code).

    using System;
    using System.Collections.Generic;
    using System.Reflection;
    using System.Reflection.Emit;
    
    namespace SharedLib
    {
        public class Loader
        {
            private Loader(ModuleBuilder dynamicModule)
            {
                m_dynamicModule = dynamicModule;
                m_definedTypes = new HashSet<string>();
            }
    
            private static readonly Loader m_instance;
            private readonly ModuleBuilder m_dynamicModule;
            private readonly HashSet<string> m_definedTypes;
    
            static Loader()
            {
                var name = new AssemblyName("$Runtime");
                var assemblyBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
                var module = assemblyBuilder.DefineDynamicModule("$Runtime");
                m_instance = new Loader(module);
                AppDomain.CurrentDomain.AssemblyResolve += new ResolveEventHandler(CurrentDomain_AssemblyResolve);
            }
    
            static Assembly CurrentDomain_AssemblyResolve(object sender, ResolveEventArgs args)
            {
                if (args.Name == Instance.m_dynamicModule.Assembly.FullName)
                {
                    return Instance.m_dynamicModule.Assembly;
                }
                else
                {
                    return null;
                }
            }
    
            public static Loader Instance
            {
                get
                {
                    return m_instance;
                }
            }
    
            public bool IsDefined(string name)
            {
                return m_definedTypes.Contains(name);
            }
    
            public TypeBuilder DefineType(string name)
            {
                //in a real system we would not expose the type builder.
                //instead a AST for the type would be passed in, and we would just create it.
                var type = m_dynamicModule.DefineType(name, TypeAttributes.Public);
                m_definedTypes.Add(name);
                return type;
            }
        }
    }
    

    The class defines a singleton that holds a reference to the dynamic assembly that the constructed types will be created in. It also holds a “hash set” that stores the set of types that have already been dynamically generated, and finally defines a member that can be used to define the type. This example just returns a System.Reflection.Emit.TypeBuilder instance that can then be used to define the class being generated. In a real system, the method would probably take in an AST representation of the class, and just do the generation it’s self.

  2. Compiled assemblies that emit the following two references (shown in ILASM syntax):

    .assembly extern $Runtime
    {
        .ver 0:0:0:0
    }
    .assembly extern SharedLib
    {
        .ver 1:0:0:0
    }
    

    Here “SharedLib” is the Language’s predefined runtime library that includes the “Loader” class defined above and “$Runtime” is the dynamic runtime assembly that the consructed types will be inserted into.

  3. A “module constructor” inside every assembly compiled in the language.

    As far as I know, there are no .NET languages that allow Module Constructors to be defined in source. The C++ /CLI compiler is the only compiler I know of that generates them. In IL, they look like this, defined directly in the module and not inside any type definitions:

    .method privatescope specialname rtspecialname static 
            void  .cctor() cil managed
    {
        //generate any constructed types dynamically here...
    }
    

    For me, It’s not a problem that I have to write custom IL to get this to work. I’m writing a compiler, so code generation is not an issue.

    In the case of an assembly that used the types tuple<i as int, j as int> and tuple<x as double, y as double, z as double> the module constructor would need to generate types like the following (here in C# syntax):

    class Tuple_i_j<T, R>
    {
        public T i;
        public R j;
    }
    
    class Tuple_x_y_z<T, R, S>
    {
        public T x;
        public R y;
        public S z;
    }
    

    The tuple classes are generated as generic types to get around accessibility issues. That would allow code in the compiled assembly to use tuple<x as Foo>, where Foo was some non-public type.

    The body of the module constructor that did this (here only showing one type, and written in C# syntax) would look like this:

    var loader = SharedLib.Loader.Instance;
    lock (loader)
    {
        if (! loader.IsDefined("$Tuple_i_j"))
        {
            //create the type.
            var Tuple_i_j = loader.DefineType("$Tuple_i_j");
            //define the generic parameters <T,R>
           var genericParams = Tuple_i_j.DefineGenericParameters("T", "R");
           var T = genericParams[0];
           var R = genericParams[1];
           //define the field i
           var fieldX = Tuple_i_j.DefineField("i", T, FieldAttributes.Public);
           //define the field j
           var fieldY = Tuple_i_j.DefineField("j", R, FieldAttributes.Public);
           //create the default constructor.
           var constructor= Tuple_i_j.DefineDefaultConstructor(MethodAttributes.Public);
    
           //"close" the type so that it can be used by executing code.
           Tuple_i_j.CreateType();
        }
    }
    

So in any case, this was the mechanism I was able to come up with to enable the rough equivalent of custom class loaders in the CLR.

Does anyone know of an easier way to do this?

Leave a Comment