What is the VTT for a class?

The page “Notes on Multiple Inheritance in GCC C++ Compiler v4.0.1” is offline now, and http://web.archive.org didn’t archive it. So, I have found a copy of the text at tinydrblog which is archived at the web archive.

Below is the full text of the original Notes, published online as part of “Doctoral Programming Language Seminar: GCC Internals” (fall 2005) by graduate Morgan Deters “in the Distributed Object Computing Laboratory in the Computer Science department at Washington University in St Louis.”
His (archived) homepage:

THIS IS THE TEXT by Morgan Deters and NOT CC-licensed.

Morgan Deters webpages:

PART1:

The Basics: Single Inheritance

As we discussed in class, single inheritance leads to an object layout with base class data laid out before derived class data. So if classes A and B are defined thusly:

class A {
public:
  int a;

};

class B : public A {
public:
  int b;
};

then objects of type B are laid out like this (where “b” is a pointer to such an object):

b --> +-----------+
      |     a     |
      +-----------+
      |     b     |
      +-----------+
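One way to watch this layout from inside a program is to compare member addresses. What follows is only a sketch and not part of the original notes: the helper name `base_comes_first` is made up for illustration, and the address ordering it reports is what GCC-style ABIs do in practice, not something the C++ standard promises for a class like B.

```cpp
#include <cassert>

class A {
public:
  int a;
};

class B : public A {
public:
  int b;
};

// Hypothetical helper (not from the notes): returns true if the member a
// sits at the very start of a B object and the member b comes after it.
bool base_comes_first() {
  B obj;
  obj.a = 1;
  obj.b = 2;

  // Guaranteed by the language: the A subobject's a is the same member
  // we reach directly through obj.
  assert(&static_cast<A&>(obj).a == &obj.a);

  char* start  = reinterpret_cast<char*>(&obj);
  char* addr_a = reinterpret_cast<char*>(&obj.a);
  char* addr_b = reinterpret_cast<char*>(&obj.b);

  // ABI-dependent: base class data first, derived class data after it.
  return addr_a == start && addr_b > addr_a;
}
```

Calling `base_comes_first()` from main should return true on compilers that lay objects out as in the diagram above.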

If you have virtual methods:

class A {
public:
  int a;
  virtual void v();
};

class B : public A {
public:
  int b;
};

then you’ll have a vtable pointer as well:

                           +-----------------------+
                           |     0 (top_offset)    |
                           +-----------------------+
b --> +----------+         | ptr to typeinfo for B |
      |  vtable  |-------> +-----------------------+
      +----------+         |         A::v()        |
      |     a    |         +-----------------------+
      +----------+
      |     b    |
      +----------+

that is, top_offset and the typeinfo pointer live above the location to which the vtable pointer points.
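The typeinfo slot is what lets the runtime answer "what is this object really?" when all you hold is a pointer-to-A. Here is a small sketch of that; the helper names are hypothetical, v() is given an empty body so the example links, and the size comparison is an observation about typical implementations, not a guarantee.

```cpp
#include <typeinfo>

class A {
public:
  int a;
  virtual void v() {}
};

class B : public A {
public:
  int b;
};

// Same data members as B, but no virtual methods and so no vtable pointer.
struct PlainB {
  int a;
  int b;
};

// Hypothetical helper: a pointer-to-A that really points at a B reports B
// as its dynamic type. This part is guaranteed by the language.
bool dynamic_type_is_B() {
  B obj;
  A* p = &obj;
  return typeid(*p) == typeid(B);
}

// Hypothetical helper: the hidden vtable pointer makes the object bigger.
// True on typical implementations, but implementation-defined.
bool vtable_pointer_takes_space() {
  return sizeof(B) > sizeof(PlainB);
}
```

With GCC you would expect both helpers to return true.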

Simple Multiple Inheritance

Now consider multiple inheritance:

class A {
public:
  int a;
  virtual void v();
};

class B {
public:
  int b;
  virtual void w();
};

class C : public A, public B {
public:
  int c;
};

In this case, objects of type C are laid out like this:

                           +-----------------------+
                           |     0 (top_offset)    |
                           +-----------------------+
c --> +----------+         | ptr to typeinfo for C |
      |  vtable  |-------> +-----------------------+
      +----------+         |         A::v()        |
      |     a    |         +-----------------------+
      +----------+         |    -8 (top_offset)    |
      |  vtable  |---+     +-----------------------+
      +----------+   |     | ptr to typeinfo for C |
      |     b    |   +---> +-----------------------+
      +----------+         |         B::w()        |
      |     c    |         +-----------------------+
      +----------+

…but why? Why two vtables in one? Well, think about type substitution. If I have a pointer-to-C, I can pass it to a function that expects a pointer-to-A or to a function that expects a pointer-to-B. If a function expects a pointer-to-A and I want to pass it the value of my variable c (of type pointer-to-C), I’m already set. Calls to A::v() can be made through the (first) vtable, and the called function can access the member a through the pointer I pass in the same way as it can through any pointer-to-A.

However, if I pass the value of my pointer variable c to a function that expects a pointer-to-B, we also need a subobject of type B in our C to refer it to. This is why we have the second vtable pointer. We can pass the pointer value (c + 8 bytes) to the function that expects a pointer-to-B, and it’s all set: it can make calls to B::w() through the (second) vtable pointer, and access the member b through the pointer we pass in the same way as it can through any pointer-to-B.
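You can see this pointer correction happen by converting a pointer-to-C to a pointer-to-B and comparing the raw addresses. The sketch below uses hypothetical helper names and gives the virtual methods empty bodies so it links. The exact offset is an ABI detail: 8 bytes matches the 32-bit layout drawn above, while a 64-bit target will typically give a larger value.

```cpp
#include <cassert>
#include <cstddef>

class A {
public:
  int a;
  virtual void v() {}
};

class B {
public:
  int b;
  virtual void w() {}
};

class C : public A, public B {
public:
  int c;
};

// Hypothetical helper: how many bytes into a C object the B subobject starts.
std::ptrdiff_t b_subobject_offset() {
  C obj;
  B* pb = &obj;  // the compiler adjusts the address here

  // Guaranteed: converting back recovers the original pointer-to-C.
  assert(static_cast<C*>(pb) == &obj);

  return reinterpret_cast<char*>(pb) - reinterpret_cast<char*>(&obj);
}

// Hypothetical helper: dynamic_cast<void*> yields the start of the complete
// object, which is the job top_offset does in the embedded vtable.
bool top_of_object_is_recoverable() {
  C obj;
  B* pb = &obj;
  return dynamic_cast<void*>(pb) == static_cast<void*>(&obj);
}
```

On GCC, `b_subobject_offset()` should come back positive, and `top_of_object_is_recoverable()` returns true on any conforming compiler.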

Note that this “pointer-correction” needs to occur for called methods too. Class C inherits B::w() in this case. When w() is called through a pointer-to-C, the pointer (which becomes the this pointer inside of w()) needs to be adjusted. This is often called this pointer adjustment.

In some cases, the compiler will generate a thunk to fix up the address. Consider the same code as above but this time C overrides B’s member function w():

class A {
public:
  int a;
  virtual void v();
};

class B {
public:
  int b;
  virtual void w();
};

class C : public A, public B {
public:
  int c;
  void w();
};

C’s object layout and vtable now look like this:

                           +-----------------------+
                           |     0 (top_offset)    |
                           +-----------------------+
c --> +----------+         | ptr to typeinfo for C |
      |  vtable  |-------> +-----------------------+
      +----------+         |         A::v()        |
      |     a    |         +-----------------------+
      +----------+         |         C::w()        |
      |  vtable  |---+     +-----------------------+
      +----------+   |     |    -8 (top_offset)    |
      |     b    |   |     +-----------------------+
      +----------+   |     | ptr to typeinfo for C |
      |     c    |   +---> +-----------------------+
      +----------+         |    thunk to C::w()    |
                           +-----------------------+

Now, when w() is called on an instance of C through a pointer-to-B, the thunk is called. What does the thunk do? Let’s disassemble it (here, with gdb):

0x0804860c <_ZThn8_N1C1wEv+0>:  addl   $0xfffffff8,0x4(%esp)
0x08048611 <_ZThn8_N1C1wEv+5>:  jmp    0x804853c <_ZN1C1wEv>

So it merely adjusts the this pointer and jumps to C::w(). All is well.

But doesn’t the above mean that B’s vtable always points to this C::w() thunk? I mean, if we have a pointer-to-B that is legitimately a B (not a C), we don’t want to invoke the thunk, right?

Right. The above embedded vtable for B in C is special to the B-in-C case. B’s regular vtable is normal and points to B::w() directly.
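The effect of the thunk is observable without a disassembler: whatever pointer-to-B we call through, C::w() must see a this pointer that addresses the whole C. Here is a minimal sketch of that check. The member `seen_this` and the helper names are additions for illustration only and do not appear in the notes, and whether the compiler really emits a thunk (rather than, say, devirtualizing the call) is up to the implementation.

```cpp
class A {
public:
  int a;
  virtual void v() {}
};

class B {
public:
  int b;
  bool called = false;  // illustration only: records that B::w() ran
  virtual void w() { called = true; }
};

class C : public A, public B {
public:
  int c;
  const C* seen_this = nullptr;  // illustration only: records this in w()
  void w() override { seen_this = this; }
};

// Hypothetical helper: call w() through a pointer to the B subobject and
// check that C::w() received a pointer to the start of the C object.
bool this_is_adjusted_back() {
  C obj;
  B* pb = &obj;  // points into the middle of obj
  pb->w();       // dispatches to C::w(), adjusting this on the way
  return obj.seen_this == &obj && !obj.called;
}

// Hypothetical helper: a B that is legitimately a B still runs B::w().
bool plain_b_calls_b_w() {
  B plain;
  B* pb = &plain;
  pb->w();
  return plain.called;
}
```

Both helpers return true on a conforming compiler; the thunk is just how GCC makes the first one come out right.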

The Diamond: Multiple Copies of Base Classes (non-virtual inheritance)

Okay. Now to tackle the really hard stuff. Recall the usual problem of multiple copies of base classes when forming an inheritance diamond:

class A {
public:
  int a;
  virtual void v();
};

class B : public A {
public:
  int b;
  virtual void w();
};

class C : public A {
public:
  int c;
  virtual void x();
};

class D : public B, public C {
public:
  int d;
  virtual void y();
};

Note that D inherits from both B and C, and B and C both inherit from A. This means that D has two copies of A in it. The object layout and vtable embedding is what we would expect from the previous sections:

                           +-----------------------+
                           |     0 (top_offset)    |
                           +-----------------------+
d --> +----------+         | ptr to typeinfo for D |
      |  vtable  |-------> +-----------------------+
      +----------+         |         A::v()        |
      |     a    |         +-----------------------+
      +----------+         |         B::w()        |
      |     b    |         +-----------------------+
      +----------+         |         D::y()        |
      |  vtable  |---+     +-----------------------+
      +----------+   |     |   -12 (top_offset)    |
      |     a    |   |     +-----------------------+
      +----------+   |     | ptr to typeinfo for D |
      |     c    |   +---> +-----------------------+
      +----------+         |         A::v()        |
      |     d    |         +-----------------------+
      +----------+         |         C::x()        |
                           +-----------------------+

Of course, we expect A’s data (the member a) to exist twice in D’s object layout (and it is), and we expect A’s virtual member functions to be represented twice in the vtable (and A::v() is indeed there). Okay, nothing new here.
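To convince yourself that there really are two A subobjects, reach A through B and through C and compare what you get. This is a sketch with a hypothetical helper name, and the virtual methods get empty bodies so that it links.

```cpp
class A {
public:
  int a;
  virtual void v() {}
};

class B : public A {
public:
  int b;
  virtual void w() {}
};

class C : public A {
public:
  int c;
  virtual void x() {}
};

class D : public B, public C {
public:
  int d;
  virtual void y() {}
};

// Hypothetical helper: true if the A reached via B and the A reached via C
// are distinct subobjects with independent copies of the member a.
bool has_two_a_subobjects() {
  D obj;
  // Converting D* straight to A* would be ambiguous, so pick a path first.
  A* via_b = static_cast<B*>(&obj);
  A* via_c = static_cast<C*>(&obj);
  via_b->a = 1;
  via_c->a = 2;  // does not overwrite the first copy
  return via_b != via_c && via_b->a == 1 && via_c->a == 2;
}
```

`has_two_a_subobjects()` returns true, matching the two a slots in the layout above.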

The Diamond: Single Copies of Virtual Bases

But what if we apply virtual inheritance? C++ virtual inheritance allows us to specify a diamond hierarchy but be guaranteed only one copy of virtually inherited bases. So let’s write our code this way:

class A {
public:
  int a;
  virtual void v();
};

class B : public virtual A {
public:
  int b;
  virtual void w();
};

class C : public virtual A {
public:
  int c;
  virtual void x();
};

class D : public B, public C {
public:
  int d;
  virtual void y();
};

All of a sudden things get a lot more complicated. If we can only have one copy of A in our representation of D, then we can no longer get away with our “trick” of embedding a C in a D (and embedding a vtable for the C part of D in D’s vtable). But how can we handle the usual type substitution if we can’t do this?

Let’s try to diagram the layout:

                                   +-----------------------+
                                   |   20 (vbase_offset)   |
                                   +-----------------------+
                                   |     0 (top_offset)    |
                                   +-----------------------+
                                   | ptr to typeinfo for D |
                      +----------> +-----------------------+
d --> +----------+    |            |         B::w()        |
      |  vtable  |----+            +-----------------------+
      +----------+                 |         D::y()        |
      |     b    |                 +-----------------------+
      +----------+                 |   12 (vbase_offset)   |
      |  vtable  |---------+       +-----------------------+
      +----------+         |       |    -8 (top_offset)    |
      |     c    |         |       +-----------------------+
      +----------+         |       | ptr to typeinfo for D |
      |     d    |         +-----> +-----------------------+
      +----------+                 |         C::x()        |
      |  vtable  |----+            +-----------------------+
      +----------+    |            |    0 (vbase_offset)   |
      |     a    |    |            +-----------------------+
      +----------+    |            |   -20 (top_offset)    |
                      |            +-----------------------+
                      |            | ptr to typeinfo for D |
                      +----------> +-----------------------+
                                   |         A::v()        |
                                   +-----------------------+

Okay. So you see that A is now embedded in D in essentially the same way that other bases are. But it’s embedded in D rather than in its directly-derived classes.
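Both points can be checked from code: there is now a single A however you reach it, and where that A sits relative to the B part depends on the complete object, which is why a vbase_offset has to be looked up at run time. The sketch below uses hypothetical helper names; the first check is guaranteed by the language, while the second reflects how GCC-style ABIs place virtual bases and could differ on other implementations.

```cpp
#include <cstddef>

class A {
public:
  int a;
  virtual void v() {}
};

class B : public virtual A {
public:
  int b;
  virtual void w() {}
};

class C : public virtual A {
public:
  int c;
  virtual void x() {}
};

class D : public B, public C {
public:
  int d;
  virtual void y() {}
};

// Hypothetical helper: both paths lead to the same, single A subobject.
bool has_one_a_subobject() {
  D obj;
  A* via_b = static_cast<B*>(&obj);
  A* via_c = static_cast<C*>(&obj);
  obj.a = 42;  // no longer ambiguous
  return via_b == via_c && via_b->a == 42;
}

// Distance in bytes from a B (sub)object to its virtual base A.
std::ptrdiff_t a_offset_from(B* pb) {
  A* pa = pb;  // the compiler consults the vtable to find A
  return reinterpret_cast<char*>(pa) - reinterpret_cast<char*>(pb);
}

// Hypothetical helper: the B-to-A distance in a standalone B differs from
// the one inside a D (ABI-dependent observation).
bool a_offset_depends_on_complete_object() {
  B alone;
  D whole;
  return a_offset_from(&alone) != a_offset_from(&whole);
}
```

On GCC both helpers should return true. With the 32-bit layout drawn above, the distance inside a D is the 20 shown as the first vbase_offset.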
