As part of the upcoming Pro .NET Performance book, there’s quite a bit of research we need to do on various facets of CLR internals. During my research for the Type Internals chapter I discovered a change in CLR object layout and virtual method dispatch as of CLR 4.0. It is possibly not the most exciting of changes, but it invalidates most of the existing material on this subject, such as the popular JIT and Run article or the Advanced .NET Debugging book.
First, a quick overview of how reference type instances are laid out on the heap. Suppose that we have an Employee class with two instance fields, _name and _id, as well as a virtual method called Work. On a 32-bit managed heap, an Employee instance occupies 16 bytes, and has the following layout (each cell = 4 bytes):
Object Header Word (Sync Block Index)
Method Table Pointer
The _name field
The _id field
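To make the 16-byte figure concrete, here is a minimal sketch that models the four cells with Python's ctypes module. The structure and field names are hypothetical, and I am assuming that _id is a 32-bit integer and that references are 32-bit pointers; this only mimics the layout, it does not read a real managed heap.

```python
import ctypes


class EmployeeInstance(ctypes.Structure):
    """Hypothetical model of a 32-bit Employee instance (one field per 4-byte cell)."""

    _fields_ = [
        ("header_word", ctypes.c_uint32),   # object header word (sync block index)
        ("method_table", ctypes.c_uint32),  # 32-bit pointer to Employee's method table
        ("name", ctypes.c_uint32),          # _name: 32-bit reference to a string object
        ("id", ctypes.c_uint32),            # _id: assumed to be a 32-bit integer
    ]


# Total size of the modeled instance, in bytes.
total_size = ctypes.sizeof(EmployeeInstance)

# Byte offset of each cell from the start of the modeled block.
offsets = {name: getattr(EmployeeInstance, name).offset
           for name, _ in EmployeeInstance._fields_}

print(total_size)  # 16
print(offsets)
```

Note that the dispatch sequences in this post load the method table pointer straight from the object reference, so the reference points at the second cell rather than at the header word.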
The method table pointer points to Employee’s method table, which contains, among other things, code addresses for Employee’s methods. On CLR 2.0, the method table has roughly the following layout:
Flags, Size, EEClass, Module Ptr, etc.
Interface Map Pointer
+0x28  Object.ToString
+0x2c  Object.Equals
+0x30  Object.GetHashCode
+0x34  Object.Finalize
+0x38  Employee.Work
Employee..ctor
Interface MTs implemented by Employee
The code addresses for Employee’s virtual methods (including those inherited and possibly overridden from any base classes) allow virtual method dispatch to proceed as follows (assuming that the object reference is in ECX):
    mov eax, dword ptr [ecx]      ; load the method table pointer from the object
    call dword ptr [eax+38]       ; call through the Work slot (offset 0x38, hex)
The key here is that the order of methods in the method table and the offset of overridden methods from the beginning of the method table remain constant in all derived classes. For example, suppose that Manager derives from Employee and overrides the Work method. The method table for Manager would have the following layout:
Flags, Size, EEClass, Module Ptr, etc.
Interface Map Pointer
+0x28  Object.ToString
+0x2c  Object.Equals
+0x30  Object.GetHashCode
+0x34  Object.Finalize
+0x38  Manager.Work
Manager..ctor
Interfaces implemented by Manager
If you carefully inspect the invocation sequence for Employee.Work, it turns out that the same instructions can be used to invoke Manager.Work. Indeed, the caller’s instruction sequence should not depend on whether Manager overrides the Work method, or on whether the referenced instance is of type Employee or of type Manager. This is the key to polymorphism.
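The CLR 2.0 scheme described above can be sketched in a few lines of Python. This is a toy model, not the runtime's implementation: a method table is a plain list of 4-byte cells, only the Work slot is populated, the 0x40-byte table size is an arbitrary assumption, and all names (ManagedObject, call_work and so on) are hypothetical.

```python
SLOT_SIZE = 4        # every cell in the modeled table is 4 bytes wide
WORK_OFFSET = 0x38   # byte offset of the Work slot, taken from the tables above


def make_method_table(size_bytes, slots):
    """Build a list of cells and place each method at its byte offset."""
    table = [None] * (size_bytes // SLOT_SIZE)
    for offset, method in slots.items():
        table[offset // SLOT_SIZE] = method
    return table


def employee_work(this):
    return "Employee.Work"


def manager_work(this):
    return "Manager.Work"


# Both tables keep Work at the same offset; only the code address differs.
EMPLOYEE_MT = make_method_table(0x40, {WORK_OFFSET: employee_work})
MANAGER_MT = make_method_table(0x40, {WORK_OFFSET: manager_work})


class ManagedObject:
    """Stand-in for a heap object: all we model is its method table pointer."""

    def __init__(self, method_table):
        self.method_table = method_table


def call_work(obj):
    mt = obj.method_table                        # mov eax, dword ptr [ecx]
    return mt[WORK_OFFSET // SLOT_SIZE](obj)     # call dword ptr [eax+38]


print(call_work(ManagedObject(EMPLOYEE_MT)))
print(call_work(ManagedObject(MANAGER_MT)))
```

The call site in call_work never checks the object's type. It reaches a different Work implementation only because the slot at the fixed offset holds a different code address.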
However, as of CLR 4.0, the method table layout has changed. Several fields have been moved around or removed completely. Specifically, the offset at which virtual methods are laid out is no longer constant, because the list of interfaces implemented by the type precedes them (and can vary in derived classes). For example, this is a possible layout for Employee’s method table on CLR 4.0, assuming that it implements three interfaces:
Flags, Size, EEClass, Module Ptr, etc.
+0x24  Pointer to Interface List
+0x28  Pointer to Methods
More Miscellanea
Interfaces implemented by Employee
+0x40  Object.ToString
+0x44  Object.Equals
+0x48  Object.GetHashCode
+0x4c  Object.Finalize
+0x50  Employee.Work
Fortunately, the “Pointer to Methods” field is at a constant offset from the beginning of the method table, and always points to the first entry in the code address list, which is the code address for Object.ToString. The offset of any virtual method from that address is constant in all derived classes; Work, for instance, is the fifth entry, 0x10 bytes past Object.ToString. In other words, the JIT can use a slightly longer method invocation sequence to call virtual methods in CLR 4.0:
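    mov eax, dword ptr [ecx]      ; load the method table pointer from the object
    mov eax, dword ptr [eax+28]   ; load the "Pointer to Methods" field (offset 0x28, hex)
    call dword ptr [eax+10]       ; call through the Work slot, 0x10 past the first method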
In Manager’s method table, the Work method may have a different offset from the beginning of the method table because Manager is free to implement additional interfaces, but the “Pointer to Methods” field is at the same offset and accommodates this difference:
Flags, Size, EEClass, Module Ptr, etc.
+0x24  Pointer to Interface List
+0x28  Pointer to Methods
More Miscellanea
Interfaces implemented by Manager
+0x44  Object.ToString
+0x48  Object.Equals
+0x4c  Object.GetHashCode
+0x50  Object.Finalize
+0x54  Manager.Work
This allows the same form of dispatch to work for Manager objects as well.
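The extra indirection can be sketched in Python as well. Again, this is a toy model under stated assumptions: a method table is a list of 4-byte cells, the “Pointer to Methods” cell stores a byte offset within the same list (a real runtime would store an absolute address), the Object methods are placeholders, and every name in the code is hypothetical.

```python
SLOT_SIZE = 4
METHODS_PTR_OFFSET = 0x28   # "Pointer to Methods": same offset in every table
WORK_REL_OFFSET = 0x10      # Work's offset from the first method slot


def placeholder(this):
    """Stands in for Object.ToString, Equals, GetHashCode and Finalize."""
    return None


def employee_work(this):
    return "Employee.Work"


def manager_work(this):
    return "Manager.Work"


def build_method_table(methods_start, work):
    """Lay out five method slots starting at byte offset methods_start."""
    methods = [placeholder, placeholder, placeholder, placeholder, work]
    cells = [None] * ((methods_start + len(methods) * SLOT_SIZE) // SLOT_SIZE)
    # Simplification: store an offset into this list instead of an address.
    cells[METHODS_PTR_OFFSET // SLOT_SIZE] = methods_start
    for i, method in enumerate(methods):
        cells[methods_start // SLOT_SIZE + i] = method
    return cells


# Method slots start at different offsets, as in the two tables above.
EMPLOYEE_MT = build_method_table(0x40, employee_work)
MANAGER_MT = build_method_table(0x44, manager_work)


class ManagedObject:
    """Stand-in for a heap object: all we model is its method table pointer."""

    def __init__(self, method_table):
        self.method_table = method_table


def call_work(obj):
    mt = obj.method_table                                      # mov eax, dword ptr [ecx]
    methods = mt[METHODS_PTR_OFFSET // SLOT_SIZE]              # mov eax, dword ptr [eax+28]
    return mt[(methods + WORK_REL_OFFSET) // SLOT_SIZE](obj)   # call dword ptr [eax+10]


print(call_work(ManagedObject(EMPLOYEE_MT)))
print(call_work(ManagedObject(MANAGER_MT)))
```

In this model Work ends up at offset 0x50 in one table and 0x54 in the other, yet call_work uses only the two constants 0x28 and 0x10, which is the property the JIT relies on.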
Needless to say, relying on the details shown in this blog post would be as futile as relying on any prior material on this subject. Internal details such as object layout and code generated by the JIT are subject to change at any time between CLR releases.