
Traversing the GC Heap with ClrMd

The Performance Zone is brought to you in partnership with New Relic. New Relic APM provides constant monitoring of your apps so you don't have to.

ClrMd is a newly released diagnostic library that wraps the CLR’s undocumented data access interfaces (a.k.a. “DAC”) in a friendly managed API. The underlying interfaces are what debugger extensions like SOS and SOSEX use to implement various diagnostic features, including enumerating the managed heap, detecting deadlocks, inspecting object contents, and dumping type/method information.

Given my personal and professional interest in debugging tools and techniques, I find ClrMd an incredible tool: I can now implement my own diagnostic features without relying on undocumented interfaces or parsing text output from debugging extensions (which also requires going through a debugger in another process).

To pique your interest, I whipped up a quick sample illustrating how commands like !DumpHeap, !DumpObj, and !GCRoot can be implemented using ClrMd. This is a sample, so there is clearly room for optimization and the code could be cleaner, but the ability to cram so much functionality into 200 lines of C# code is nothing short of remarkable.

Without further ado, here’s some output (slightly formatted for clarity):

$ GcRoot.exe d:\temp\leak.dmp
> dumpobjects Schedule
31cae98 MemoryLeak.Schedule
31cd5f0 MemoryLeak.Schedule
31cfd48 MemoryLeak.Schedule
31d24a0 MemoryLeak.Schedule
31d4bf8 MemoryLeak.Schedule
31d7350 MemoryLeak.Schedule
31d9aa8 MemoryLeak.Schedule
31dc200 MemoryLeak.Schedule
31de958 MemoryLeak.Schedule
31e10b0 MemoryLeak.Schedule
31e3808 MemoryLeak.Schedule
31e5f60 MemoryLeak.Schedule
31e86b8 MemoryLeak.Schedule
> dumpobject 31d9aa8
System.Byte[] _data = 31d9ac0
> gcroot 31d9aa8
READY FOR FINALIZATION finalization handle(0)
  --> MemoryLeak.Employee(31d9a90)
  --> MemoryLeak.Schedule(31d9aa8)
> q

And now for some code snippets. First, we must initialize the main ClrMd objects by loading the dump (or attaching to a live process), enumerating the CLR versions in that dump (or process), and making sure the DAC DLL is accessible:

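// Assumes this code runs inside Main, so an early return ends the program.
DataTarget target = DataTarget.LoadCrashDump(args[0]);
string dacLocation = target.ClrVersions[0].TryGetDacLocation();
if (string.IsNullOrEmpty(dacLocation))
{
    Console.WriteLine("*** Cannot find DAC location");
    return; // without the DAC, the runtime cannot be created
}
ClrRuntime runtime = target.CreateRuntime(dacLocation);
ClrHeap heap = runtime.GetHeap();
if (!heap.CanWalkHeap)
{
    Console.WriteLine("*** Cannot walk the heap");
    return; // the heap is not in a walkable state
}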

Next, enumerating all objects in the heap whose type name contains a given string:

foreach (ulong objPtr in heap.EnumerateObjects())
{
    ClrType type = heap.GetObjectType(objPtr);
    // Skip objects whose type could not be resolved
    if (type != null && type.Name.Contains(typeName))
        Console.WriteLine("{0:x}\t{1}", objPtr, type.Name);
}

Displaying object fields (only instance fields, not statics):

ClrType type = heap.GetObjectType(objPtr);
foreach (ClrInstanceField field in type.Fields)
{
    string fieldType = field.Type == null ? "<TYPE>" : field.Type.Name;
    if (field.IsPrimitive() && field.HasSimpleValue)
        Console.WriteLine("{2} {0} = {1}", field.Name, field.GetFieldValue(objPtr), fieldType);
    else if (field.IsObjectReference() && field.HasSimpleValue)
        Console.WriteLine("{2} {0} = {1:x}", field.Name, field.GetFieldValue(objPtr), fieldType);
}

Traversing roots is a little bit more complicated. It all starts with ClrHeap.EnumerateRoots, but the ClrType.EnumerateRefsOfObject method is key. It allows you to recursively traverse the heap until you find the object in question. (Along the way, you must keep track of objects that have already been visited so you don’t get stuck in an infinite loop.)

The gist is the following recursion, with some of the code removed for clarity:

private static void DisplayRefChainIfReachedObject(ulong objPtr, ClrRoot root, Stack<ulong> refChain, HashSet<ulong> visited)
{
    ulong currentObj = refChain.Peek();
    if (!visited.Add(currentObj)) return; // already visited, so stop to avoid cycles
    if (currentObj == objPtr) { /* Display the root chain (omitted for clarity) */ return; }
    ClrType type = heap.GetObjectType(currentObj);
    type.EnumerateRefsOfObject(currentObj, (innerObj, fieldOffset) =>
    {
        refChain.Push(innerObj); // extend the chain with the referenced object
        DisplayRefChainIfReachedObject(objPtr, root, refChain, visited);
        refChain.Pop();          // backtrack before trying the next reference
    });
}
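
The recursion above needs a dump and the ClrMd library to run, so here is a minimal, self-contained sketch of the same idea on a plain in-memory graph. Everything in it is an assumption made for illustration: the RootChainSketch class, the FindChain method, and the dictionary that stands in for the heap (object address mapped to the addresses it references) are hypothetical and not part of ClrMd or the original sample. It shows only the visited set and the push/pop bookkeeping that keep the traversal from looping forever.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RootChainSketch
{
    // Hypothetical stand-in for the heap: object address -> addresses it references.
    // Returns the chain from root to target (root first), or null if unreachable.
    public static List<ulong> FindChain(
        IReadOnlyDictionary<ulong, ulong[]> references, ulong root, ulong target)
    {
        var chain = new Stack<ulong>();
        var visited = new HashSet<ulong>();
        chain.Push(root);
        if (!Search(references, target, chain, visited))
            return null;
        // A Stack enumerates top-first, so reverse it to get root-first order.
        return Enumerable.Reverse(chain).ToList();
    }

    private static bool Search(
        IReadOnlyDictionary<ulong, ulong[]> references, ulong target,
        Stack<ulong> chain, HashSet<ulong> visited)
    {
        ulong current = chain.Peek();
        if (!visited.Add(current)) return false; // already explored: avoids cycles
        if (current == target) return true;      // chain now holds root..target

        if (references.TryGetValue(current, out ulong[] children))
        {
            foreach (ulong child in children)
            {
                chain.Push(child);
                if (Search(references, target, chain, visited)) return true;
                chain.Pop(); // dead end: backtrack
            }
        }
        return false;
    }
}
```

To try it, build a Dictionary<ulong, ulong[]> with a few made-up addresses (including a cycle), call RootChainSketch.FindChain(graph, rootAddress, targetAddress), and print each element of the result with the "--> {0:x}" style used in the gcroot output earlier. Unlike the ClrMd version, this sketch stops at the first chain it finds rather than reporting a chain per root.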

To summarize: ClrMd is a new managed diagnostic library that wraps functionality previously available only through debugger extensions. It opens a wide range of possibilities for automatic diagnostics of managed processes and dump files.



Published at DZone with permission of Sasha Goldshtein, DZone MVB.

Opinions expressed by DZone contributors are their own.
