
Golang: Is It a Memory Leak?

Figuring out where memory went in Go is tricky. This article explores lessons learned and how to pinpoint the exact allocations in a third-party library.

By Alexander Komyagin · Nov. 26, 24 · Tutorial

This is an illustrated walkthrough of how to find the cause of high memory usage in Go programs with standard tools, based on our recent experience. There's plenty of information online, but it still took us considerable time and effort to figure out what works and what doesn't. Hopefully, this post will save everyone else some time.


Context

At Adiom, we're building an open-source tool for online database migration and real-time replication called dsync. Our primary focus is on ease of use, robustness, and speed. One area we're focusing on heavily is data integrity, as enterprise customers want to ensure data consistency between the source and the destination and to know the differences, if there are any.

To that end, we recently came across Merkle Search Trees (MSTs), which have some interesting properties. They may be worth a whole separate post, but for those interested now, here's the link to the original paper. In theory, MSTs should allow us to efficiently and cheaply diff the source and destination datasets. Naturally, we wanted to experiment with that, so we added a dsync verify <source> <destination> command as a POC, leveraging an open-source reference implementation from GitHub: jrhy/mast. It worked like a charm on "toy" datasets of 100 to 1,000 records. But in one of our larger tests, to our surprise, the binary consumed 3.5 GB of RAM for a million records on both sides. While not a whole lot in absolute numbers, this was orders of magnitude higher than the tens or hundreds of megabytes we were expecting, because we only stored the record ID (8 bytes) and a hash (8 bytes) for each record.

Our first thought was a memory "leak." Like Java, Go manages memory for you and has a garbage collector that frees objects once nothing references them. There's no concept of leaks as such, only the unexpected or undesirable accumulation of objects that the code still holds references to. Unlike Java, though, Go doesn't run as bytecode on top of a JVM, so there's no built-in way to ask the runtime which specific objects are accumulating in memory.

Part 1: Run the Garbage Collector

In Go, you can forcibly trigger garbage collection in the hope that the memory is held by objects that are no longer needed:

Go
 
import (
	"log"
	"runtime"
)

func main() {
	// App code

	log.Println("Triggering garbage collection...")
	runtime.GC()
	printMemUsage()

	// More app code
}

func printMemUsage() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	log.Printf("Alloc = %v MiB", bToMb(m.Alloc))
	log.Printf("TotalAlloc = %v MiB", bToMb(m.TotalAlloc))
	log.Printf("Sys = %v MiB", bToMb(m.Sys))
	log.Printf("NumGC = %v\n", m.NumGC)
}

// bToMb converts bytes to mebibytes.
func bToMb(b uint64) uint64 {
	return b / 1024 / 1024
}


Unfortunately, that only freed up a few hundred MBs for us, so it wasn't very effective.

Part 2: Examine the Memory

In theory, we should be able to take a core dump of the process (dump its raw memory) and then examine it directly. Here are some relevant resources for that:

  • Obtaining the core dump from a running process: Go Wiki: CoreDumpDebugging
  • Specific instructions for Mac OS where gcore doesn't work right away: "How to Dump a Core File on MacOS"
  • The viewcore tool to parse the coredump: GitHub: golang/debug

Unfortunately, viewcore didn't work with our core dump, presumably because we're developing on a Mac and the dump is in Darwin's Mach-O format rather than the ELF format viewcore expects:

Shell
 
user@MBP viewcore % ./viewcore /cores/dsync-52417-20241114T234221Z html
failed to parse core: bad magic number '[207 250 237 254]' in record at byte 0x0


If you're using Linux, you should give it a shot.

Part 3: Profiling and Debugging

We had to resort to investigative debugging instead. The most common sources of memory leaks in Go are:

  • Accumulating and not closing resources such as network connections, files, etc.
  • Global variables
  • Slices with large backing arrays
  • Lingering goroutines that never finish (a minimal sketch of this pattern follows the list)
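
As an illustration of that last pattern, here's a minimal sketch (not from the dsync codebase) of a goroutine that blocks forever sending to a channel nobody reads from, keeping everything it references alive:

Go
 
// startWorker leaks if the caller stops receiving from the returned channel:
// the goroutine blocks on the send forever, so the goroutine, the channel,
// and the jobs slice all remain reachable until the program exits.
func startWorker(jobs []string) <-chan string {
	results := make(chan string) // unbuffered
	go func() {
		for _, j := range jobs {
			results <- j // blocks once nobody is reading
		}
		close(results)
	}()
	return results
}
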

After a brief examination of the codebase, we didn't see any smoking guns. Our next step was to use pprof. Pprof lets us sample a running process and obtain a heap profile that can be examined later.

Pprof Setup

Pprof runs as a web server inside your application. Adding it is easy:

Go
 
import (
    "log"
    "net/http"
    _ "net/http/pprof" // Import pprof package for profiling; registers handlers on the default mux
)

func main() {
    go func() {
        log.Println(http.ListenAndServe("localhost:8081", nil)) // Start pprof server
    }()
    // Your application logic here
}


In our case, it's accessible on port 8081. 

When the application is running and shows excessive memory consumption, we can collect heap dumps:

Shell
 
curl -s http://localhost:8081/debug/pprof/heap > /tmp/heap1.out


I recommend collecting a few dumps so you have several samples to compare. Note that pprof samples allocations rather than recording every one, so it's not a 100% representation. By default, it takes roughly one sample per 512 KB allocated.
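
If the default granularity isn't fine enough, the sampling rate can be tuned through runtime.MemProfileRate; here's a minimal sketch (we didn't need this for our investigation), set as early as possible in the program:

Go
 
import "runtime"

func init() {
	// Record every allocated block instead of roughly one sample per 512 KB.
	// Finer sampling gives more accurate profiles but costs memory and CPU.
	runtime.MemProfileRate = 1
}
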

From here on out, it's really smooth sailing.

Memory Profiling

We examined inuse_space and inuse_objects with pprof, which track the bytes of memory in use and the number of live objects, respectively:

Shell
 
go tool pprof --inuse_objects /tmp/heap1.out
go tool pprof --inuse_space /tmp/heap1.out


The in-use space was particularly fruitful. First, we tried to visualize the memory allocation using the web command, which opens a new web browser window.

Shell
 
user@MBP viewcore % go tool pprof --inuse_space /tmp/heap1.out
File: dsync
Type: inuse_space
Time: Nov 14, 2024 at 5:34pm (PST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) web
(pprof)


[Image: visualization of the memory allocation graph produced by the web command]

The thicker the edge on the graph, the more memory went to that node (and that node is drawn bigger, too). An alternative way to do this is to use the top5 command:

Shell
 
user@MBP viewcore % go tool pprof --inuse_space /tmp/heap1.out
File: dsync
Type: inuse_space
Time: Nov 14, 2024 at 5:34pm (PST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top5
Showing nodes accounting for 1670.40MB, 99.37% of 1680.96MB total
Dropped 43 nodes (cum <= 8.40MB)
Showing top 5 nodes out of 7
      flat  flat%   sum%        cum   cum%
 1471.89MB 87.56% 87.56%  1471.89MB 87.56%  github.com/jrhy/mast.split
      78MB  4.64% 92.20%   851.79MB 50.67%  github.com/adiom-data/dsync/internal/app.(*mastVerify).Run.func1.processMast.1
      77MB  4.58% 96.78%   826.13MB 49.15%  github.com/adiom-data/dsync/internal/app.(*mastVerify).Run.func2.processMast.1
   43.50MB  2.59% 99.37%    43.50MB  2.59%  bytes.Clone
         0     0% 99.37%  1677.92MB 99.82%  github.com/adiom-data/dsync/internal/app.(*source).ProcessSource.func2
(pprof)


This told us that the excessive memory was allocated in the mast.split function and, together with the inuse_objects view, that it consisted of a large number of roughly 4.75 KB objects.

While we can't see what those objects are, we can see where in the code they were allocated using the list command:

Shell
 
user@MBP viewcore % go tool pprof --inuse_space /tmp/heap1.out
File: dsync
Type: inuse_space
Time: Nov 14, 2024 at 5:34pm (PST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top5
Showing nodes accounting for 1670.40MB, 99.37% of 1680.96MB total
Dropped 43 nodes (cum <= 8.40MB)
Showing top 5 nodes out of 7
      flat  flat%   sum%        cum   cum%
 1471.89MB 87.56% 87.56%  1471.89MB 87.56%  github.com/jrhy/mast.split
      78MB  4.64% 92.20%   851.79MB 50.67%  github.com/adiom-data/dsync/internal/app.(*mastVerify).Run.func1.processMast.1
      77MB  4.58% 96.78%   826.13MB 49.15%  github.com/adiom-data/dsync/internal/app.(*mastVerify).Run.func2.processMast.1
   43.50MB  2.59% 99.37%    43.50MB  2.59%  bytes.Clone
         0     0% 99.37%  1677.92MB 99.82%  github.com/adiom-data/dsync/internal/app.(*source).ProcessSource.func2
(pprof) list github.com/jrhy/mast.split
Total: 1.64GB
ROUTINE ======================== github.com/jrhy/mast.split in /Users/alexander/go/pkg/mod/github.com/jrhy/mast@v1.2.32/lib.go
    1.44GB     1.54GB (flat, cum) 93.57% of Total
         .          .     82:func split(ctx context.Context, node *mastNode, key interface{}, mast *Mast) (leftLink, rightLink interface{}, err error) {
         .          .     83:	var splitIndex int
         .          .     84:	for splitIndex = 0; splitIndex < len(node.Key); splitIndex++ {
         .          .     85:		var cmp int
         .          .     86:		cmp, err = mast.keyOrder(node.Key[splitIndex], key)
         .          .     87:		if err != nil {
         .          .     88:			return nil, nil, fmt.Errorf("keyCompare: %w", err)
         .          .     89:		}
         .          .     90:		if cmp == 0 {
         .          .     91:			panic("split shouldn't need to handle preservation of already-present key")
         .          .     92:		}
         .          .     93:		if cmp > 0 {
         .          .     94:			break
         .          .     95:		}
         .          .     96:	}
         .          .     97:	var tooBigLink interface{} = nil
       8MB        8MB     98:	left := mastNode{
         .          .     99:		Node{
  416.30MB   416.30MB    100:			make([]interface{}, 0, cap(node.Key)),
  458.51MB   458.51MB    101:			make([]interface{}, 0, cap(node.Value)),
  427.87MB   427.87MB    102:			make([]interface{}, 0, cap(node.Link)),
         .          .    103:		},
         .          .    104:		true, false, nil, nil,
         .          .    105:	}
         .          .    106:	left.Key = append(left.Key, node.Key[:splitIndex]...)
         .          .    107:	left.Value = append(left.Value, node.Value[:splitIndex]...)
         .          .    108:	left.Link = append(left.Link, node.Link[:splitIndex+1]...)
         .          .    109:
         .          .    110:	// repartition the left and right subtrees based on the new key
         .          .    111:	leftMaxLink := left.Link[len(left.Link)-1]
         .          .    112:	if leftMaxLink != nil {
         .          .    113:		var leftMax *mastNode
         .          .    114:		leftMax, err = mast.load(ctx, leftMaxLink)
         .          .    115:		if mast.debug {
         .          .    116:			fmt.Printf("  splitting leftMax, node with keys: %v\n", leftMax.Key)
         .          .    117:		}
         .          .    118:		if err != nil {
         .          .    119:			return nil, nil, fmt.Errorf("loading leftMax: %w", err)
         .          .    120:		}
         .    91.39MB    121:		leftMaxLink, tooBigLink, err = split(ctx, leftMax, key, mast)
         .          .    122:		if err != nil {
         .          .    123:			return nil, nil, fmt.Errorf("splitting leftMax: %w", err)
         .          .    124:		}
         .          .    125:		if mast.debug {
         .          .    126:			fmt.Printf("  splitting leftMax, node with keys: %v is done: leftMaxLink=%v, tooBigLink=%v\n", leftMax.Key, leftMaxLink, tooBigLink)
         .          .    127:		}
         .          .    128:		left.Link[len(left.Link)-1] = leftMaxLink
         .          .    129:	}
         .          .    130:	if !left.isEmpty() {
         .          .    131:		leftLink, err = mast.store(&left)
         .          .    132:		if err != nil {
         .          .    133:			return nil, nil, fmt.Errorf("store left: %w", err)
         .          .    134:		}
         .          .    135:	}
       1MB        1MB    136:	right := mastNode{
         .          .    137:		Node{
   54.24MB    54.24MB    138:			make([]interface{}, 0, cap(node.Key)),
   56.75MB    56.75MB    139:			make([]interface{}, 0, cap(node.Value)),
   49.22MB    49.22MB    140:			make([]interface{}, 0, cap(node.Link)),
         .          .    141:		},
         .          .    142:		true, false, nil, nil,
         .          .    143:	}
         .          .    144:	right.Key = append(right.Key, node.Key[splitIndex:]...)
         .          .    145:	right.Value = append(right.Value, node.Value[splitIndex:]...)
         .          .    146:	right.Link = append(right.Link, node.Link[splitIndex:]...)
         .          .    147:	right.Link[0] = tooBigLink
         .          .    148:
         .          .    149:	rightMinLink := right.Link[0]
         .          .    150:	if rightMinLink != nil {
         .          .    151:		var rightMin *mastNode
         .          .    152:		rightMin, err = mast.load(ctx, rightMinLink)
         .          .    153:		if err != nil {
         .          .    154:			return nil, nil, fmt.Errorf("load rightMin: %w", err)
         .          .    155:		}
         .          .    156:		var tooSmallLink interface{}
         .     9.54MB    157:		tooSmallLink, rightMinLink, err = split(ctx, rightMin, key, mast)
         .          .    158:		if err != nil {
         .          .    159:			return nil, nil, fmt.Errorf("split rightMin: %w", err)
         .          .    160:		}
         .          .    161:		if mast.debug {
         .          .    162:			fmt.Printf("  splitting rightMin, node with keys %v, is done: tooSmallLink=%v, rightMinLink=%v",
(pprof)


Now we could clearly see what those objects were: slice backing arrays with preallocated capacity. They are of type []interface{}, and each interface{} element is 2 words in memory, or 16 bytes on our 64-bit machine. The preallocated capacity must be large, at least on average (4.75 KB / 16 bytes ≈ 300 elements). The mystery was half-solved. We weren't sure whether that capacity was actually being used, so we turned to delve, the Go debugger.
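
To make the suspicion concrete, here's a small standalone sketch (not from the mast library) of how make([]interface{}, 0, cap) reserves the full backing array even when only a few elements are ever appended:

Go
 
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)

	// 10,000 slices, each with room for ~300 interface{} elements (~4.8 KB),
	// but only a single element ever stored in each.
	nodes := make([][]interface{}, 10000)
	for i := range nodes {
		nodes[i] = make([]interface{}, 0, 300)
		nodes[i] = append(nodes[i], i)
	}

	runtime.GC()
	runtime.ReadMemStats(&after)
	// Expect roughly 10,000 * 300 * 16 bytes ≈ 46 MiB retained.
	fmt.Printf("retained ≈ %d MiB\n", (after.HeapAlloc-before.HeapAlloc)/1024/1024)
	runtime.KeepAlive(nodes)
}
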

Debugging

A simple code inspection showed that those objects are part of tree nodes. To find out what they actually looked like, we used VS Code debug mode to attach to the running process and inspect these objects:

  1. Attach to the running process with the following launch.json configuration:
JSON
 
{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        
        {
            "name": "Attach to Process",
            "type": "go",
            "request": "attach",
            "mode": "local",
            "processId": <PID of the running process>
        }
    ]
}


2. Add a breakpoint in our verification loop where the tree objects are accessed.

This immediately let us see that the slice length (len) was often much smaller than the preallocated capacity (cap):


As a probe, I created a forked implementation with no capacity hints set in the node allocation, and the memory usage immediately dropped to about 600 MB, a roughly 6x improvement.
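
Roughly speaking, the probe amounted to dropping the capacity hints from the node allocation in split (a simplified sketch, not the actual patch to jrhy/mast):

Go
 
// Probe: no capacity hint; let append grow the backing arrays only as needed.
left := mastNode{
	Node{
		make([]interface{}, 0),
		make([]interface{}, 0),
		make([]interface{}, 0),
	},
	true, false, nil, nil,
}
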

Mystery solved! 
