Mutex profile

Go 1.8 introduces a new profile, the contended mutex profile, that allows you to capture a fraction of the stack traces of goroutines with contended mutexes.

You need to set the sampling fraction by calling runtime.SetMutexProfileFraction to a value above zero to enable collection.

Consider the following program:

import _ "net/http/pprof"

var mu sync.Mutex
var items = make(map[int]struct{})

runtime.SetMutexProfileFraction(5)
for i := 0; i < 1000*1000; i++ {
    go func(i int) {
        mu.Lock()
        defer mu.Unlock()

        items[i] = struct{}{}
    }(i)
}

http.ListenAndServe(":8888", nil)

Run the program, http://localhost:8888/debug/pprof/mutex will serve the mutex profile. Then, you can use go tool pprof to examine the profile.

$ go tool pprof <binary> http://localhost:8888/debug/pprof/mutex?debug=1
Fetching profile from http://localhost:8888/debug/pprof/mutex
Saved profile in /Users/jbd/pprof/pprof.mutexprofile.localhost:8888.contentions.delay.002.pb.gz
Entering interactive mode (type "help" for commands)
(pprof) list
Total: 27.15s
ROUTINE ======================== main.main.func1 in /Users/jbd/src/hello/mutexprofile/main.go
         0     27.15s (flat, cum)   100% of Total
         .          .     18:		go func() {
         .          .     19:			mu.Lock()
         .          .     20:			defer mu.Unlock()
         .          .     21:
         .          .     22:			items[i] = struct{}{}
         .     27.15s     23:		}()
         .          .     24:	}
         .          .     25:	http.ListenAndServe(":8888", nil)
         .          .     26:}
ROUTINE ======================== runtime.goexit in /Users/jbd/go/src/runtime/asm_amd64.s
         0     27.15s (flat, cum)   100% of Total
         .          .   2179:	RET
         .          .   2180:
         .          .   2181:// The top-most function running on a goroutine
         .          .   2182:// returns to goexit+PCQuantum.
         .          .   2183:TEXT runtime·goexit(SB),NOSPLIT,$0-0
         .     27.15s   2184:	BYTE	$0x90	// NOP
         .          .   2185:	CALL	runtime·goexit1(SB)	// does not return
         .          .   2186:	// traceback from goexit1 must hit code range of goexit
         .          .   2187:	BYTE	$0x90	// NOP
         .          .   2188:
         .          .   2189:TEXT runtime·prefetcht0(SB),NOSPLIT,$0-8
ROUTINE ======================== sync.(*Mutex).Unlock in /Users/jbd/go/src/sync/mutex.go
    27.15s     27.15s (flat, cum)   100% of Total
         .          .    121:			return
         .          .    122:		}
         .          .    123:		// Grab the right to wake someone.
         .          .    124:		new = (old - 1<<mutexWaiterShift) | mutexWoken
         .          .    125:		if atomic.CompareAndSwapInt32(&m.state, old, new) {
    27.15s     27.15s    126:			runtime_Semrelease(&m.sema)
         .          .    127:			return
         .          .    128:		}
         .          .    129:		old = m.state
         .          .    130:	}
         .          .    131:}

The fraction is automatically set in tests if -mutexprofile is used. Set the flag to write the profile to a file.

go test -mutexprofile=mutex.out

Then, the pprof tool can be used to examine the recorded profile.

go tool pprof <test.binary> mutex.out

See the runtime/pprof package for more details about profiling and the predefined profiles.

The State of Go 2017 talk also contains a guide how to run benchmarks while capturing the mutex profile. It is worth to take a look if you are willing to write benchmarks.

Happy profiling!