Profiler labels in Go

Go 1.9 is introducing profiler labels, a way to add arbitrary key-values to the samples collected by the CPU profiler. CPU profilers collect and output hot spots where the CPU spent most time in when executing. A typical CPU profiler output is primarily reports the location of these spots as function name, source file/line, etc. By looking at the data, you can also examine which parts of the code invoked these spots. You can also filter by invokers to have more granular understanding of certain execution paths.

Even though locality information is useful to spot expensive execution paths, it is not always essentially enough when debugging a performance problem. A significant percentage of Go programmers uses Go to write servers, and it is even more complex to point out performance issues in a server. It is hard to isolate certain execution paths from others, or hard to understand whether it is only a certain path creating trouble (e.g. a user or a specific handler).

With 1.9, Go is introducing a new feature that allows you to record additional information to provide more context about the execution path. You will be able to record any set of labels, as a part of the profiling data. Then, use these labels to examine the profiler output more precisely.

You can benefit from profiler labels in many cases. Some of the obvious ones:

Adding labels

The runtime/pprof package will export several new APIs to let users add labels. Most users will use Do which takes a context, extends it with labels, records these labels when f is executing:

func Do(ctx context.Context, labels LabelSet, f func(context.Context))

Do only set the given label set during the execution of the current goroutine. If you want to start goroutines in f, you can propagate the labels by passing the context argument of the function.

labels := pprof.Labels("worker", "purge")
pprof.Do(ctx, labels, func(ctx context.Context) {
    // Do some work...

    go update(ctx) // propagates labels in ctx.

The work above will be labeled with worker:purge.

Examining the profiler output

This section will demonstrate how to examine the recorded samples by profiler labels. Once you annotate your code with labels, it is time to profile and consume the profiler data with tag filters.

I will use the net/http/pprof package to capture samples in this demo, see the Profiling Go programs article for more options.

package main

import _ "net/http/pprof"

func main() {
    // All the other code...
    log.Fatal(http.ListenAndServe("localhost:5555", nil))

Collect some CPU samples from the server.

$ go tool pprof http://localhost:5555/debug/pprof/profile

Once the interactive mode starts, you can list the recorded labels by the tags command. Note that pprof tools call them tags even though they are named labels in the Go standard library.

(pprof) tags
http-path: Total 80
        70 (87.50%): /messages
        10 (12.50%): /user

worker: Total 158
       158 (  100%): purge

As you can see, there are two label keys (http-path, worker) and several values recorded for each. http-path key is coming from HTTP handlers I annotated, and worker:purge is originated at the code above.

By filtering by labels, we can focus only on the samples collected from the /user handler.

(pprof) tagfocus="http-path:/user"
(pprof) top10 -cum
Showing nodes accounting for 0.10s, 3.05% of 3.28s total
      flat  flat%   sum%        cum   cum%
         0     0%     0%      0.10s  3.05%  main.generateID.func1 /Users/jbd/src/hello/main.go
     0.01s   0.3%   0.3%      0.08s  2.44%  runtime.concatstring2 /Users/jbd/go/src/runtime/string.go
     0.06s  1.83%  2.13%      0.07s  2.13%  runtime.concatstrings /Users/jbd/go/src/runtime/string.go
     0.01s   0.3%  2.44%      0.02s  0.61%  runtime.mallocgc /Users/jbd/go/src/runtime/malloc.go
         0     0%  2.44%      0.02s  0.61%  runtime.slicebytetostring /Users/jbd/go/src/runtime/string.go
         0     0%  2.44%      0.02s  0.61%  strconv.FormatInt /Users/jbd/go/src/strconv/itoa.go
         0     0%  2.44%      0.02s  0.61%  strconv.Itoa /Users/jbd/go/src/strconv/itoa.go
         0     0%  2.44%      0.02s  0.61%  strconv.formatBits /Users/jbd/go/src/strconv/itoa.go
     0.01s   0.3%  2.74%      0.01s   0.3%  runtime.memmove /Users/jbd/go/src/runtime/memmove_amd64.s
     0.01s   0.3%  3.05%      0.01s   0.3%  runtime.nextFreeFast /Users/jbd/go/src/runtime/malloc.go

The listing contains only samples labeled with http-path:/user. So we can easily understand the most expensive execution paths from the user handler.

You can also use tagshow, taghide, and tagignore commands as other filtering options. For example, tagignore allows you to match anything but the given regex. The filter below will match anything but the user handler; worker:purge and http-path:/messages in this case.

(pprof) tagfocus=
(pprof) tagignore="http-path:/user"
(pprof) tags
http-path: Total 70
        70 (  100%): /messages

worker: Total 158
       158 (  100%): purge

If you visualize the filtered samples, the output will show how much each label is contributing to the final cost.

Pprof labels visualized

You can see that worker:purge used 0.07s, and messages handler used 0.03s in the generateID function during the collection of the profiling data.

Try it yourself!

Profiler labels allows us to add additional information to the profiler data that is not available at the current execution stack. Try them by downloading the Go 1.9 beta if you need more dimensions in your profiler output. Also, try the pprofutil package to automatically add HTTP path label to your handlers.