Debugging code generation in Go

Last week, I was at dotGo. I gave a very short lightning talk about inspection of code generation with the tools already available in the toolchain. This post goes through the talk for those who didn’t have the privilege to be at the conference. Slides are also available at go-talks.

Throughout this article, we will use the following program:

package main

import "fmt"

func main() {
    sum := 1 + 1
    fmt.Printf("sum: %v\n", sum)


Go build command encapsulates a bunch of underlying tools such as the compiler and the linker. But, it also provides more details about the build process optionally. -x is a flag that makes go build output what is being invoked. If you want to see what the components of the toolchain are, which sequence they are invoked and which flags being used, use -x.

$ go build -x
mkdir -p $WORK/hello/_obj/
mkdir -p $WORK/hello/_obj/exe/
cd /Users/jbd/src/hello
/Users/jbd/go/pkg/tool/darwin_amd64/compile -o $WORK/hello.a -trimpath $WORK -p main -complete -buildid d934a5702088e0fe5c931a55ff26bec87b80cbdc -D _/Users/jbd/src/hello -I $WORK -pack ./hello.go
cd .
/Users/jbd/go/pkg/tool/darwin_amd64/link -o $WORK/hello/_obj/exe/a.out -L $WORK -extld=clang -buildmode=exe -buildid=d934a5702088e0fe5c931a55ff26bec87b80cbdc $WORK/hello.a
mv $WORK/hello/_obj/exe/a.out hello

Intermediate assembly

In Go, there is an intermediate assembly phase before generating the actual platform-specific assembly. The compiler takes some Go files, generates the intermediate instructions and escalates it to the obj package to generate the machine code. If you are curious about what compiler generates in this phase, -S makes the compiler dump the output.

The intermediate assembly is generally a good reference to understand the cost of a Go line. Or it could be be a great reference if you want to replace, let’s say, a Go function with a more optimized assembly equivalent.

You are seeing the output for the main.main here.

$ go build -gcflags="-S"
# hello
"".main t=1 size=179 args=0x0 locals=0x60
    0x0000 00000 (/Users/jbd/src/hello/hello.go:5)	TEXT	"".main(SB), $96-0
    0x0000 00000 (/Users/jbd/src/hello/hello.go:5)	MOVQ	(TLS), CX
    0x0009 00009 (/Users/jbd/src/hello/hello.go:5)	CMPQ	SP, 16(CX)
    0x000d 00013 (/Users/jbd/src/hello/hello.go:5)	JLS	169
    0x0013 00019 (/Users/jbd/src/hello/hello.go:5)	SUBQ	$96, SP
    0x0017 00023 (/Users/jbd/src/hello/hello.go:5)	MOVQ	BP, 88(SP)
    0x001c 00028 (/Users/jbd/src/hello/hello.go:5)	LEAQ	88(SP), BP
    0x0021 00033 (/Users/jbd/src/hello/hello.go:5)	FUNCDATA	$0, gclocals·69c1753bd5f81501d95132d08af04464(SB)
    0x0021 00033 (/Users/jbd/src/hello/hello.go:5)	FUNCDATA	$1, gclocals·e226d4ae4a7cad8835311c6a4683c14f(SB)
    0x0021 00033 (/Users/jbd/src/hello/hello.go:7)	MOVQ	$2, "".autotmp_1+64(SP)
    0x002a 00042 (/Users/jbd/src/hello/hello.go:7)	MOVQ	$0, "".autotmp_0+72(SP)
    0x0033 00051 (/Users/jbd/src/hello/hello.go:7)	MOVQ	$0, "".autotmp_0+80(SP)
    0x003c 00060 (/Users/jbd/src/hello/hello.go:7)	LEAQ, AX
    0x0043 00067 (/Users/jbd/src/hello/hello.go:7)	MOVQ	AX, (SP)
    0x0047 00071 (/Users/jbd/src/hello/hello.go:7)	LEAQ	"".autotmp_1+64(SP), AX
    0x004c 00076 (/Users/jbd/src/hello/hello.go:7)	MOVQ	AX, 8(SP)
    0x0051 00081 (/Users/jbd/src/hello/hello.go:7)	PCDATA	$0, $1
    0x0051 00081 (/Users/jbd/src/hello/hello.go:7)	CALL	runtime.convT2E(SB)
    0x0056 00086 (/Users/jbd/src/hello/hello.go:7)	MOVQ	16(SP), AX
    0x005b 00091 (/Users/jbd/src/hello/hello.go:7)	MOVQ	24(SP), CX
    0x0060 00096 (/Users/jbd/src/hello/hello.go:7)	MOVQ	AX, "".autotmp_0+72(SP)
    0x0065 00101 (/Users/jbd/src/hello/hello.go:7)	MOVQ	CX, "".autotmp_0+80(SP)
    0x006a 00106 (/Users/jbd/src/hello/hello.go:7)	LEAQ	go.string."sum: %v\n"(SB), AX
    0x0071 00113 (/Users/jbd/src/hello/hello.go:7)	MOVQ	AX, (SP)
    0x0075 00117 (/Users/jbd/src/hello/hello.go:7)	MOVQ	$8, 8(SP)
    0x007e 00126 (/Users/jbd/src/hello/hello.go:7)	LEAQ	"".autotmp_0+72(SP), AX
    0x0083 00131 (/Users/jbd/src/hello/hello.go:7)	MOVQ	AX, 16(SP)
    0x0088 00136 (/Users/jbd/src/hello/hello.go:7)	MOVQ	$1, 24(SP)
    0x0091 00145 (/Users/jbd/src/hello/hello.go:7)	MOVQ	$1, 32(SP)
    0x009a 00154 (/Users/jbd/src/hello/hello.go:7)	PCDATA	$0, $1
    0x009a 00154 (/Users/jbd/src/hello/hello.go:7)	CALL	fmt.Printf(SB)
    0x009f 00159 (/Users/jbd/src/hello/hello.go:8)	MOVQ	88(SP), BP
    0x00a4 00164 (/Users/jbd/src/hello/hello.go:8)	ADDQ	$96, SP
    0x00a8 00168 (/Users/jbd/src/hello/hello.go:8)	RET
    0x00a9 00169 (/Users/jbd/src/hello/hello.go:8)	NOP
    0x00a9 00169 (/Users/jbd/src/hello/hello.go:5)	PCDATA	$0, $-1
    0x00a9 00169 (/Users/jbd/src/hello/hello.go:5)	CALL	runtime.morestack_noctxt(SB)
    0x00ae 00174 (/Users/jbd/src/hello/hello.go:5)	JMP	0

If you would like to learn more insights about the intermediate assembly and why it is significant in Go, I highly recommend Rob Pike’s The Design of the Go Assembler from GopherCon this year.


As I mentioned, -S is only for the intermediate assembly. The actual machine representation is available in the final artifact. You can use a disassembler to examine what’s inside. Use go tool objdump with a binary or library. You probably also want to use -s to focus by symbol name. In this example, I am dumping the main.main. This is the actual assembly generated for darwin/amd64.

$ go tool objdump -s main.main hello
TEXT main.main(SB) /Users/jbd/src/hello/hello.go
    hello.go:5	0x2040	65488b0c25a0080000  GS MOVQ GS:0x8a0, CX
    hello.go:5	0x2049	483b6110            CMPQ 0x10(CX), SP
    hello.go:5	0x204d	0f8696000000        JBE 0x20e9
    hello.go:5	0x2053	4883ec60            SUBQ $0x60, SP
    hello.go:5	0x2057	48896c2458          MOVQ BP, 0x58(SP)
    hello.go:5	0x205c	488d6c2458          LEAQ 0x58(SP), BP
    hello.go:7	0x2061	48c744244002000000  MOVQ $0x2, 0x40(SP)
    hello.go:7	0x206a	48c744244800000000  MOVQ $0x0, 0x48(SP)
    hello.go:7	0x2073	48c744245000000000  MOVQ $0x0, 0x50(SP)
    hello.go:7	0x207c	488d053d4d0800      LEAQ 0x84d3d(IP), AX


Sometimes, all you need is to check the symbols rather than understand the code or data sections. Similar to the general-purpose nm tool, Go distributes an nm which allows you to list the symbols in an artifact with annotations and size. Pretty handy if you want to see what’s in and being exported from a Go binary or library.

$ go tool nm hello
f4760 B __cgo_init
f4768 B __cgo_notify_runtime_init_done
f4770 B __cgo_thread_start
4fb70 T __rt0_amd64_darwin
4e220 T _gosave
4fb90 T _main
ad1e0 R _masks
4fd00 T _nanotime
4e480 T _setg_gcc
ad2e0 R _shifts
624a0 T errors.(*errorString).Error
62400 T errors.New
52470 T fmt.(*buffer).WriteRune


Optimizations: SSA steps

With the contribution of the new SSA backend, the team contributed a tool that visualizes all the SSA passes. Set the GOSSAFUNC env variable with a func name and run the go build command. It will generate an ssa.html file that shows all the steps the compiler has gone through to optimize your code.

$ GOSSAFUNC=main go build && open ssa.html

Here, there is a visualization of all the passes applied to the main function.

SSA optimizations

Optimizations: Escape analysis

The Go compiler can also annotate the inlining and escape analysis. If you pass the -m=2 flag to the compiler, it will output the optimizations or annotations related to these two. Here we see the net/context package related inlining operations and escape analysis.

$ go build -gcflags="-m"
../ can inline Background as: func() Context { return background }
../ can inline TODO as: func() Context { return todo }
../ cannot inline WithCancel: non-leaf function
../ cannot inline WithDeadline: non-leaf function
../ cannot inline WithTimeout: non-leaf function
../ inlining call to time.Time.Add method(time.Time) func(time.Duration) time.Time { time.t·2.sec += int64(time.d·3 / time.Duration(1000000000)); var time.nsec·4 int32; time.nsec·4 = <N>; time.nsec·4 = time.t·2.nsec + int32(time.d·3 % time.Duration(1000000000)); if time.nsec·4 >= int32(1000000000) { time.t·2.sec++; time.nsec·4 -= int32(1000000000) } else { if time.nsec·4 < int32(0) { time.t·2.sec--; time.nsec·4 += int32(1000000000) } }; time.t·2.nsec = time.nsec·4; return time.t·2 }
../ cannot inline WithValue: non-leaf function
../ background escapes to heap
../ 	from ~r0 (return) at ../
../ todo escapes to heap
../ 	from ~r0 (return) at ../
../ parent escapes to heap
../ 	from parent (passed to function[unknown]) at ../
../ leaking param: parent
../ 	from parent (interface-converted) at ../
../ 	from parent (passed to function[unknown]) at ../
../ parent escapes to heap
../ 	from parent (passed to function[unknown]) at ../
../ leaking param: parent
../ 	from parent (interface-converted) at ../
../ 	from parent (passed to function[unknown]) at ../
../ leaking param: deadline
../ 	from deadline (passed to function[unknown]) at ../
../ ctx escapes to heap
../ 	from ~r2 (return) at ../
../ leaking param: parent
../ 	from parent (passed to function[unknown]) at ../
../ parent escapes to heap
../ 	from parent (passed to function[unknown]) at ../
../ leaking param: parent
../ 	from parent (interface-converted) at ../
../ 	from parent (passed to function[unknown]) at ../
../ leaking param: key
../ 	from key (passed to function[unknown]) at ../
../ leaking param: val
../ 	from val (passed to function[unknown]) at ../
../ context.WithValue(parent, key, val) escapes to heap
../ 	from ~r3 (return) at ../
<autogenerated>:1: leaking param: .this
<autogenerated>:1: 	from .this.Deadline() (receiver in indirect call) at <autogenerated>:1
<autogenerated>:2: leaking param: .this
<autogenerated>:2: 	from .this.Done() (receiver in indirect call) at <autogenerated>:2
<autogenerated>:3: leaking param: .this
<autogenerated>:3: 	from .this.Err() (receiver in indirect call) at <autogenerated>:3
<autogenerated>:4: leaking param: key
<autogenerated>:4: 	from .this.Value(key) (parameter to indirect call) at <autogenerated>:4
<autogenerated>:4: leaking param: .this
<autogenerated>:4: 	from .this.Value(key) (receiver in indirect call) at <autogenerated>:4

You can use -m to see a less verbose output without reasonings, but David Chase says even though -m=2 is not perfect, it is often useful.

Optimizations: Disabling optimizations

It is worth it to mention that you may prefer to disable optimizations to insepect certain cases because optimization can change the sequence of operations, add code, remove code or apply transformations to the code. With optimizations, it may get harder to debug and insect certain cases. Disabling optimizations is possible with -N, and disabling inlining is possible with -l.

$ go build -gcflags="-l -N"

Once optimizations are disabled, you can debug without being affected by transformations.


If you are working on the lexer, the compiler provides a flag to debug the lexer as it goes through the source files.

$ go build -gcflags="-x"
# hello
lex: ident main
lex: implicit semi
lex: string literal
lex: implicit semi
lex: FUNC
lex: ident main
./hello.go:5 lex: TOKEN '('
./hello.go:5 lex: TOKEN ')'
./hello.go:5 lex: TOKEN '{'
lex: ident sum
./hello.go:6 lex: TOKEN COLAS
lex: integer literal
./hello.go:6 lex: TOKEN '+'
lex: integer literal
lex: implicit semi
lex: ident fmt
./hello.go:7 lex: TOKEN '.'
lex: ident Printf
./hello.go:7 lex: TOKEN '('
lex: string literal
./hello.go:7 lex: TOKEN ','
lex: ident sum
./hello.go:7 lex: TOKEN ')'
lex: implicit semi
./hello.go:8 lex: TOKEN '}'
lex: implicit semi