avatarVincent Blanchon

Summary

The provided web content discusses memory safety in Go, specifically focusing on how the language uses bounds checking to prevent out-of-range memory access, and how the compiler optimizes these checks.

Abstract

Go's memory safety features include automatic bounds checking, which is a compile-time mechanism that ensures array or slice accesses stay within the defined range. This prevents runtime errors such as accessing invalid memory locations. The article, based on Go 1.13, illustrates how Go introduces control points in the code to validate memory access. It demonstrates this with an example that causes a panic when accessing an index out of the slice's range. The article also explains how to view these bounds checks in the generated assembly code using go tool compile -S main.go and how to run the program without bounds checks for debugging purposes using the -gcflags="-B" flag. Additionally, it covers the SSA (Static Single Assignment) form in the Go compiler, which optimizes these checks by eliminating unnecessary ones, thus improving performance without compromising safety.

Opinions

  • The Go compiler is designed to balance safety and performance by introducing bounds checks and optimizing them away when they are provably unnecessary.
  • The article suggests that while developers can run Go programs without bounds checks for debugging, it is generally not recommended as it can lead to undefined behavior and security vulnerabilities.
  • The author implies that the Go team has put significant effort into improving the compiler's ability to optimize bounds checks, which can lead to performance gains in Go programs.
  • It is highlighted that the SSA form and associated compiler passes (like check_bce and prove) are sophisticated mechanisms that contribute to Go's efficient memory management.
  • The article assumes that readers are familiar with Go's syntax and assembly code, as well as compiler optimization concepts, indicating that the content is targeted at an audience with some programming and compiler knowledge.

Go: Memory Safety with Bounds Check

Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.13.

Go makes the lives of the developers easier with internal memory managements such as allocations, garbage collections, and memory access checks. The compiler manages this latter point by introducing “Bounds Check” points in the code to guarantee safe access to the memory.

Generated instructions

Go introduces some control points to make sure our programs access a valid memory segment. Let’s start with a basic example:

func main() {
   list := []int{1, 2, 3}

   printList(list)
}

func printList(list []int) {
   println(list[2])
   println(list[3])
}

Running this code makes the program panic:

3
panic: runtime error: index out of range [3] with length 3

Go prevents incorrect memory access by adding check bounds access.

If you wonder what the issue would be without those checks, you can run your program without them by using the flag -gcflags="-B". Here is the output:

3
824633993168

Since the memory is not valid, it will read the next bytes that do not belong to the slice.

Those checkpoints can be visualized from the generated asm via the command go tool compile -S main.go:

0x0021 00033 (main.go:10)  MOVQ   "".list+48(SP), CX
0x0026 00038 (main.go:10)  CMPQ   CX, $2
0x002a 00042 (main.go:10)  JLS    161
[...] here Go prints the third element
0x0057 00087 (main.go:11)  MOVQ   "".list+48(SP), CX
0x005c 00092 (main.go:11)  CMPQ   CX, $3
0x0060 00096 (main.go:11)  JLS    151
[...]
0x0096 00150 (main.go:12)  RET
0x0097 00151 (main.go:11)  MOVL   $3, AX
0x009c 00156 (main.go:11)  CALL   runtime.panicIndex(SB)
0x00a1 00161 (main.go:10)  MOVL   $2, AX
0x00a6 00166 (main.go:10)  CALL   runtime.panicIndex(SB)

Go first checks the length of the slice with the instruction MOVQ that loads the length of the slice to the register CX:

0x0021 00033 (main.go:10)  MOVQ   "".list+48(SP), CX

As a reminder, the slices are composed by the pointer to the underlying array, the length of the slice, and its capacity. Here is an illustration of our slice in the stack:

The length is accessible by shifting the stack pointer by 48 bytes.

The next instruction compares the length of the slice to the index the program is trying to access:

The instruction CMPQ actually subtracts the two values, and then compares it to zero in the next instruction. If the length of the slice (register CX) minus the index (two in our example) is less than or equal to zero (JLS stands for Jump on lower or the same), we jump to instruction 161 that panics:

Both of the bounds checks use the same instructions. Rather than looking at the generated asm, Go offers a pass during the compilation that prints out the bounds check thanks to the flag -gcflags="-d=ssa/check_bce/debug=1" you can use with the build or run command. Here is the output:

./main.go:10:14: Found IsInBounds
./main.go:11:14: Found IsInBounds

We also see in this output the two generated checkpoints. However, Go is smart enough not to generate instructions at every single access if it is not needed.

Rules

Generating a checkpoint at every single instruction that has access to the map could be inefficient. Let’s slightly modify the previous example to illustrate that:

func main() {
   list := []int{1, 2, 3}

   printList(list)
}

func printList(list []int) {
   println(list[3])
   println(list[2])
}

The two instructions println have been switched here. Running this program with the check_bce flag gives this time only one bound check:

./main.go:11:14: Found IsInBounds

Since the program checks the index 3 first, if that one is valid, the index 2 will be obviously valid as well and does not need any bound check. The compilation passes can be visualized from the SSA code generation and the command GOSSAFUNC=printList go run main.go. Here is a generated SSA code with bound checking for each instruction:

Then, the prove pass marks the bounds check to be removed, while a future pass will collect this dead code:

The logic behind this compiler pass can be printed out with the command GOSSAFUNC=printList go run -gcflags="-d=ssa/prove/debug=3" main.go. That will also generate an SSA file that will help you to debug the output of the pass. Here is the output:

This pass actually follows the different paths and builds a table of facts. Then, according to the facts, it determines what the contradictions are. In our example, we can follow those rules with the SSA pass:

The first phase starts with analyzing the block b1 that represents the line println(list[3]). From that instruction, there are now two possibilities:

  • The index [3] is in bounds, and we will go to b2, the second instruction. In this case, Go determines that the limit for v7 (the length of the slice) is [4, max(int)].
  • The index [3] is not in bounds, and we will go the block b3 that will panic.

Then, Go moves to the second instruction represented by the block b2. From here, two possibilities:

  • The index [2] is in bounds, meaning that the length of the slice v7 should be greater than v23 (index [2]). This is confirmed since Go registered that v7 > 4 in the previous block.
  • The index [2] is not in bounds, meaning it is greater than v7 the length of the slice. Since the limit for v7 is [4, max(int)], Go flags it as a contradiction. This contradiction makes the conditions impossible to meet, meaning the bounds check for this instruction can be removed.

This pass has been improved over time and now detects most of the cases. Removing useless bound checks can slightly speed up the Go programs, but unless you work in an environment where every microsecond matters, you should never have to optimize these bound checks by yourself.

Golang
Go
Internals
Compiler
Recommended from ReadMedium
avatarVenkatesh Subramanian
From HTTP/0.9 to HTTP/3: Transport Performance

5 min read