avatarVincent Blanchon

Summary

The undefined website provides insights into the Go programming language's runtime tracing tool, which allows developers to monitor and analyze program execution, memory allocation, garbage collection, and other events, with the option to add custom traces for more detailed profiling.

Abstract

The Go programming language offers a robust tracing tool that can be activated during runtime to provide developers with a detailed execution profile of their applications. This tool captures a wide range of events, including memory allocations, garbage collection phases, and goroutine activity, by recording them statically and displaying them in a user-friendly format. The tracing process begins with a "stop the world" phase to snapshot current goroutines and their states, followed by event collection into buffers, and finally, the output is generated by a dedicated goroutine. The traces can be visualized using the go tool trace command, revealing various system activities such as processor association, system calls, and goroutine unblocking. Additionally, Go's trace package allows developers to define custom tasks and regions, enriching the trace data with user-defined logs and hierarchical tasks for more granular analysis. While tracing can introduce performance overhead, tools like StackDriver offer continuous profiling with minimal impact, making it suitable for production environments.

Opinions

  • The author suggests that the Go tracing tool is powerful and can be made even more potent with the addition of user-defined traces.
  • The article implies that understanding the "stop the world" phase and the garbage collector's workings is crucial for a deeper comprehension of Go's tracing capabilities.
  • The author provides a subjective assessment that the visualization of traces is straightforward and can be enhanced with custom logs and tasks, which they seem to advocate for better application insights.
  • There is an opinion that while tracing can impact performance, the benefits of detailed profiling outweigh the costs, especially with tools that minimize the overhead.

Go: Discovery of the Trace Package

Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.13.

Go provides us a tool to enable tracing during the runtime and get a detailed view of the execution of our program. This tool can be enabled by flag -trace with the tests, frompprof to get live tracing, or anywhere in our code thanks to the trace package. This tool can be even more powerful since you can enhance it with your own traces. Let’s review how it works.

Flow

The flow of the tool is quite straightforward. Each event, such as memory allocation; all the phases of garbage collector; goroutines when they run, pause, etc. is statically recorded by the Go standard library and formatted to be displayed later. However, before the recording starts, Go first “stops the world” and takes a snapshot of the current goroutines and their states.

You can find more details about this phase in my article “Go: How Does Go Stop the World?

This will later allow Go to construct the cycle of life of each goroutine properly. Here is the flow:

Initialization phase before tracing

Then, collected events are pushed to a buffer that is later flushed to a list of full buffers when the max capacity is reached. Here is a diagram of this flow:

Tracing collect events per P

The tracer now needs a way to dump those traces to the output. For that, Go spawns a goroutine dedicated to that when the tracing starts. This goroutine will dump data when available and will park the goroutine until the next ones. Here is a representation of it:

A dedicated goroutine reads and dump the traces

The flow is now pretty clear, so let’s review the recorded traces.

Traces

Once the tracing is generated, the visualization can be done with running the command go tool trace my-output.out. Let’s take some traces as example:

Tracing from go tool

Most of them are quite straightforward. The traces related to the garbage collector stand under the blue trace GC:

Traces of the garbage collector

Here is a quick review:

  • STW are the two “Stop the World” phases in the garbage collector. During these two phases the goroutines are stopped.
  • GC (idle) are the goroutines that mark the memory when they do not have work to do.
  • MARK ASSIST re the goroutines that help mark the memory during allocations.
  • GXX runtime.bgsweep is the memory sweep phase once the garbage collector is done.
  • GXX runtime.gcBgMarkWorker are the dedicated background goroutines that help mark the memory.

You can find more information about those traces in my article “Go: How Does the Garbage Collector Watch Your Application?

However, some traces are not easy to understand. Let’s review them to get a better understanding:

  • proc start is called when a processor is associated with a thread. It happens when a new thread is started or when we resume from a syscall.
  • proc stop is called when a thread is disassociated from the current processor. It happens when the thread is blocked in a syscall or when a thread exits.
  • syscall is called when a goroutine is making a system call:
  • unblock is called when a goroutine is unblocked from a syscall — the label (sysexit) would be displayed in that case, from a blocked channel, etc:

The traces can be enhanced since Go allows you to define and visualize your own trace along with the ones from the standard library.

User traces

The traces we can define have two hierarchical levels:

  • at the top level with the task, with a start and an end.
  • at a sub level with the region.

Here is a simple example of them:

Those new traces can be visualized directly from the tool via the menu User-defined tasks:

Custom task and regions

It’s also possible to record some arbitrary log to the task:

ctx, task := trace.NewTask(context.Background(), "main start")
trace.Log(ctx, "category", "I/O file")
trace.Log(ctx, "goroutine", "2")

Those logs will be found under the goroutine that sets up the task:

Custom logs in the tracing

The tasks can also be embedded in other tasks by deriving the context of the parent task and so on. However, tracing all those events live from production — thanks to pprof — could slightly deteriorate the performance while you are collecting them.

Performance impact

A simple benchmark could help to understand the impact of the tracing. One will run with the flag -trace and another one without. Here are the results with a benchmark of the function ioutil.ReadFile() that generates quite a lot of events:

name         time/op
ReadFiles-8  48.1µs ± 0%
name         time/op
ReadFiles-8  63.5µs ± 0%  // with tracing

In this case, the impact is about ~35% and could vary depending on the application. However, some tools such as StackDriver allows having continuous profiling on production while keeping a light overhead on our application.

Golang
Go
Debug Instrumentation
Internals
Recommended from ReadMedium