Vincent Blanchon

Summary

The provided content discusses optimization techniques in Go for converting byte arrays to strings, emphasizing runtime and compiler optimizations that avoid unnecessary memory allocation and copying.

Abstract

In Go, converting bytes to a string can be costly because of memory allocation and data copying. The article, based on Go 1.14, explores how Go handles such conversions and the optimizations available to improve performance. It explains that a simple conversion allocates a new string, on the heap if it escapes, and copies the bytes into it, and that the runtime optimizes single-byte conversions by referencing a static array. The Go compiler also optimizes certain cases, such as switch statements, map access, string concatenation, and string comparisons, by skipping the conversion and working directly with the original bytes. This results in more efficient code by eliminating unnecessary allocations and copies.

Opinions

  • The author suggests that converting bytes to strings solely to meet code constraints is wasteful and should be optimized.
  • Escape analysis is recommended for developers to understand memory allocation behavior, with a reference to another article by the author for deeper insight.
  • The author values the Go runtime and compiler optimizations, particularly the ability to bypass string conversion when the content is only being compared or used as a map key.
  • The article implies that understanding Go's memory management and string handling is crucial for writing efficient Go code.
  • By providing examples and assembly code, the author conveys a strong endorsement of Go's underlying efficiency mechanisms for string manipulation.

Go: String & Conversion Optimization

Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.14.

In Go, converting an array of bytes to a string can involve a memory allocation along with a copy of the bytes. However, converting bytes to a string just to satisfy a code constraint, such as a comparison in a switch statement or a key in a map, is a waste of CPU time. Let’s review some cases and the optimizations Go applies to them.

Conversion

The conversion from an array of bytes to a string involves:

  • An allocation for the new string on the heap if the variable outlives the current stack frame.
  • A copy of the bytes to the string.

For more details about the escape analysis, I suggest reading my article “Go: Introduction to the Escape Analysis.”

Here is a simple program that goes through those two steps:
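A minimal sketch of such a program follows. The `toString` helper and the sample bytes are illustrative assumptions, not the article's original listing. It shows that the string holds its own copy of the bytes: mutating the slice afterwards leaves the string unchanged.

```go
package main

import "fmt"

// toString converts a byte slice to a string. The conversion copies the
// bytes, so the result is independent of the original slice.
// (Hypothetical helper, named here for illustration only.)
func toString(b []byte) string {
	return string(b)
}

func main() {
	b := []byte{'g', 'o', 'l', 'a', 'n', 'g'}
	s := toString(b)

	b[0] = 'G' // mutate the source after the conversion

	fmt.Println(s)         // the string still holds the copied bytes
	fmt.Println(string(b)) // the slice reflects the mutation
}
```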

Here is a diagram of that conversion:

If you want to know more about the copy function, I suggest you read my article “Go: Slice and Memory Management.”

At runtime, Go provides only one optimization during the conversion. If the array of bytes contains exactly one byte, the returned string points to a static array of bytes embedded in the runtime:
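One way to observe this from user code is to count allocations with `testing.AllocsPerRun`. The sketch below is only an illustration: `sink` and `allocsPerConversion` are hypothetical names, and the exact counts may vary with the Go version and build settings. The one-byte conversion is expected to report fewer allocations than the two-byte one.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable, so the converted string escapes and
// cannot be kept in a stack buffer. (Hypothetical name.)
var sink string

// allocsPerConversion reports the average number of heap allocations
// performed by one string(b) conversion. (Hypothetical helper.)
func allocsPerConversion(b []byte) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = string(b)
	})
}

func main() {
	fmt.Println("1 byte: ", allocsPerConversion([]byte("a")))
	fmt.Println("2 bytes:", allocsPerConversion([]byte("ab")))
}
```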

However, the static array itself is never modified: if the string variable is later given a new value, Go allocates memory from the heap for that value before assigning it.

The Go compiler also provides some optimizations that can skip the two phases of the conversion we have seen.

Switch

Let’s start with an example of conversion to string for a comparison purpose:
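A minimal sketch of such a switch is shown here. The body of `getBytes`, the `classify` function, and the case values are assumptions made for illustration, not the article's original listing.

```go
package main

import "fmt"

// getBytes returns a slice that escapes to the heap. The noinline directive
// keeps the compiler from seeing through the call. (Assumed implementation.)
//
//go:noinline
func getBytes() []byte {
	return []byte{'f', 'o', 'o'}
}

// classify converts the bytes only to drive the switch; the resulting
// string is not stored or used anywhere else. (Hypothetical function.)
func classify(b []byte) string {
	switch string(b) {
	case "foo":
		return "matched foo"
	case "bar":
		return "matched bar"
	default:
		return "no match"
	}
}

func main() {
	fmt.Println(classify(getBytes()))
}
```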

The examples used to illustrate the string optimizations will force the allocation on the heap by using the getBytes function. It avoids some other compiler optimizations that could hide the string optimizations introduced here.

In this example, the conversion is used for the switch statement only, and Go is able to avoid it since it just needs to compare the actual content. Go optimizes the code by removing the conversion and pointing directly to the backing array of bytes:

We can also see the exact optimization with the generated assembly:

Go uses the returned bytes directly in the comparison. It compares the length of the first case value with the length of the array of bytes, and then the contents. Assigning the string to a variable outside of the switch would lead to an allocation, since the compiler would not know where this string is used later.

Optimizations

The switch statement is not the only case where string conversions are optimized. The Go compiler applies this behavior to other cases such as:

  • Accessing an element of a map. Here is an example:

While accessing the map, no conversion is actually needed, which makes the access faster.
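A minimal sketch of a map lookup keyed by converted bytes follows. The `lookup` function, the map contents, and the body of `getBytes` are assumptions made for illustration.

```go
package main

import "fmt"

// getBytes returns a heap-allocated slice. (Assumed implementation.)
//
//go:noinline
func getBytes() []byte {
	return []byte{'f', 'o', 'o'}
}

// lookup reads a map entry using the bytes as the key. The conversion only
// feeds the lookup and is not kept afterwards. (Hypothetical function.)
func lookup(m map[string]int, key []byte) (int, bool) {
	v, ok := m[string(key)]
	return v, ok
}

func main() {
	m := map[string]int{"foo": 1, "bar": 2}
	v, ok := lookup(m, getBytes())
	fmt.Println(v, ok)
}
```

This applies to reads only. An insertion such as `m[string(key)] = v` still needs a real string, because the map has to keep its own copy of the key.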

  • String concatenation. Here is an example:

Concatenating an array of bytes with a non-empty string constant does not allocate an intermediate string for the bytes. The concatenation reads directly from the backing array, as seen previously, and only the resulting string is built.
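A minimal sketch of such a concatenation is given here. The `wrap` function and the body of `getBytes` are assumptions made for illustration.

```go
package main

import "fmt"

// getBytes returns a heap-allocated slice. (Assumed implementation.)
//
//go:noinline
func getBytes() []byte {
	return []byte{'f', 'o', 'o'}
}

// wrap concatenates the converted bytes with string constants. The
// converted value is consumed by the concatenation and not kept.
// (Hypothetical function.)
func wrap(b []byte) string {
	return "<" + string(b) + ">"
}

func main() {
	fmt.Println(wrap(getBytes()))
}
```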

  • String comparisons. Here are some examples:

This case is similar to the switch. It first compares the length of the string with the length of the array of bytes, and then compares the contents.
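A few comparisons of this kind are sketched below. The function names and sample values are assumptions made for illustration. In each case the converted string is used only as an operand of the comparison.

```go
package main

import "fmt"

// equalsLiteral compares the bytes with a string constant.
func equalsLiteral(b []byte) bool {
	return string(b) == "foo"
}

// sameContent compares two byte slices through string conversions.
func sameContent(a, b []byte) bool {
	return string(a) == string(b)
}

// differsFrom compares the bytes with an existing string.
func differsFrom(b []byte, s string) bool {
	return string(b) != s
}

func main() {
	b := []byte{'f', 'o', 'o'}
	fmt.Println(equalsLiteral(b))
	fmt.Println(sameContent(b, []byte("foo")))
	fmt.Println(differsFrom(b, "bar"))
}
```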
