The article discusses the importance of thread safety in Golang when using goroutines to modify shared memory, and demonstrates how to use mutexes to prevent concurrent access issues.
Abstract
Golang's goroutines provide an efficient solution for concurrency, but they require careful handling to ensure thread safety when accessing shared memory. The article illustrates a common issue where concurrent goroutines without proper synchronization can lead to unpredictable results, such as a decreasing population count when only births are being tracked. It then introduces mutexes, specifically sync.Mutex and sync.RWMutex, as mechanisms to lock access to shared resources, ensuring that only one goroutine can modify or read the value at a time. The article also touches on the differences in concurrency handling between Golang and other languages like Python and JavaScript, emphasizing Golang's ability to efficiently manage both OS threads and goroutines for IO-bound tasks.
Opinions
The author suggests that improper use of goroutines without thread safety can lead to "nasty bugs."
Mutexes are presented as a solution to enforce orderly access to shared resources, preventing "random accessing and changing of values."
The use of defer c.mu.Unlock() is recommended to avoid deadlocks, ensuring that a mutex is always unlocked even if an error occurs.
The article points out that sync.RWMutex can improve performance by allowing multiple readers simultaneously, provided no writer is active.
The author expresses that Python's Global Interpreter Lock (GIL) acts as a large mutex, making explicit locks less impactful in CPython.
JavaScript's single-threaded nature is noted, with the suggestion to use languages like C++ or Rust for compute-heavy, time-sensitive operations.
Golang is praised for its ability to handle both OS threads and goroutines effectively, making it suitable for modern backend servers that are predominantly IO-bound.
Thread Safety in Golang
Goroutines are awesome. They are arguably the best solution for concurrency out there. However, your application can become very unpredictable and develop nasty bugs if you don't use goroutines properly. One of the most important things to keep in mind is that if your goroutines are going to modify a value that other goroutines can also access, you need to make that access thread safe (keep in mind that goroutines are essentially cheap, lightweight threads managed by the Go runtime).
What exactly do I mean by this? Let me explain through a very simple example.
Say we have an application that is tracking the population of countries in real time. The population of a country is dynamic: babies are born all the time, and sadly some people pass away all the time.
The USA is a huge country, which consists of 50 states. So, for the sake of my example, the updatePopulation function will simulate a single call to the census agency of each state. Since we're in a cheerful mood (and also because I need it to prove my point), we'll only register the newborns and not the departed.
Since we want this app to truly be a real-time tracker, we don't want to make synchronous API calls one after another. We want to make them concurrently by using goroutines, hence the go in front of the updatePopulation function call.
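The original listing is not reproduced here, so the following is only a minimal sketch of what such an unsynchronized version might look like. The Country struct, its field names, the starting population, and the per-state newborn counts are all assumptions made for illustration. The code is deliberately NOT thread safe, and its output will differ from run to run.

```go
package main

import (
	"fmt"
	"sync"
)

// Country is a hypothetical stand-in for the article's type.
type Country struct {
	name       string
	population int
}

// updatePopulation simulates one state's census report.
// Deliberately unsafe: the read-modify-write on c.population is not
// synchronized, so concurrent callers can overwrite each other's updates.
func updatePopulation(c *Country, newborns int) {
	c.population += newborns
	fmt.Println("New population of", c.name, "is", c.population)
}

func main() {
	usa := &Country{name: "USA", population: 32000000}

	// The WaitGroup only keeps main alive until every goroutine finishes;
	// it does nothing to protect usa.population.
	var wg sync.WaitGroup
	for state := 1; state <= 50; state++ {
		wg.Add(1)
		go func(newborns int) {
			defer wg.Done()
			updatePopulation(usa, newborns)
		}(state)
	}
	wg.Wait()
}
```

Running a program like this with `go run -race` should make the race detector report the unsynchronized access.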
That should do it, right? Let’s see the result:
New population of USA is 32001199
New population of USA is 32001030
New population of USA is 32001206
New population of USA is 32001176
New population of USA is 32000957
New population of USA is 32001005
Hmm.. If we are only tracking newborns, and not tracking the deceased… how come our population is actually lower in the last line than the first line?
Well, it’s because we haven’t implemented thread safety. Our various goroutines are reading and writing the same memory address without any respect for order, so one goroutine can overwrite an update that another goroutine has just made. It’s like those old people at the supermarket that pretend they don’t see the line.
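To see why an update can get lost, it may help to replay one unlucky interleaving by hand. The sketch below uses no goroutines at all. It simply performs, in sequence, the steps that two concurrent goroutines (called A and B here, with made-up newborn counts) could end up executing in this order.

```go
package main

import "fmt"

// lostUpdate replays one possible interleaving of two unsynchronized
// "population += n" operations and returns the resulting population.
func lostUpdate() int {
	population := 32000000

	a := population // goroutine A reads 32000000
	b := population // goroutine B reads 32000000 before A has written

	population = a + 10 // A writes 32000010
	population = b + 5  // B writes 32000005 and wipes out A's update

	return population
}

func main() {
	// 15 babies were born, but only 5 of them were counted.
	fmt.Println("New population of USA is", lostUpdate()) // prints 32000005
}
```

In a real concurrent run the scheduler decides the order, which is why the numbers look random.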
We need to implement some order here.
Enter Mutexes.
Mutex is short for mutual exclusion lock. It’s used so that when one thread (or goroutine in the case of Golang) is accessing a value inside a memory address, it can lock out the other threads so they have to wait in line. This guarantees that there will not be any of this random accessing and changing of values. Let’s implement:
The sync.Mutex is a struct that we use for implementing mutexes in Go. Its zero value is an unlocked mutex, as you can see from the standard library code:
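The original listing is not reproduced here, so this is a minimal sketch of how the locking might be wired in. The mu field on the Country struct and the c receiver follow the names used later in the text. The other fields, the runCensus helper, and the per-state newborn counts are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Country holds the shared counter together with the mutex that guards it.
// Field names other than mu are assumptions.
type Country struct {
	mu         sync.Mutex
	name       string
	population int
}

// updatePopulation adds one state's newborns while holding the lock, so the
// read-modify-write and the print happen as one uninterrupted step.
func (c *Country) updatePopulation(newborns int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.population += newborns
	fmt.Println("New population of", c.name, "is", c.population)
}

// runCensus (hypothetical helper) starts one goroutine per state and
// returns the population after all of them have finished.
func runCensus() int {
	usa := &Country{name: "USA", population: 32000000}
	var wg sync.WaitGroup
	for state := 1; state <= 50; state++ {
		wg.Add(1)
		go func(newborns int) {
			defer wg.Done()
			usa.updatePopulation(newborns)
		}(state)
	}
	wg.Wait() // all updates are complete once Wait returns
	return usa.population
}

func main() {
	fmt.Println("Final population:", runCensus())
}
```

Because the print happens while the lock is held, each printed value is larger than the previous one, even though the order in which the states report is still up to the scheduler.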
// A Mutex is a mutual exclusion lock.
// The zero value for a Mutex is an unlocked mutex.
//
// A Mutex must not be copied after first use.
type Mutex struct {
	state int32
	sema  uint32
}
Now, let’s run our app again and see what the result will be:
New population of USA is 32000979
New population of USA is 32001023
New population of USA is 32001072
New population of USA is 32001108
New population of USA is 32001146
New population of USA is 32001185
New population of USA is 32001225
This looks good. The values are constantly incrementing, meaning that order is restored. Now, you might wonder why we put defer c.mu.Unlock() immediately under the line of code where we acquire the lock. The reason is that we need to avoid a deadlock. Deadlocks are a hazard of mutexes that must be avoided at all costs. Imagine that something happens between locking and unlocking that makes the function exit early, such as an early return or a panic, so that the unlock call is never reached. The lock would then be held indefinitely, and all of the other goroutines waiting for it would be blocked forever. This is why we use the defer keyword: it guarantees that the unlock runs when the function exits, whether it returns normally or panics.
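Here is a small sketch of the difference, using an early return as the "something" that goes wrong. The method names, the error value, and the stillLocked helper are all hypothetical. The helper relies on sync.Mutex.TryLock, which requires Go 1.18 or newer, and is only there to inspect the lock for the demo.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type Country struct {
	mu         sync.Mutex
	population int
}

var errNegative = errors.New("newborns must not be negative")

// addWithDefer releases the lock on every exit path, including the early return.
func (c *Country) addWithDefer(newborns int) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if newborns < 0 {
		return errNegative
	}
	c.population += newborns
	return nil
}

// addWithoutDefer forgets to unlock on the error path.
func (c *Country) addWithoutDefer(newborns int) error {
	c.mu.Lock()
	if newborns < 0 {
		return errNegative // BUG: the lock is still held here
	}
	c.population += newborns
	c.mu.Unlock()
	return nil
}

// stillLocked reports whether the mutex is currently held (demo only).
func (c *Country) stillLocked() bool {
	if c.mu.TryLock() {
		c.mu.Unlock()
		return false
	}
	return true
}

func main() {
	safe := &Country{}
	_ = safe.addWithDefer(-1)
	fmt.Println("with defer, still locked:", safe.stillLocked()) // false

	leaky := &Country{}
	_ = leaky.addWithoutDefer(-1)
	fmt.Println("without defer, still locked:", leaky.stillLocked()) // true
}
```

Any goroutine that later calls Lock on the leaky value would wait forever, which is exactly the deadlock described above.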
It is also important to mention that besides the standard sync.Mutex that we used above, there is another mutex: sync.RWMutex.
The point is that each time a goroutine acquires a lock, the other goroutines have to wait in line, which slows down overall performance. But what if these goroutines just want to read the value from that memory address? That’s safe, right? Why should the goroutine holding the lock hog the memory address all to itself if the others promise not to change anything, just read the value and go on with their business?
So if you change the sync.Mutex field in the Country struct to sync.RWMutex, you get two kinds of lock to choose from:
1. Lock(): only one goroutine can read or write at a time; it has exclusive access until it calls Unlock().
2. RLock(): multiple goroutines can read (not write) at the same time, as long as no writer holds the lock; each releases it with RUnlock().
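The two modes above can be sketched as follows. The AddNewborns and Population method names, the simulate helper, and the numbers are assumptions for illustration; only the sync.RWMutex calls come from the standard library.

```go
package main

import (
	"fmt"
	"sync"
)

type Country struct {
	mu         sync.RWMutex
	population int
}

// AddNewborns is a writer: it takes the exclusive lock.
func (c *Country) AddNewborns(n int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.population += n
}

// Population is a reader: any number of readers may hold the read lock
// at once, but they all wait while a writer holds the exclusive lock.
func (c *Country) Population() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.population
}

// simulate (hypothetical helper) runs 50 writers and 50 readers
// concurrently and returns the final population.
func simulate() int {
	usa := &Country{population: 32000000}
	var wg sync.WaitGroup
	for state := 1; state <= 50; state++ {
		wg.Add(2)
		go func(n int) {
			defer wg.Done()
			usa.AddNewborns(n)
		}(state)
		go func() {
			defer wg.Done()
			_ = usa.Population() // some value between the start and the final count
		}()
	}
	wg.Wait()
	return usa.Population()
}

func main() {
	fmt.Println("Final population of USA is", simulate())
}
```

The read lock tends to pay off when reads greatly outnumber writes. For a counter that is mostly written, a plain sync.Mutex is usually just as good.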
Finally, I want to address the fact that most of you who came from Python and JavaScript, like me, were probably shocked to learn that something like a mutex even exists. The reason we rarely heard of this concept in our dear dynamically typed, interpreted languages is that in the case of Python (CPython to be specific) the GIL (Global Interpreter Lock) acts as a gigantic mutex over the whole interpreter, so only one thread executes Python bytecode at a time and explicit locks come up far less often. You can still acquire and release a lock with threading.Lock from the threading library (compound operations such as counter += 1 are not atomic even under the GIL), but even the official documentation says:
CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.
Basically, they are saying: just use the threading library for IO (though I would suggest using asyncio), and if you really need parallelism across OS threads, Python probably isn’t that great of an option.
For JavaScript devs, the reason is even simpler: you never used mutexes because your JavaScript code runs on a single thread. And, guess what, it does its job pretty well that way. Once again, if you are doing compute-heavy operations that are time sensitive, go with C++ or Rust. Luckily, most modern backend servers don’t need compute-heavy operations; they are almost entirely IO bound and spend most of their time idle, waiting for a database server or a third-party API to return a response. Golang is beautiful because its runtime schedules cheap goroutines across multiple OS threads, so you get the benefit of both.
That’s all for this article. If I missed anything please leave a comment.