Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.13.

Goroutines are cheap to create, start with a small stack, and switch contexts quickly. For those reasons, developers love them and use them a lot. However, a program that spawns many short-lived goroutines will spend quite some time creating and destroying them.

Cycle of life

Let’s start with a simple example that will show how goroutines are reused. I will take the prime number calculation from the Go documentation:
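The example is the concurrent prime sieve from the Go documentation; here is a close adaptation, wrapped in a `Primes` helper (the helper name is mine) so the result can be collected:

```go
package main

import "fmt"

// Generate sends the sequence 2, 3, 4, ... to channel ch.
func Generate(ch chan<- int) {
	for i := 2; ; i++ {
		ch <- i
	}
}

// Filter copies the values from channel in to channel out,
// removing the multiples of prime.
func Filter(in <-chan int, out chan<- int, prime int) {
	for {
		if i := <-in; i%prime != 0 {
			out <- i
		}
	}
}

// Primes returns the first n primes, spawning one Filter goroutine
// per prime found, daisy-chained with channels.
func Primes(n int) []int {
	primes := make([]int, 0, n)
	ch := make(chan int)
	go Generate(ch)
	for i := 0; i < n; i++ {
		prime := <-ch
		primes = append(primes, prime)
		ch1 := make(chan int)
		go Filter(ch, ch1, prime)
		ch = ch1
	}
	return primes
}

func main() {
	fmt.Println(Primes(10)) // [2 3 5 7 11 13 17 19 23 29]
}
```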

Hundreds of goroutines will be used to filter the numbers, forcing Go to manage the creation and destruction of all of them. In fact, Go maintains a list of free goroutines per P:

Keeping this list local to each P has the advantage that no lock is needed to push or pop a free goroutine. Then, when a goroutine finishes its current work, it is pushed onto this free list:

However, in order to better distribute the free goroutines, the scheduler also has its own lists. It actually has two: one that contains the goroutines with an allocated stack, and another one that keeps the goroutines whose stack has been freed — this detail will be explained in the next section. Here is the representation:

A lock protects the central list, since any thread could access it. The scheduler acquires goroutines from a P when that P’s local list length exceeds 64; half of the goroutines then move to the central list:
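This push/spill workflow can be sketched in plain Go. The types and names below are simplified, hypothetical stand-ins for the runtime’s internal structures (only the 64 threshold comes from the article), not the actual implementation:

```go
package main

import (
	"fmt"
	"sync"
)

// deadG stands in for an exited goroutine waiting to be reused.
type deadG struct{ next *deadG }

// sched models the scheduler's central free list, protected by a
// lock because any thread may touch it.
type sched struct {
	mu   sync.Mutex
	list *deadG
	n    int
}

// proc models a P with its local, lock-free free list.
type proc struct {
	list *deadG
	n    int
}

// localLimit: once a P holds this many free goroutines, half spill
// to the scheduler's central list.
const localLimit = 64

// gfput pushes an exited goroutine onto the P's local list. No lock
// is needed: only the thread running on this P accesses it. When the
// list reaches localLimit, half of it moves to the central list.
func (p *proc) gfput(s *sched, gp *deadG) {
	gp.next = p.list
	p.list = gp
	p.n++
	if p.n >= localLimit {
		s.mu.Lock()
		for p.n >= localLimit/2 {
			gp := p.list
			p.list = gp.next
			p.n--
			gp.next = s.list
			s.list = gp
			s.n++
		}
		s.mu.Unlock()
	}
}

// gfget pops a free goroutine, refilling the local list from the
// central one when it is empty.
func (p *proc) gfget(s *sched) *deadG {
	if p.list == nil && s.list != nil {
		s.mu.Lock()
		for p.n < localLimit/2 && s.list != nil {
			gp := s.list
			s.list = gp.next
			s.n--
			gp.next = p.list
			p.list = gp
			p.n++
		}
		s.mu.Unlock()
	}
	gp := p.list
	if gp != nil {
		p.list = gp.next
		gp.next = nil
		p.n--
	}
	return gp
}

func main() {
	s := &sched{}
	p := &proc{}
	for i := 0; i < 100; i++ {
		p.gfput(s, &deadG{})
	}
	fmt.Println(p.n, s.n) // 34 66: part stays local, the rest spilled
}
```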

Although this workflow looks quite straightforward, it comes with some rules regarding the memory allocated to the goroutines.

Requirements

Recycling goroutines is a great way to save the cost of their allocation. However, since the stack grows dynamically, a goroutine that exits could have ended up with a large stack, depending on the work it did. For that reason, Go does not keep the stack when it has grown beyond its default size, i.e., more than 2 KB.
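The rule can be summed up in a tiny sketch (a hypothetical helper, not the runtime’s actual code): a cached goroutine keeps its stack only if the stack still has its default 2 KB size; a grown stack is freed before the goroutine joins the free list.

```go
package main

import "fmt"

// fixedStack is the default goroutine stack size in Go 1.13: 2 KB.
const fixedStack = 2 * 1024

// stackKept reports whether an exiting goroutine's stack is kept
// along with the goroutine in the free list.
func stackKept(stackSize int) bool {
	return stackSize == fixedStack
}

func main() {
	fmt.Println(stackKept(2 * 1024)) // true: default-size stack is cached
	fmt.Println(stackKept(8 * 1024)) // false: a grown stack is released
}
```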

In the example given, calculating the prime numbers is quite a light workload and does not grow the goroutines’ stacks, letting Go reuse them. Here are the benchmarks:

With recycling                     Without recycling
name           time/op             name           time/op
PrimeNumber    12.7s ± 3%          PrimeNumber    12.1s ± 3%
PrimeNumber-8  2.27s ± 4%          PrimeNumber-8  2.13s ± 3%

name           alloc/op            name           alloc/op
PrimeNumber    1.83MB ± 0%         PrimeNumber    5.82MB ± 4%
PrimeNumber-8  1.52MB ± 7%         PrimeNumber-8  5.90MB ±21%

We should note there is no native way to disable the recycling; for these benchmarks, I disabled the feature directly in the Go standard library. In this example, reusing the goroutines cut the allocations by a factor of three!

Let’s review another case where the goroutines use a larger stack and are therefore not suited for recycling:

func ping() {
	for i := 0; i < 10; i++ {
		var wg sync.WaitGroup
		for g := 0; g < 10; g++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				resp, err := http.Get("https://httpstat.us/200")
				if err != nil {
					panic(err)
				}
				resp.Body.Close()
			}()
		}
		wg.Wait()
	}
}

Here are the benchmarks, again with and without recycling:

With recycling                     Without recycling
name           time/op             name           time/op
PingUrl        12.8s ± 2%          PingUrl        12.8s ± 3%
PingUrl-8      12.6s ± 0%          PingUrl-8      12.7s ± 3%

name           alloc/op            name           alloc/op
PingUrl        9.21MB ± 0%         PingUrl        9.44MB ± 0%
PingUrl-8      9.28MB ± 0%         PingUrl-8      9.43MB ± 0%

The impact here is quite low, since the goroutines grew larger stacks and could not keep them for reuse. Depending on the nature of your program, goroutine recycling can bring a huge advantage.