Don't Build Workers on Top of Goroutines

Why creating worker pools is unnecessary when Go’s runtime already handles concurrency efficiently


Introduction #

I love reading source code. When looking at concurrent code, I often see people building worker systems on top of goroutines, using channels or slices as the queue. But if we look at Go's concurrency model, it is already a publisher-queue-worker model: we developers act as publishers when we create goroutines, and the runtime's OS threads pull those goroutines off a queue and execute them.

What makes Go attractive is that it solves these problems at the runtime level and gives us an easy interface to write concurrent code. Rebuilding this pattern at the application level is problematic for several reasons.

Real-World Example #

Here’s a real-world example from a prominent repository that demonstrates this pattern of building worker abstractions on top of goroutines:

package iter

import (
    "runtime"
    "sync/atomic"

    "github.com/sourcegraph/conc"
)

// defaultMaxGoroutines returns the default maximum number of
// goroutines to use within this package.
func defaultMaxGoroutines() int { return runtime.GOMAXPROCS(0) }

// Iterator can be used to configure the behaviour of ForEach
// and ForEachIdx. The zero value is safe to use with reasonable
// defaults.
//
// Iterator is also safe for reuse and concurrent use.
type Iterator[T any] struct {
    // MaxGoroutines controls the maximum number of goroutines
    // to use on this Iterator's methods.
    //
    // If unset, MaxGoroutines defaults to runtime.GOMAXPROCS(0).
    MaxGoroutines int
}

// ForEach executes f in parallel over each element in input.
//
// It is safe to mutate the input parameter, which makes it
// possible to map in place.
//
// ForEach always uses at most runtime.GOMAXPROCS goroutines.
// It takes roughly 2µs to start up the goroutines and adds
// an overhead of roughly 50ns per element of input. For
// a configurable goroutine limit, use a custom Iterator.
func ForEach[T any](input []T, f func(*T)) { Iterator[T]{}.ForEach(input, f) }

// ForEach executes f in parallel over each element in input,
// using up to the Iterator's configured maximum number of
// goroutines.
//
// It is safe to mutate the input parameter, which makes it
// possible to map in place.
//
// It takes roughly 2µs to start up the goroutines and adds
// an overhead of roughly 50ns per element of input.
func (iter Iterator[T]) ForEach(input []T, f func(*T)) {
    iter.ForEachIdx(input, func(_ int, t *T) {
       f(t)
    })
}

// ForEachIdx is the same as ForEach except it also provides the
// index of the element to the callback.
func ForEachIdx[T any](input []T, f func(int, *T)) { Iterator[T]{}.ForEachIdx(input, f) }

// ForEachIdx is the same as ForEach except it also provides the
// index of the element to the callback.
func (iter Iterator[T]) ForEachIdx(input []T, f func(int, *T)) {
    if iter.MaxGoroutines == 0 {
       // iter is a value receiver and is hence safe to mutate
       iter.MaxGoroutines = defaultMaxGoroutines()
    }

    numInput := len(input)
    if iter.MaxGoroutines > numInput {
       // No more concurrent tasks than the number of input items.
       iter.MaxGoroutines = numInput
    }

    var idx atomic.Int64
    // Create the task outside the loop to avoid extra closure allocations.
    task := func() {
       i := int(idx.Add(1) - 1)
       for ; i < numInput; i = int(idx.Add(1) - 1) {
          f(i, &input[i])
       }
    }

    var wg conc.WaitGroup
    for i := 0; i < iter.MaxGoroutines; i++ {
       wg.Go(task)
    }
    wg.Wait()
}

Look at all that code! It’s trying to solve a problem that Go’s runtime already handles efficiently. This implementation limits goroutines to GOMAXPROCS and creates a worker pool that processes items in a slice. But Go’s scheduler already does this work for you.

Why It’s Unnecessary #

There’s no need to solve a problem that’s already been solved elegantly at the platform level. Go’s concurrency model is specifically designed to handle the scheduling of goroutines efficiently.

Performance Pitfalls #

Go’s runtime is sophisticated and cleverly designed. Recreating this structure at the application level will almost certainly result in a less efficient implementation. The Go scheduler is highly optimized for efficiently managing thousands of goroutines across available CPU resources.

Maintenance Headaches #

The worker system we write needs ongoing maintenance because it’s now our code, and it invites many bugs. Creating a structure free from deadlocks and race conditions isn’t as easy as it seems. The Go team has already put significant effort into making the runtime scheduler robust.

Extra Mental Burden #

Anyone new to the codebase now has to program against our custom abstraction instead of Go’s runtime. This increases the cognitive load and makes the code harder to understand and maintain.

Drifting Away from the Problem Space #

Go encourages us to stay focused on the actual problem by modeling independent jobs as goroutines. A custom worker layer gets in the way of modeling the problem space directly, and the codebase ends up riddled with plumbing that has nothing to do with the actual problem.

A Better Approach #

Here’s a cleaner approach to handling concurrent operations:

package main

import (
    "fmt"
    "sync"
    "time"
)

// Customer holds the details we need to send a newsletter.
type Customer struct {
    ID    int
    Email string
    Name  string
    Type  string
}

func main() {
    // Let's imagine we need to send emails to multiple customers
    customers := []Customer{
        {1, "customer1@example.com", "Alex Johnson", "Premium"},
        {2, "customer2@example.com", "Sam Smith", "Regular"},
        {3, "customer3@example.com", "Taylor Brown", "Premium"},
        {4, "customer4@example.com", "Jordan Lee", "Trial"},
        {5, "customer5@example.com", "Casey Miller", "Regular"},
    }

    var wg sync.WaitGroup

    // Launch a goroutine for each email that needs to be sent
    for _, customer := range customers {
        wg.Add(1)
        // Simply launch the goroutine with our function
        go sendCustomerEmail(customer, &wg)
    }

    // Wait for all emails to be sent
    wg.Wait()
    fmt.Println("All emails have been sent!")
}

// sendCustomerEmail is a regular function that sends an email to a customer.
func sendCustomerEmail(c Customer, wg *sync.WaitGroup) {
    defer wg.Done()

    // Choose email template based on customer type
    var template string
    switch c.Type {
    case "Premium":
        template = "premium_newsletter.html"
    case "Trial":
        template = "trial_extension_offer.html"
    default:
        template = "regular_newsletter.html"
    }

    // Log the start of sending
    fmt.Printf("Sending %s to %s (%s)\n", template, c.Name, c.Email)

    // Simulate the actual email sending
    time.Sleep(time.Millisecond * 300) // Network/API delay

    // In a real app, we might have code like:
    // emailContent := generateEmailContent(template, c.Name)
    // response, err := emailClient.Send(c.Email, "March Newsletter", emailContent)

    // Log the completion
    fmt.Printf("✓ Email sent to %s successfully\n", c.Name)
}

This code is simpler, more readable, and takes full advantage of Go’s concurrency model without the unnecessary worker abstraction.

Actionable Takeaways #

  1. Trust Go’s runtime: The Go scheduler is highly optimized. Trust it to manage your goroutines efficiently.

  2. Think in terms of concurrent tasks: Model your problem around independent units of work rather than worker pools.

  3. Use goroutines liberally: Don’t be afraid to spawn a goroutine for each task - they’re designed to be lightweight.

  4. Focus on your problem domain: Spend your time solving business problems rather than rebuilding concurrency abstractions.

  5. Profile before optimizing: If you’re concerned about performance, profile your application first to identify real bottlenecks before adding complexity.

Reflections #

In the end, Go encourages us to create a goroutine for each separate piece of work without worry. Goroutines are lightweight and won't overload your machine, because the number running in parallel is bounded by GOMAXPROCS (by default, the number of logical CPUs). For example, if you create 10,000 goroutines on a machine with 4 logical CPUs, only 4 run in parallel at any instant, while all of them can still make progress concurrently (don't mix up parallelism and concurrency).

In fact, creating goroutines freely is the recommended practice: when a goroutine blocks on an I/O-bound operation, the runtime parks it and lets the underlying thread pick up other work instead of wasting cycles. This keeps every thread busy and helps you get the most out of your CPU.

Remember: Go’s philosophy is about simplicity. The runtime already provides an elegant solution for concurrent work - embrace it!