Why Are 0.5-Core Pods on k8s So Busy?

Understanding CPU resource constraints and Golang scheduler behavior in Kubernetes pods.

Context

Our CTO mentioned during a code review that our services run on 0.5-core Kubernetes pods and advised adjusting runtime parameters. Here’s what happened next:

  1. Observation: Our Go services run in pods limited to 0.5 CPU cores, yet the runtime reported GOMAXPROCS=8, matching the host machine's 8 cores.
  2. Problem: The mismatch caused excess CPU usage: eight OS threads contended for half a core, wasting cycles on context switches.
  3. Solution: Using Uber's automaxprocs to align GOMAXPROCS with the pod's actual CPU limit resolved the issue (a minimal sketch follows).
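
A minimal sketch of the fix, assuming the standard go.uber.org/automaxprocs import path; the surrounding main is illustrative:

```go
package main

import (
	"fmt"
	"runtime"

	// Blank import: the package's init() reads the container's cgroup
	// CPU quota and lowers GOMAXPROCS to match it (0.5 cores rounds up to 1).
	_ "go.uber.org/automaxprocs"
)

func main() {
	// Inside a 0.5-core pod this now prints 1 instead of the host's 8.
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0))
}
```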


Why Does GOMAXPROCS Matter in Containers?

The GMP Scheduler in Go

  • G (Goroutine): Lightweight thread managed by Go runtime.
  • M (Machine): OS-level thread.
  • P (Processor): Logical processor that schedules goroutines onto Ms; the number of Ps equals GOMAXPROCS and caps how many goroutines run in parallel (the snippet below shows how to inspect it).
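
To see the defaults from inside a pod, a quick check (a sketch; the printed values depend on your host):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// NumCPU reports the host's cores, not the pod's 0.5-core limit.
	fmt.Println("NumCPU     =", runtime.NumCPU()) // e.g. 8
	// GOMAXPROCS(0) reads the current value without changing it;
	// by default it equals NumCPU, so the runtime creates 8 Ps.
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0)) // e.g. 8
}
```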

When GOMAXPROCS is set far higher than the CPU actually available (e.g., 8 in a 0.5-core pod):

  • Go creates 8 Ps, each backed by an OS thread competing for half a core of CPU time.
  • Frequent context switching between those threads wastes cycles, and the cgroup CPU quota throttles the pod once its 0.5-core share is spent (the sketch below reproduces this).
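
A sketch to reproduce the overhead inside a CPU-limited pod: run identical CPU-bound work under GOMAXPROCS=1 and GOMAXPROCS=8 and compare wall time. The workload and iteration count are made up for illustration; on a 0.5-core pod the 8-P run typically comes out slower due to context switches and quota throttling:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// sink keeps the compiler from discarding the work below.
var sink int

// spin performs a fixed amount of CPU-bound work.
func spin(iters int) {
	s := 0
	for i := 0; i < iters; i++ {
		s += i * i
	}
	sink = s
}

func main() {
	const work = 200_000_000 // illustrative; tune for your pod

	for _, procs := range []int{1, 8} {
		runtime.GOMAXPROCS(procs)
		start := time.Now()

		var wg sync.WaitGroup
		for i := 0; i < 8; i++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				spin(work)
			}()
		}
		wg.Wait()

		fmt.Printf("GOMAXPROCS=%d: %v\n", procs, time.Since(start))
	}
}
```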


Q: If GOMAXPROCS=1, Can Go Still Handle High Concurrency?

Yes! Here’s why (a runnable sketch follows the list):

  1. Network I/O Efficiency:

    • Go’s netpoller relies on epoll (Linux), kqueue (macOS/BSD), or IOCP (Windows) for non-blocking I/O.
    • Goroutines waiting on the network are parked; they hold no P and consume no CPU while they wait.
  2. Handling Blocking System Calls:

    • When a goroutine makes a blocking syscall (e.g., file I/O):
      1. The M (thread) running it detaches from its P, taking the blocked goroutine with it.
      2. Another M takes over the P and continues executing other goroutines.
      3. When the syscall returns, the goroutine rejoins a run queue and its M is parked or reused.
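
A sketch of point 1 with a single P: time.Sleep stands in for a network round trip (a real socket parks on the netpoller the same way), so a thousand concurrent waits overlap instead of serializing:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	runtime.GOMAXPROCS(1) // a single P, as in a tightly limited pod

	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Stand-in for a network round trip: the goroutine parks
			// and releases the P while it waits.
			time.Sleep(100 * time.Millisecond)
		}()
	}
	wg.Wait()

	// Prints roughly 100ms, not 1000 x 100ms: all the waits overlapped
	// even though only one goroutine executes at a time.
	fmt.Printf("1000 waits finished in %v\n", time.Since(start))
}
```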


Key Takeaways

  1. Always set GOMAXPROCS to match the container’s CPU limit (use automaxprocs in Kubernetes; Go 1.25+ also applies cgroup CPU limits by default).
  2. Go excels at I/O-bound tasks: Even with limited cores, non-blocking I/O allows high concurrency.
  3. Avoid overscheduling: Excess Ps in CPU-constrained environments lead to context-switching overhead.
