Kubernetes Resource Limits: The Balancing Act Between Predictability and Efficiency
ayedo Editorial Team, 3 minute read


Discover why resource limits in Kubernetes are crucial for stable applications, even if they sometimes feel like an obstacle.

There is a lot of discussion about whether forgoing Kubernetes resource limits might actually be beneficial (for example, in articles like "For the Love of God, Stop Using CPU Limits on Kubernetes" or "Kubernetes: Make your services faster by removing CPU limits"). The arguments are certainly valid: it makes little sense to pay for computing power that goes unused because of restrictions, or to artificially increase latency. However, this article aims to show that limits have legitimate advantages of their own.
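To make the debate concrete, here is a minimal sketch of the stanza in question; the pod name, image, and values are hypothetical. Advocates of removing CPU limits would drop the limits entries and keep only the requests.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                   # hypothetical name
spec:
  containers:
    - name: app
      image: example.org/app:latest   # hypothetical image
      resources:
        requests:        # guaranteed minimum, used by the scheduler
          cpu: 500m
          memory: 256Mi
        limits:          # hard ceiling enforced via cgroups by the kubelet
          cpu: "1"
          memory: 512Mi
```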

As a Site Reliability Engineer on the Grafana Labs team that maintains and improves the internal infrastructure and tooling for the product teams, my primary goal is to make Kubernetes upgrades as smooth as possible. But I also spend a lot of time debugging various interesting Kubernetes issues. This article reflects my personal opinion, and others in the community may well disagree.

Let’s look at the issue from the other side. Every pod in a Kubernetes cluster has inherent resource limits: the actual CPU, memory, and other resources of the machine it runs on. If a pod exceeds these physical limits, it gets throttled, much as it would on reaching its Kubernetes limits.
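Explicit limits simply make that inherent ceiling visible and predictable. As a minimal sketch (a container-spec fragment with hypothetical values): setting limits equal to requests puts the pod in the Guaranteed QoS class, so its resource envelope no longer depends on which neighbors happen to land on the same node.

```yaml
resources:
  requests:
    cpu: "1"
    memory: 1Gi
  limits:          # equal to requests -> Guaranteed QoS class
    cpu: "1"
    memory: 1Gi
```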

The Problem

Pods without limits (or with generous ones) can easily consume the spare resources on a node. But this comes at a hidden cost: how much spare capacity is available depends heavily on which pods are scheduled on that particular node and on their actual usage. The spare resources make every pod a special case when it comes to actual resource allocation. Worse still, it is quite hard to determine what resources a pod had at its disposal at any given moment, certainly not without cumbersome data mining of the pods running on a specific node, their resource usage, and so on. And even if we clear that hurdle, we can only sample data at a certain rate and obtain profiles for only a fraction of our calls. While this can be scaled up, the volume of observability data it generates quickly hits diminishing returns. As a result, there is no easy way to tell whether a pod briefly spiked and used twice its usual memory to handle a burst of requests.
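One way to approximate this visibility, sketched below under the assumption that the Prometheus Operator and cadvisor metrics are available in the cluster (the rule name is hypothetical), is a recording rule that tracks each container's peak working-set memory. Even then, spikes shorter than the scrape interval remain invisible, which is exactly the gap described above.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: memory-spike-visibility   # hypothetical name
spec:
  groups:
    - name: capacity
      rules:
        # Record each container's peak working-set memory over the last hour.
        # Bursts between scrape intervals still go unseen.
        - record: container:memory_working_set_bytes:max1h
          expr: max_over_time(container_memory_working_set_bytes[1h])
```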

Now, with Black Friday and Cyber Monday around the corner, companies expect a surge in traffic. Good performance data and benchmarks of past behavior let them plan for additional capacity. But is data from pods without limits reliable? With momentary spikes in memory or CPU absorbed by the spare resources, everything may look fine in the historical data. Yet once the bin-packing of pods changes and the spare resources become scarce, the picture can change entirely: from a barely noticeable increase in request latency to requests slowly piling up and eventually triggering OOM kills of the pods. While hardly anyone cares about the former, the latter is a serious problem that demands an immediate capacity increase.
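For this kind of capacity planning, one option, sketched here assuming the Vertical Pod Autoscaler CRDs are installed and using a hypothetical Deployment name, is to run the VPA in recommendation-only mode: it derives resource recommendations from observed usage without ever evicting or resizing pods, which helps sanity-check whether past data can be trusted for peak planning.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: checkout-vpa      # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout        # hypothetical workload
  updatePolicy:
    updateMode: "Off"     # recommend only; never evict or resize pods
```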

The discussion about resource limits in Kubernetes matters because it is ultimately about balancing predictability and efficiency in your infrastructure. If you are looking for support in implementing Kubernetes resource limits, ayedo is the right partner for you.


Source: Kubernetes Blog
