K8s Monitor Pod CPU and memory usage with Prometheus
Find out how many resources your Kubernetes Pods actually use and visualise CPU throttling.

Parts
- Manually monitor pod resources (this article)
- Automatically set pod resources with Vertical Pod Autoscaling
Related
- Practical Guide to Kubernetes Horizontal Pod Autoscaling
- Practical Guide to Kubernetes Node Autoscaling
What will we do here?
We’ll walk through the steps necessary to monitor how many resources (CPU or memory) a Kubernetes pod is using. We’ll look at:
- CPU requests / limits / actual usage / throttling
- Memory requests / limits / actual usage / termination
We do this using metrics-server, Grafana and Prometheus.
Resource requests and limits
What are these?
This great blog post and video will get you up to date.
Why do we need these?
When you create a new application or migrate an existing one to Kubernetes, you might not know how many resources it needs. However, Kubernetes works best if every pod (more precisely, every container in every pod) has resource requests and limits defined, because the scheduler uses the requests to place pods on nodes.
Having requests and limits defined results in more effective use of the available resources in the cluster.
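As a rough illustration of what "requests and limits defined" looks like per container, here is a sketch of the resources stanza written as a Python dictionary and printed as JSON. The CPU and memory values are the ones used later in this article; the container name `compute` is an assumption for illustration and is not taken from the repo.

```python
import json

# Hypothetical container spec fragment. Only the "resources" values
# come from this article; the container name is illustrative.
container = {
    "name": "compute",
    "resources": {
        # What the scheduler reserves when placing the pod on a node.
        "requests": {"cpu": "500m", "memory": "250Mi"},
        # Upper bounds enforced while the container is running.
        "limits": {"cpu": "700m", "memory": "500Mi"},
    },
}

print(json.dumps(container, indent=2))
```

The same structure appears under `spec.containers[]` in a pod or deployment manifest.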
Unwanted CPU throttling BUG
Setting any CPU limits can cause unwanted CPU throttling, even if the usage doesn’t reach the limit. Read more about this in my article Kubernetes Resource Management in Production.
Test Repo / App
You can follow along and test this yourself using this repo: https://github.com/wuestkamp/k8s-example-resource-monitoring
Image
The application consists of a simple deployment using the image gcr.io/kubernetes-e2e-test-images/resource-consumer:1.5. It provides an HTTP endpoint and can receive commands to use resources:
curl --data "millicores=400&durationSec=600" 10.12.0.11:8080/ConsumeCPU
curl --data "megabytes=300&durationSec=600" 10.12.0.11:8080/ConsumeMem
This allows us to manipulate the CPU and memory usage in a running pod (more here).
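If you prefer to script these calls, the following is a minimal sketch that builds the same POST requests in Python. The endpoint paths and form fields are the ones from the curl commands above; the helper name `build_consume_request` is hypothetical, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_consume_request(base_url, kind, amount, duration_sec):
    """Build (but do not send) a POST request for the resource-consumer app."""
    if kind == "cpu":
        path, field = "ConsumeCPU", "millicores"
    elif kind == "mem":
        path, field = "ConsumeMem", "megabytes"
    else:
        raise ValueError(f"unknown kind: {kind}")
    # Form-encode the body the same way curl --data does.
    body = urlencode({field: amount, "durationSec": duration_sec}).encode()
    return Request(f"{base_url}/{path}", data=body, method="POST")


req = build_consume_request("http://10.12.0.11:8080", "cpu", 400, 600)
print(req.full_url, req.data.decode())
```

Passing the returned object to `urllib.request.urlopen` would send it, assuming the pod IP is reachable from where the script runs.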
View Grafana Dashboard
The test app comes with Grafana+Prometheus installed and configured. There is also an existing dashboard (i/grafana/dashboard.json) which shows the CPU and memory data.

Kubernetes 1.16 changed metrics
Removed cadvisor metric labels pod_name and container_name to match instrumentation guidelines. Any Prometheus queries that match pod_name and container_name labels (e.g. cadvisor or kubelet probe metrics) must be updated to use pod and container instead. (source)
If you’re using Kubernetes 1.16 and above you’ll have to use pod instead of pod_name and container instead of container_name.
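To make the label switch concrete, here is a small sketch that generates the CPU usage query for either label scheme. The function names are hypothetical helpers written for this article; only the metric name, the labels and the `compute-.*` pod pattern come from the queries used here.

```python
def cadvisor_labels(k8s_minor):
    """Return (pod_label, container_label) for a Kubernetes 1.x minor version."""
    if k8s_minor >= 16:
        return "pod", "container"
    return "pod_name", "container_name"


def cpu_usage_query(k8s_minor, pod_regex="compute-.*", window="5m"):
    """Build the cadvisor CPU usage query with the right label names."""
    pod, container = cadvisor_labels(k8s_minor)
    selector = f'{pod}=~"{pod_regex}", image!="", {container}!="POD"'
    return f"rate(container_cpu_usage_seconds_total{{{selector}}}[{window}])"


print(cpu_usage_query(15))  # labels for Kubernetes up to 1.15
print(cpu_usage_query(16))  # labels for Kubernetes 1.16 and above
```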
CPU
We use the following Prometheus queries:
# metrics are for k8s till 1.15
# for >=1.16 use pod instead of pod_name and container instead of container_name
# container usage
rate(container_cpu_usage_seconds_total{pod_name=~"compute-.*", image!="", container_name!="POD"}[5m])
# container requests
avg(kube_pod_container_resource_requests_cpu_cores{pod=~"compute-.*"})
# container limits
avg(kube_pod_container_resource_limits_cpu_cores{pod=~"compute-.*"})
# throttling
rate(container_cpu_cfs_throttled_seconds_total{pod_name=~"compute-.*", container_name!="POD", image!=""}[5m])
Regarding the units (more):
- 500m = 500 millicores = 0.5 core
- 500m = 500 millicpu = 0.5 cpu
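The requests and limits metrics above report cores, so a manifest value of 500m shows up as 0.5 in Grafana. A minimal sketch of that conversion, with `cpu_to_cores` as a hypothetical helper name:

```python
def cpu_to_cores(quantity):
    """Convert a CPU quantity such as "500m" or "0.5" to cores."""
    if quantity.endswith("m"):
        # The "m" suffix means millicores: 1000m = 1 core.
        return int(quantity[:-1]) / 1000
    return float(quantity)


for q in ("500m", "700m", "1", "0.5"):
    print(q, "=", cpu_to_cores(q), "cores")
```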
No usage

The image above shows the pod requests of 500m (green) and limits of 700m (yellow). It also shows that the pod currently is not using any CPU (blue) and hence nothing is throttled (red).
Usage in the limit range
We now raise the CPU usage of our pod to 600m:

The image above shows the CPU usage (blue) rising to 600m. We’re below the defined limit (yellow) and hence see no throttling (red).
Usage above limits

The image above shows the pod’s container now tries to use 1000m (blue) but this is limited to 700m (yellow). Because of the limits we see throttling going on (red). The pod uses 700m and is throttled by 300m which sums up to the 1000m it tries to use.
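The arithmetic in that paragraph can be sketched as below. This is a simplified model that follows the reading of the graph given here; real CFS throttling is accounted per scheduling period, so measured values will not match it exactly. The function name `split_demand` is hypothetical.

```python
def split_demand(attempted_m, limit_m):
    """Split attempted CPU (millicores) into the part that runs and the
    part held back by the limit, under a simplified model."""
    used = min(attempted_m, limit_m)
    throttled = max(attempted_m - limit_m, 0)
    return used, throttled


print(split_demand(1000, 700))  # usage above the limit
print(split_demand(600, 700))   # usage within the limit
```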

If we reduce the pod’s CPU usage down to 500m (blue), same value as the requests (green), we see that throttling (red) is down to 0 again.
We want to avoid CPU throttling for optimal efficiency.
Memory
We use the following Prometheus queries:
# metrics are for k8s till 1.15
# for >=1.16 use pod instead of pod_name and container instead of container_name
# container usage
container_memory_working_set_bytes{pod_name=~"compute-.*", image!="", container_name!="POD"}
# container requests
avg(kube_pod_container_resource_requests_memory_bytes{pod=~"compute-.*"})
# container limits
avg(kube_pod_container_resource_limits_memory_bytes{pod=~"compute-.*"})
No usage

In the image above we see that the pod requests 250Mi (green), has a limit of 500Mi (yellow) and uses no memory (blue).
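The memory metrics are reported in bytes, so it helps to know what 250Mi and 500Mi look like as raw values. A small sketch of the conversion, with `memory_to_bytes` as a hypothetical helper that only handles the binary suffixes used here:

```python
# Binary (power-of-two) suffixes as used in Kubernetes memory quantities.
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def memory_to_bytes(quantity):
    """Convert a quantity such as "250Mi" to bytes; plain numbers pass through."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


for q in ("250Mi", "500Mi"):
    print(q, "=", memory_to_bytes(q), "bytes")
```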
Usage in the limit range
We now raise the memory usage to values lower than the defined limit:


Usage above the limit

When a container tries to allocate more memory than its limit, the Linux kernel’s OOM killer terminates the process that caused it (signal 9). If that process is the container’s entrypoint, the container is restarted. In the image above, only a subprocess was killed, not the main process. Hence we see no container restarts, just the memory usage dropping to 0.
This causes a warning event (kubectl get events):
default 22s Warning OOMKilling node/gke-resources-test-default-pool-6cad87bd-bgf4 Memory cgroup out of memory: Kill process 134119 (stress) score 1962 or sacrifice child
Killed process 134119 (stress) total-vm:519288kB, anon-rss:508260kB, file-rss:268kB, shmem-rss:0kB
To import these k8s events into Prometheus/Grafana, for example to set up Prometheus alerting, the event_exporter can be used.
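The kernel message in that event carries useful numbers. As a sketch, the snippet below pulls the process id, name and memory counters out of the "Killed process" line with a regular expression. The function name `parse_oom_kill` is hypothetical, and the pattern is only fitted to the message format shown in this article; other kernel versions may word it differently.

```python
import re

MESSAGE = (
    "Killed process 134119 (stress) total-vm:519288kB, "
    "anon-rss:508260kB, file-rss:268kB, shmem-rss:0kB"
)


def parse_oom_kill(message):
    """Extract pid, process name and memory counters (in kB) from an OOM line."""
    head = re.search(r"Killed process (\d+) \(([^)]+)\)", message)
    if head is None:
        return None
    # Collect every "name:<number>kB" pair, e.g. anon-rss:508260kB.
    counters = {k: int(v) for k, v in re.findall(r"([\w-]+):(\d+)kB", message)}
    return {"pid": int(head.group(1)), "name": head.group(2), **counters}


info = parse_oom_kill(MESSAGE)
print(info)
print(round(info["anon-rss"] / 1024), "MiB resident")
```

The resident size works out to roughly 496 MiB, which is just under the 500Mi limit set for the container.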
Read more
https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource
https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource
What's next?
Next, I’ll look into the concept of vertically scaling pod resources, how it works with Kubernetes and how to visualise it using Grafana+Prometheus.
In the end, I guess it would be great to have the cluster automatically adjust requests+limits based on the application’s needs, at least as far as that’s possible.