How to Massively Reduce Prometheus Load and Cardinality by Keeping Only the Istio Labels You Need
If you have spent any time operating Prometheus, you have likely heard that managing cardinality is critical. It is one of the most impactful aspects of an observability configuration, and runaway cardinality can drastically increase system load. Robust Perception has a great explainer on why it matters, so I won't rehash the why here; instead, I will explain one way to tune your metrics so that you can wrangle some of your most expensive ones.
Specifically, I am going to talk about Istio. We are currently running v1.18, so please keep in mind that some of the fields I refer to may look different in a prior version (or a newer one, for that matter). I had a difficult time finding examples for this version, so my hope is that these will be useful to you.
Onto the Fun Part
First and foremost, we're talking about Istio (and by proxy, Kubernetes). The specific component we are tuning is the Istio Operator. If you're not too familiar with it, the Istio Operator is a Kubernetes controller that manages the IstioOperator custom resource. In turn, the operator creates and continually reconciles the Istio resources on the cluster, including Gateway resources, EnvoyFilters, Pilot (istiod), and so on.
The Istio documentation does reference the feature we're working with, but only in a single line, without any examples of how to use it:

You can modify the standard metric definitions using tags_to_remove or by re-defining a dimension.
OK, so we have an IstioOperator manifest that we need to configure, and we know we will use the tags_to_remove option to drop metric labels. Easy enough.
Note: When we are talking about updating Istio configurations, you want to ALWAYS test this in a non-production environment. Even though we are talking about metric labels — there can still be unintended impact. For example, if you are using custom metrics to define scaling behavior, you could very well break scaling behavior across an entire cluster. That is just one example, but keep that in mind.
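To make that example a bit more concrete: imagine a purely hypothetical HorizontalPodAutoscaler that scales on an Istio-derived custom metric and filters on destination_service_name. The metric name, the prometheus-adapter setup it assumes, and the workload names below are my own assumptions for illustration, but if that label landed in your tags_to_remove list, this HPA would quietly stop getting data:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                    # hypothetical workload
  namespace: my-app-production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: istio_requests_per_second   # assumes an adapter rule exposes this custom metric
          selector:
            matchLabels:
              destination_service_name: my-app   # breaks if this label is removed from the metric
        target:
          type: AverageValue
          averageValue: "50"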
Now, what is not straightforward is how this configuration change looks in practice. So here it is. The configuration you want is set under both the inboundSidecar and outboundSidecar properties (the example below shows the inbound block; a sketch of the outbound counterpart follows it). The full paths in the IstioOperator manifest look like this:
spec.values.telemetry.v2.prometheus.configOverride.inboundSidecar.metrics
spec.values.telemetry.v2.prometheus.configOverride.outboundSidecar.metrics
A more complete example looks like this (note that this is simplified to only include necessary configurations for this story):
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: ${istio_namespace}
  name: ${istio_name}
spec:
  values:
    telemetry:
      enabled: true
      v2:
        prometheus:
          wasmEnabled: true
          configOverride:
            inboundSidecar:
              metrics:
                - name: requests_total
                  tags_to_remove:
                    - connection_security_policy
                    - destination_cluster
                    - destination_canonical_revision
                    - destination_canonical_service
                    - destination_principal
                    - destination_version
                    - destination_service_name
                    - destination_service_namespace
                    - destination_workload_namespace
                    - source_canonical_service
                    - source_canonical_revision
                    - source_workload_namespace
                    - source_version
                    - source_cluster
                    - source_principal
                - name: request_bytes
                  tags_to_remove:
                    - destination_cluster
                    - destination_canonical_revision
                    - destination_canonical_service
                    - destination_principal
                    - destination_version
                    - destination_service_name
                    - destination_service_namespace
                    - destination_workload_namespace
                    - source_canonical_service
                    - source_canonical_revision
                    - source_workload_namespace
                    - source_version
                    - source_cluster
                    - source_principal
                    - service
                    - pod
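The example above only configures inboundSidecar. Since the same override belongs on outboundSidecar as well, that block sits right alongside inboundSidecar under configOverride and mirrors it. An abbreviated sketch (reusing the inbound label list is simply my choice here; trim whichever labels make sense for your outbound traffic):

            outboundSidecar:
              metrics:
                - name: requests_total
                  tags_to_remove:
                    # abbreviated; in practice this mirrors the full list shown for inboundSidecar
                    - connection_security_policy
                    - destination_canonical_revision
                    - destination_canonical_service
                    - source_canonical_revision
                    - source_canonical_service
                    - source_principal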
What Now?
So the Istio Operator configuration is updated. Great! But… what now?
Presumably you have access to a non-production environment where you can apply this updated manifest to the Kubernetes cluster. Tail the logs of your Istio Operator pod and you can watch it synchronize the cluster resources; specifically, it should update your EnvoyFilters.
Assuming you don’t have any syntax issues, you would see something like this in the logs:
2024-03-18T17:54:21.303088Z info installer The following objects differ between generated manifest and cache:
- ConfigMap:istio-system:istio
- EnvoyFilter:istio-system:stats-filter-1.16
- EnvoyFilter:istio-system:tcp-stats-filter-1.16
- EnvoyFilter:istio-system:stats-filter-1.17
- EnvoyFilter:istio-system:tcp-stats-filter-1.17
- EnvoyFilter:istio-system:stats-filter-1.18
- EnvoyFilter:istio-system:tcp-stats-filter-1.18
2024-03-18T17:54:21.303181Z info installer using server side apply to update obj: EnvoyFilter/istio-system/stats-filter-1.16
- Pruning removed resources
2024-03-18T17:54:21.333325Z info installer using server side apply to update obj: EnvoyFilter/istio-system/stats-filter-1.17
2024-03-18T17:54:21.352880Z info installer using server side apply to update obj: EnvoyFilter/istio-system/stats-filter-1.18
2024-03-18T17:54:21.372182Z info installer using server side apply to update obj: EnvoyFilter/istio-system/tcp-stats-filter-1.16
2024-03-18T17:54:21.392004Z info installer using server side apply to update obj: EnvoyFilter/istio-system/tcp-stats-filter-1.17
2024-03-18T17:54:21.409610Z info installer using server side apply to update obj: EnvoyFilter/istio-system/tcp-stats-filter-1.18
2024-03-18T17:54:21.433381Z info installer using server side apply to update obj: ConfigMap/istio-system/istio
Out of an overabundance of caution on my part, I also took the opportunity to restart everything I had in the istio-system namespace, including all my gateways and istiod. I don't know if that's totally necessary, but it's something I did. In every cluster where I applied the change, I shifted traffic off, made the change, and restarted.
One more thing on applying the changes: I observed in some of my namespaces that the metric label updates didn't take effect immediately. It stands to reason that the istio-proxy sidecar containers may need to restart before they actually pick up the changes to the EnvoyFilters. I don't know if that's 100% true, but again, it was my observation in some of my namespaces. That doesn't strike me as unusual, but you will want to be aware of it and validate your changes as you make them.
Parting Words and Examples
If your Kubernetes cluster facilitates a very large number of requests to services, the gains you will see from reducing cardinality will be significant. It takes time to properly vet your labels (check your PrometheusRule resources, dashboards, etc.), but the exercise is worthwhile.
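One low-effort way to watch the impact as you roll this out is to record the series counts for the affected metrics and graph them over time. A minimal sketch, assuming you run the Prometheus Operator and its PrometheusRule CRD (the rule and namespace names are my own):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: istio-cardinality-tracking
  namespace: monitoring
spec:
  groups:
    - name: istio-cardinality
      rules:
        # number of active series for the Istio request counter
        - record: istio_requests_total:series:count
          expr: count(istio_requests_total)
        # number of active series for the request size histogram buckets
        - record: istio_request_bytes_bucket:series:count
          expr: count(istio_request_bytes_bucket)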
When we cut down our high-cardinality labels from this:
{
"container":"istio-proxy",
"destination_canonical_revision":"<dstUID>",
"destination_canonical_service":"<dstApp>",
"destination_cluster":"Kubernetes",
"destination_service":"<dstApp>.<dstApp>-production.svc.cluster.local",
"destination_service_name":"PassthroughCluster",
"destination_service_namespace":"<dstApp>-production",
"destination_version":"<dstUID>",
"destination_workload":"<dstUID>",
"destination_workload_namespace":"<dstApp>-production",
"instance":"<someIP>",
"job":"<appName>",
"le":"0.5",
"namespace":"<appName>-production",
"pod":"<srcUID>-kkd77",
"reporter":"source",
"request_protocol":"http",
"response_code":"200",
"response_flags":"-",
"service":"<appName>",
"source_app":"<appName>",
"source_canonical_revision":"<srcUID>",
"source_canonical_service":"<appName}",
"source_cluster":"Kubernetes",
"tag":"<srcUID>"
}
to this:
{
"reporter": "destination",
"source_workload": "<srcUID>",
"source_app": "<srcApp>",
"destination_workload": "<dstUID>",
"destination_app": "<dstApp>",
"destination_service": "<dstApp>.<dstApp>-production.svc.cluster.local",
"request_protocol": "http",
"job":"<appName>",
"response_code": "200",
"grpc_response_status": "",
"response_flags": "-",
"le": "1"
}
there was a massive drop in S3 storage costs, even after just a few days. We also saw improved performance and reduced resource usage on our Prometheus instances. In other words, identifying the unnecessary labels and removing them from your per-request metrics is absolutely a worthwhile exercise.
I hope this walkthrough helps some of my fellow Prometheus operators optimize their observability stacks!