Multi-cluster Kubernetes Management Solutions
An overview of popular SaaS solutions including Rancher, Google Anthos, Azure Arc, and Volterra as well as open-source alternatives.
As more organizations migrate their infrastructure to Kubernetes, the question is no longer just “How do I manage all of my applications on a single Kubernetes cluster?” More cluster administrators are now grappling with how to manage multiple clusters within their organization. While Kubernetes supports namespaces for soft isolation and virtual clusters for hard multi-tenancy within a single cluster, running multiple clusters is sometimes required.
The most common reasons for running multiple clusters include:
- Strict isolation: this may be driven by compliance (e.g. separating dev/QA/staging clusters from production) or by customer need (e.g. running a dedicated service at a customer's request)
- Multi-region: the application may need to run in multiple regions for availability, failover, latency, or data locality (e.g. data protection laws)
- Multi-cloud: similarly, the application may need to run on multiple clouds for availability or disaster recovery, or to avoid vendor lock-in
- Scale: on rare occasions, the service may hit the scaling limits of Kubernetes itself (e.g. 150k pods in a cluster for GKE)
To a certain degree, managing multiple clusters can be achieved with a good CI/CD pipeline. For example, in Kubernetes CI/CD with CircleCI and ArgoCD, I introduced the app-of-apps pattern to bootstrap a single cluster. That approach can be extended to pipelines that deploy to different clusters to simplify the workflow.
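As a rough illustration of extending a pipeline to several clusters, the sketch below generates one Argo CD Application manifest per target cluster. The manifest fields follow the Argo CD Application resource; the cluster names, API server URLs, repository URL, and directory layout are hypothetical placeholders, and the `argocd` namespace and `default` project are assumed defaults rather than anything prescribed by the original setup.

```python
import json

# Hypothetical cluster registry: names and API server URLs are placeholders.
CLUSTERS = {
    "staging-us": "https://staging-us.example.com:6443",
    "prod-us": "https://prod-us.example.com:6443",
    "prod-eu": "https://prod-eu.example.com:6443",
}


def build_application(cluster_name, server_url, repo_url, path, revision="HEAD"):
    """Return an Argo CD Application manifest (as a dict) targeting one cluster."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumes Argo CD is installed in the conventional "argocd" namespace.
        "metadata": {"name": f"apps-{cluster_name}", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": revision,
            },
            "destination": {"server": server_url, "namespace": "argocd"},
            # Automated sync with pruning and self-healing enabled.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


def build_all(clusters, repo_url, base_path):
    """Build one Application per cluster, each pointing at its own directory."""
    return [
        build_application(name, url, repo_url, f"{base_path}/{name}")
        for name, url in sorted(clusters.items())
    ]


if __name__ == "__main__":
    # JSON is valid YAML, so the output can be piped to `kubectl apply -f -`.
    for manifest in build_all(CLUSTERS, "https://example.com/org/gitops.git", "clusters"):
        print(json.dumps(manifest, indent=2))
```

Keeping a per-cluster directory in the Git repository is only one possible layout; a shared base with per-cluster overlays would work just as well.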
However, a true multi-cluster management solution requires more than application deployment. There are security considerations (e.g. RBAC, multi-cluster logging), config/secret management, as well as feature parity across different Kubernetes clusters. In this post, we’ll look at some popular hosted solutions as well as open-source projects to manage multiple Kubernetes clusters.
Rancher

Before Google announced Anthos and AWS introduced EKS Anywhere, Rancher was one of the few enterprise-ready options, along with OpenShift and Cloud Foundry, to support running Kubernetes on hybrid and multi-cloud infrastructures. Rancher provides a single control plane to create new Kubernetes clusters or import existing ones.
Multi-cluster apps have been supported since Rancher v2.2.0, and a newer project called Fleet (multi-cluster app deployment based on GitOps principles) is now available as of Rancher v2.5. Applications distributed across multiple clusters can also benefit from Rancher’s Global DNS to configure load balancing across apps without having to use an external solution like Cloudflare.

All of Rancher Labs’ projects, including Rancher, RKE, Longhorn, and K3s, are open-source, but they also provide a hosted version if you need a SaaS solution.
Google Anthos
When Thomas Kurian took over Google Cloud, he carried over a vision for a multi-cloud strategy from Oracle, which I detailed in “Why BigQuery Omni is a Big Deal.” Google Anthos, a Kubernetes-based, open platform to extend GKE to hybrid and multi-cloud environments, fits this strategy perfectly. It’s no secret that Google is a leader in Kubernetes, and for existing GKE users and new organizations looking to adopt Kubernetes, Anthos is an enticing option.

Anthos organizes clusters into environs, a logical grouping of clusters and resources (e.g. workload identities, namespaces, services) that can be managed together. Another powerful feature of Anthos is Anthos Config Management, which is comprised of the following components:
- Config Sync: follows the GitOps model to continuously sync configuration across multiple clusters (e.g. namespaces, cluster roles, security policies)
- Policy Controller: based on the Open Policy Agent Gatekeeper project to enforce policies (e.g. rejecting non-compliant API requests)
- Binary Authorization: requires that images running in the cluster are signed by trusted authorities
- Hierarchy Controller: based on the Hierarchical Namespace Controller project to create namespaces that share a common parent namespace for inheritance or delegation of control
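To make the GitOps model behind a tool like Config Sync more concrete, here is a toy sketch of the underlying idea: compare the desired configuration held in Git with what each cluster actually runs, and derive what must be created, updated, or deleted. This is not how Config Sync is implemented; the function names, object keys, and sample data are all hypothetical.

```python
def plan_sync(desired, actual):
    """Diff desired objects (from Git) against one cluster's actual objects."""
    create = sorted(name for name in desired if name not in actual)
    update = sorted(
        name for name in desired
        if name in actual and desired[name] != actual[name]
    )
    delete = sorted(name for name in actual if name not in desired)
    return {"create": create, "update": update, "delete": delete}


def plan_fleet(desired, clusters):
    """Compute a sync plan for every cluster from the same desired state."""
    return {name: plan_sync(desired, actual) for name, actual in clusters.items()}


# Hypothetical desired state, keyed by "Kind/name".
DESIRED = {
    "Namespace/team-a": {"labels": {"owner": "team-a"}},
    "ClusterRole/viewer": {"verbs": ["get", "list"]},
}

# Hypothetical observed state of two clusters.
CLUSTERS = {
    "cluster-1": {
        "Namespace/team-a": {"labels": {"owner": "team-a"}},
    },
    "cluster-2": {
        "Namespace/team-a": {"labels": {"owner": "team-b"}},
        "ClusterRole/viewer": {"verbs": ["get", "list"]},
        "Namespace/scratch": {"labels": {}},
    },
}

if __name__ == "__main__":
    for cluster, plan in plan_fleet(DESIRED, CLUSTERS).items():
        print(cluster, plan)
```

In this sample, cluster-1 is missing the cluster role, while cluster-2 has a drifted namespace label and an extra namespace that is not tracked in Git.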
Microsoft Azure Arc

For Microsoft Azure users, Azure Arc provides capabilities similar to Google Anthos. Azure Lighthouse is used to control RBAC across all Kubernetes clusters, along with Azure Policy to enforce policies and evaluate violations in real time. Azure Arc also natively supports some Azure data services such as Azure SQL Managed Instance and Azure PostgreSQL Hyperscale.
Volterra
A few weeks ago, F5 announced their acquisition of Volterra, a multi/hybrid-cloud management startup that emerged in 2019 with investments from Khosla Ventures, Mayfield, M12 (Microsoft), and Samsung Ventures. In “Control Plane for Distributed Kubernetes PaaS,” Volterra CEO, Ankur Singla, described the motivation for founding Volterra:
- No robust Kubernetes distribution or PaaS (e.g. OpenShift, Cloud Foundry) existed that provided a comprehensive set of security and operational services for distributed clusters such as RBAC/user-management, secrets/key management, and multi-cluster service mesh.
- Anthos, Azure Arc, and Rancher focused on packaging and deploying multiple services to distributed clusters and less on the operational and multi-tenancy requirements.
Since 2019, competitors have added features to address these concerns, but at the time, Volterra’s suite of SaaS products provided a unique solution to deploy, connect, secure, and operate apps across multiple cloud and edge platforms:
- VoltConsole: portal to centrally manage apps running on VoltMesh and VoltStack
- VoltStack: distributed Kubernetes platform to deploy, secure, and operate apps running across clouds and the edge
- VoltMesh: high-performance networking layer to connect apps running on different clouds and the edge
- Volterra Global Network: app-to-app network for extreme performance

Other Solutions
- Rafay Managed Kubernetes Platform: a solution from Rafay (notable customer: Verizon)
- VMware Tanzu Mission Control: central management solution from VMware
- IBM Cloud Pak: multicluster management solution from IBM
- Admiralty: open-source or cloud-hosted/enterprise version available
- Gopaddle: application-centric platform with centralized cluster management support
Open Source Projects
- kubefed: official Kubernetes SIG project in alpha
- gardener: multi-cloud/cluster project with strong support from SAP
- KQueen: an older project from Mirantis that seems to be no longer maintained
If I missed any other solutions in the market, please leave a comment, and I will update accordingly.
