Kubernetes — Hybrid, Multicloud, Multicluster Management — Anthos

Anthos — Multi-cluster Management

Gokul Chandra
Published in ITNEXT
Jun 12, 2020

Anthos is a portfolio of products and services for hybrid cloud and workload management that runs on Google Kubernetes Engine (GKE). Beyond Google Cloud Platform (GCP), users can manage workloads running on third-party clouds such as AWS, Azure and Oracle, as well as on on-premises (private) clusters. In practice, users can pick whichever cloud they prefer for application deployment and management. Admins and developers do not need to learn the APIs and environment-specific features of each new cloud, only Google’s.

The consistency of the architecture across public and on-prem clouds simplifies deployment and, more importantly, customers do not need to do any additional work to implement the hybrid cloud. As soon as the Anthos on-prem cluster is deployed and connected to Google Cloud Platform, the hybrid cloud is ready for operation. As of now, GKE on-prem runs as a virtual appliance on top of VMware vSphere. Support for other hypervisors such as Hyper-V and KVM is in the works, but Anthos Connect can connect any Kubernetes cluster irrespective of where it is located and how it was created.

The Anthos multi-cloud architecture consists of a set of core software components that run on Google Cloud Platform (GCP). Anthos supports multiple deployment targets, so users can provision and manage GKE clusters on different platforms such as AWS, Azure, GCP and GKE on-prem. These are technically called managed clusters, where Anthos owns the lifecycle of the GKE clusters launched through the control plane.

Anthos — Components

The Google Cloud Platform console provides a single control plane for managing Kubernetes clusters deployed in multiple locations: on the Google public cloud, in an on-prem data center, or on other cloud providers such as AWS and Azure. Users can use these multi-cluster management capabilities to manage diverse, heterogeneous Kubernetes clusters running in different environments, irrespective of how each cluster was deployed. Together, the components provide all the services required to orchestrate container workloads across the different clouds, along with a common policy framework, centralized configuration management, security, service management, and access to the container marketplace.

Anthos — Meta Control Plane

With the introduction of Anthos, Google Cloud Platform also enabled managing Kubernetes clusters running anywhere (on Google Cloud, in on-prem data centers, or on other cloud providers such as AWS) from a single place. In addition, the multi-cluster capability enables configuration management across cloud and on-prem environments, as well as across workloads running in different environments.

The core components of multi-cluster management are the GKE connection hub (Connect for short), the Google Cloud Platform console, and Anthos Config Management.

Google Cloud Connect

Connect allows users to connect on-prem Kubernetes clusters, as well as Kubernetes clusters running on other public clouds, to Google Cloud Platform. Connect uses an encrypted connection between the Kubernetes clusters and the Google Cloud Platform project. It enables authorized users to log in to clusters, access details about their resources, projects, and clusters, and manage cluster infrastructure and workloads, whether they are running on Google’s hardware or elsewhere.

When a remote cluster is registered, a GKE Connect agent is deployed to the cluster and manages connectivity to various API endpoints on GCP. The cluster does not require a public IP; it only needs to reach a set of googleapis.com endpoints. The Connect agent uses an authenticated and encrypted connection from the Kubernetes cluster to GCP. It can traverse NATs and firewalls, and it uses VPC Service Controls so that GCP acts as an extension of the user’s private cloud. All user interactions with clusters are visible in the Kubernetes audit logs.

Connect Agent and GCP Connectivity

All required APIs can be enabled in the respective project, and users can view detailed metrics on API interactions from the GCP console:

APIs for enabling Anthos — Hub & Connect

Registering an External Kubernetes Cluster to GKE Connect

Cluster registration requires the kubeconfig of the target cluster, the gcloud command-line tool, and the RBAC permissions that allow GKE to access the cluster.

Cluster registration can be done from the command line (gcloud CLI) or from the GKE console, as shown below:

Cluster Registration
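For the command-line path, the registration call can be sketched as follows. This is a minimal sketch rather than the exact command used in the article: it only assembles the argument list for `gcloud container hub memberships register`, the command group available at the time of writing (newer gcloud releases may name it differently). The membership name, kubeconfig context, file paths and project ID are all hypothetical placeholders.

```python
import shlex


def register_command(membership, context, kubeconfig, sa_key_file, project_id):
    """Build the argv list that registers an external cluster with the hub.

    All argument values are supplied by the caller; nothing is executed here.
    """
    return [
        "gcloud", "container", "hub", "memberships", "register", membership,
        f"--context={context}",                       # kubeconfig context of the target cluster
        f"--kubeconfig={kubeconfig}",                 # path to the kubeconfig file
        f"--service-account-key-file={sa_key_file}",  # GCP service account key (JSON)
        f"--project={project_id}",
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = register_command(
        membership="eks-cluster",
        context="eks-context",
        kubeconfig="/path/to/kubeconfig",
        sa_key_file="/path/to/connect-sa-key.json",
        project_id="my-project",
    )
    print(shlex.join(cmd))
```

Printing the command instead of running it keeps the sketch safe to try; passing the list to `subprocess.run` would execute it once the placeholders are replaced.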

Cluster registration creates a Connect agent deployment on the cluster, along with the secrets required to access the GCP project and the connection hub. These secrets are derived from the GCP service account provided with the registration request.

GKE Connect Agent on Member Cluster
GKE Connect Agent Deployment on Member Cluster

All objects associated with the GKE Connect agent:

GKE Connect Agent — Objects

Secrets generated with the registration to enable GCP communication:

GKE Connect Agent — Secrets

Once authentication to GCP is established, the Connect agent opens a tunnel to the required APIs:

GKE Connect Agent — Logs

Membership state on the cluster is stored in a custom resource of kind ‘Membership’, which contains the gkehub connection information:

Membership Custom Resource

All GKE Connect agent metrics can be exported to Stackdriver using prometheus-to-sd and viewed from the monitoring page of the console. A prometheus-to-sd pod is deployed, which collects the metrics from the agent and pushes them to Stackdriver:

Prometheus — Stackdriver

Metrics explorer:

Google Cloud — Metrics Explorer

Once the cluster registration completes, the clusters can be viewed from the ‘Anthos’ section or the GKE section of the console:

Anthos Console — Registered Clusters

The gcloud command line can be used to update, list and describe memberships:

Command-line — Registered Clusters

Users can log in to the registered clusters using a ‘ClusterRole’ and an associated secret. Depending on the management capabilities required, they can use a read-only role or the cluster-admin role.

Accessing registered clusters
Service Account — Token
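One way to prepare such a login is to create a service account and bind it to a ClusterRole. The sketch below is illustrative and not taken from the article: it generates the two manifests as JSON, which kubectl accepts as well as YAML. The service account name `console-reader` is hypothetical, and the built-in `view` ClusterRole is assumed as the read-only role (the article may have used a custom one). Passing `cluster_role="cluster-admin"` gives the full-access variant.

```python
import json


def login_manifests(sa_name, namespace="default", cluster_role="view"):
    """Build a ServiceAccount and a ClusterRoleBinding that grants it cluster_role."""
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": sa_name, "namespace": namespace},
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": f"{sa_name}-{cluster_role}"},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": cluster_role,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": sa_name, "namespace": namespace}
        ],
    }
    # A List wrapper lets both objects be applied in one kubectl call.
    return {"apiVersion": "v1", "kind": "List", "items": [service_account, binding]}


if __name__ == "__main__":
    print(json.dumps(login_manifests("console-reader"), indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`. The token held in the service account's secret is then the credential used for the token-based login.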

Users can use one of three authentication methods to log in to the registered cluster:

Accessing registered clusters

Once the user logs in to the cluster, the cluster information is accessible:

Anthos — Console — Registered Cluster Information

Viewing and Managing Cluster Resources

Information about all registered and logged-in clusters can be accessed from the ‘Kubernetes Engine’ section of the console.

Kubernetes Engine — Viewing Registered Cluster Resources

All workloads can be accessed from the portal, and when there are multiple clusters, filters make it possible to sort workloads and other resources.

Kubernetes Engine — Viewing Registered Cluster Resources

Any resource can be edited from the console, and the Connect agent applies the changes on the member clusters.

Kubernetes Engine — Editing Registered Cluster Resources

All resource objects of registered clusters can be accessed from the object browser:

Kubernetes Engine — Object Browser

Tryout — Registering EKS, AKS and On-prem Clusters

The EKS and AKS clusters are deployed in individual user accounts. They are registered using the registration process discussed above, and a Connect agent is deployed on each cluster upon registration.

EKS cluster on AWS:

EKS Cluster — AWS

AKS cluster on Azure:

AKS Cluster
GKE Connect Agent on AKS Cluster
AKS Cluster

As shown below, all three Kubernetes clusters are registered and labelled ‘type:External’:

Multiple Clusters (EKS, AKS and On-prem) — Registered Clusters

Deploying Applications on External Clusters from Google Marketplace

The GCP container marketplace provides access to a large ecosystem of open source and commercial container application images that can be deployed on clusters running anywhere, provided they are registered to GCP using Connect. With the marketplace, customers can use pre-built containers for common applications such as databases or web servers without having to create them on their own. Applications in the catalogue that are supported on Anthos are marked ‘works with Anthos’, which lets users select existing clusters for deploying them.

A sample Nginx application which is supported on Anthos is chosen below:

Sample Nginx Deployment — Google Cloud Market Place — Supported on Anthos

The cluster targeted for application deployment needs access to the ‘gcr.io’ image registry to pull container images. This can be done by creating a service account in the project and providing its credentials as ‘imagePullSecrets’ in Kubernetes:

Google Container Registry — Access
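The pull secret described above can be sketched as a Kubernetes Secret of type `kubernetes.io/dockerconfigjson`. This is an illustrative sketch, not the author's exact procedure: the secret name, namespace and key content are hypothetical placeholders. It relies on the Container Registry convention of using `_json_key` as the username and the service account's JSON key as the password.

```python
import base64
import json


def gcr_pull_secret(name, namespace, sa_key_json, registry="https://gcr.io"):
    """Build an image pull Secret manifest from a service account JSON key string."""
    # Docker-style basic auth: base64("username:password").
    auth = base64.b64encode(f"_json_key:{sa_key_json}".encode()).decode()
    docker_config = {
        "auths": {
            registry: {
                "username": "_json_key",
                "password": sa_key_json,
                "auth": auth,
            }
        }
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "kubernetes.io/dockerconfigjson",
        "metadata": {"name": name, "namespace": namespace},
        "data": {
            # Secret data values must be base64-encoded.
            ".dockerconfigjson": base64.b64encode(
                json.dumps(docker_config).encode()
            ).decode()
        },
    }


if __name__ == "__main__":
    # Placeholder key content; a real key is the JSON file downloaded for the service account.
    fake_key = json.dumps({"type": "service_account", "project_id": "my-project"})
    print(json.dumps(gcr_pull_secret("gcr-pull", "default", fake_key), indent=2))
```

The secret is then referenced by name under `imagePullSecrets`, either in the pod spec or on the namespace's service account.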

As shown below, the deployment catalog includes a cluster selection field where all registered clusters are listed. Clusters are split into eligible and non-eligible based on the level of permissions used for logging in to the cluster. For example, the on-prem cluster has insufficient permissions because its login uses a read-only role.

Sample Nginx Deployment — Cluster Selection

Selection of deployment namespace:

Sample Nginx Deployment — Namespace Selection

Metrics can be enabled by choosing the specific parameter as shown below. Selecting this parameter injects a prometheus-to-sd sidecar container.

Sample Nginx Deployment — Google Cloud Market Place

Deployment steps can be viewed from the console:

Sample Nginx Deployment — Google Cloud Market Place

The Connect agent on the eks-cluster receives the information about the application deployment:

GKE Connect Agent — Logs

An nginx-deployer container runs the job required to deploy the application on the cluster:

Nginx Deployment Deployer — Google Cloud Market Place

Once the application deployment completes successfully, it can be viewed and managed from the ‘Applications’ dashboard.

Sample Nginx Deployment — Google Cloud Market Place

As shown below, the nginx application is deployed, and the storage section shows PVCs that are backed by EBS on AWS.

Persistent Volumes — EKS EBS

Anthos Config Management — GitOps

This is one of the core components of Anthos. When deploying and managing GKE clusters in multiple locations, it becomes difficult to keep all clusters in sync with respect to their configuration, security policies (RBAC), resource configurations, namespaces, and so forth. As people start using these clusters and making configuration changes, over time they run into “configuration drift”, where different clusters behave differently when the same application is deployed in different places. Anthos Config Management uses a configuration-as-code approach to enable centralized configuration management through declarative templates maintained as code in a repository. This makes it easy to ensure consistent behavior across clusters, and any deviation can be rectified by reverting the changes to the last known good state.

As shown below, the ACM operator is installed on all three clusters:

Anthos Config Management

A sample repo with configuration containing manifests of multiple namespaces and some cluster roles:

Git Repo — Source for Config

The config-management-operator is deployed on the cluster along with a CRD, configmanagement:

Config Management Operator — Logs
ConfigManagement — Custom Resources

A ConfigManagement object is created with the following fields. syncRepo is the repo that holds the configuration. syncBranch is the git branch to sync from. secretType is the authentication type: it can be set to none if the repo is world-readable, otherwise a type such as SSH, key etc. is used. policyDir is the subdirectory that contains all configs.

Config Management — Specification/Configuration
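A ConfigManagement object with the fields described above might look like the one generated by the sketch below. It is a hedged example, not the configuration from the article: the repo URL, branch, directory and cluster name are hypothetical, and the `configmanagement.gke.io/v1` API version, the `config-management` object name and the `spec.git` nesting are assumptions based on the operator's documented format that should be checked against the installed CRD.

```python
import json


def config_management(sync_repo, sync_branch="master", secret_type="none",
                      policy_dir="", cluster_name=None):
    """Build a ConfigManagement manifest pointing the operator at a git repo."""
    git = {
        "syncRepo": sync_repo,      # repo that holds the configuration
        "syncBranch": sync_branch,  # branch to sync from
        "secretType": secret_type,  # "none" for a world-readable repo
    }
    if policy_dir:
        git["policyDir"] = policy_dir  # subdirectory containing the configs
    spec = {"git": git}
    if cluster_name:
        spec["clusterName"] = cluster_name
    return {
        "apiVersion": "configmanagement.gke.io/v1",  # assumed API version
        "kind": "ConfigManagement",
        "metadata": {"name": "config-management"},   # assumed object name
        "spec": spec,
    }


if __name__ == "__main__":
    manifest = config_management(
        sync_repo="https://github.com/example-org/acm-config",  # hypothetical repo
        policy_dir="config-root",                               # hypothetical directory
        cluster_name="eks-cluster",
    )
    print(json.dumps(manifest, indent=2))
```

Generating the manifest per cluster makes it easy to keep the git settings identical across clusters while only the cluster name varies.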

Once the above configuration is created, the operator serves the request as shown below:

Config Management Operator Logs

Three deployments are created to import, sync and monitor the workloads deployed through ACM.

Objects — Anthos Config Management

EKS cluster with git-importer, monitor and syncer — a view from GCP console:

Objects — Anthos Config Management

git-importer imports all the configuration from the specified git repo and directory:

ACM — Git Importer Component

syncer periodically compares the object state in the cluster with the configuration in git and keeps the cluster in sync with the git source:

ACM — Syncer Component
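The comparison the syncer performs can be illustrated with a toy drift check. This is a conceptual sketch only and makes no claim about how the real syncer is implemented: desired and actual state are plain dictionaries keyed by a hypothetical `(kind, namespace, name)` tuple, with an empty namespace for cluster-scoped objects.

```python
def diff_state(desired, actual):
    """Compare desired objects (from git) with actual objects (from the cluster).

    Returns the keys that are missing from the cluster and the keys whose
    content has drifted from what git declares.
    """
    to_create = sorted(key for key in desired if key not in actual)
    to_update = sorted(
        key for key in desired if key in actual and desired[key] != actual[key]
    )
    return {"create": to_create, "update": to_update}


if __name__ == "__main__":
    desired = {
        ("Namespace", "", "dev"): {"labels": {"env": "dev"}},
        ("ClusterRole", "", "viewer"): {"rules": []},
    }
    actual = {
        # Someone changed the label by hand, so this object has drifted.
        ("Namespace", "", "dev"): {"labels": {"env": "prod"}},
    }
    print(diff_state(desired, actual))
```

Running such a check on a timer and re-applying the desired objects for every key it reports is the essence of a periodic sync loop.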

As shown below, all objects are created on all three clusters with configuration from the git repo and are kept in sync to match that state. Manual edits to objects created by ACM do not persist: the changes are reverted at the next periodic sync from the repo, unless specific annotations are deleted from the managed object (discussed below).

Objects Created — Anthos Config Management

Optionally, users can install the ‘nomos’ command-line tool locally to interact with the Config Management Operator and to check the syntax of configs before they are committed to the repo.

Nomos CLI— Config Management Operator

An object in a cluster is managed by Anthos Config Management if it has the annotation ‘configmanagement.gke.io/managed: enabled’. If an object does not have the annotation ‘configmanagement.gke.io/managed’ at all, or if it is set to anything other than enabled, the object is unmanaged.
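The rule above translates into a small predicate. The sketch below assumes the object is available as a parsed manifest (a Python dictionary, for example from `kubectl get -o json`); the function name `is_managed` is a hypothetical helper, not part of any Anthos tooling.

```python
MANAGED_ANNOTATION = "configmanagement.gke.io/managed"


def is_managed(obj):
    """Return True only if the object carries the managed annotation set to 'enabled'."""
    annotations = obj.get("metadata", {}).get("annotations") or {}
    return annotations.get(MANAGED_ANNOTATION) == "enabled"


if __name__ == "__main__":
    managed = {"metadata": {"annotations": {MANAGED_ANNOTATION: "enabled"}}}
    disabled = {"metadata": {"annotations": {MANAGED_ANNOTATION: "disabled"}}}
    bare = {"metadata": {"name": "no-annotations"}}
    for obj in (managed, disabled, bare):
        print(is_managed(obj))
```

Only the first object counts as managed; a missing annotation and any value other than `enabled` are treated the same way.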

When Anthos Config Management manages a cluster object, it watches the object and the set of configs in the repo that affect the object, and ensures that they are in sync. Users can manage an existing object and can also stop managing an object that is currently managed without deleting the object.

The following flow chart describes some situations that cause an object to become managed or unmanaged:

Managed and Unmanaged Objects

Google has evolved Anthos to modernize applications by containerizing legacy applications, and it has enhanced the overall security of on-premises as well as cloud infrastructure. Anthos is fully equipped to manage hybrid clouds. Users can also combine Anthos with other Kubernetes management tools and platforms. For example, if users run open source Kubernetes clusters deployed with home-grown automation based on kubeadm, or host their Kubernetes workloads on EKS or AKS, they can still register those clusters with Anthos. Anthos then works at a higher level, enabling workload distribution, CSM, policy distribution and multicluster management.
