AI and Kubernetes: Open Source Tools powered by AI/OpenAI for Kubernetes.

Seifeddine Rajhi
Published in ITNEXT
16 min read · Dec 4, 2023

Kubernetes & AI team-up

Introduction:

AI is generating a lot of buzz these days, and the Kubernetes-powered DevOps world is no exception. Software engineers are huge automation enthusiasts, so it’s natural that AI-powered tools for Kubernetes operators are starting to pop up.

Most of these tools are designed for terminal (CLI) usage. Kubernetes is the go-to platform for container orchestration, but its complexity can be daunting. AI-powered tools can help you master Kubernetes by automating tasks, improving reliability, and providing insights.

One of the most promising AI platforms for Kubernetes AIOps is OpenAI. OpenAI provides a suite of open-source tools that can be used to build AI-powered solutions for Kubernetes.

In this blog, we’ll explore how OpenAI’s open-source tools can help you achieve superpowers for Kubernetes AIOps.

Overview:

Kubernetes users inevitably encounter cluster issues that require debugging and fixing to keep their pods and services running smoothly. Whether you are new to Kubernetes or managing large and complex environments, debugging in-cluster processes can be challenging and time-consuming.

Cloud-native DevOps involves a lot of complexity in managing containers, microservices and autoscaling capabilities. Generative AI could help troubleshoot and perform some of the operational tasks associated with platforms like Kubernetes. This could involve using natural language prompts to spin up, roll back or get visibility into clusters.

For example, at KubeCon + CloudNativeCon 2023, Kubiya debuted a generative AI workflow engine that’s able to interpret such commands from within Slack. Extending natural language processing might aid platform teams in creating new workflows that abstract the complexity of working with cloud-native platforms.

✨ K8sGPT:

K8sGPT is a relatively new open source project that uses AI models to explain Kubernetes error messages and provide cluster insights. It is already used by two organizations in production and has applied to become a CNCF sandbox project.

K8sGPT Features:

  • Provides detailed and contextualized explanations of Kubernetes error messages
  • Provides cluster insights
  • Supports multiple installation options
  • Supports different AI backends

K8sGPT is the most well-known and widely used project of its kind. Launched by Alex Jones, it is marketed as a way to “give Kubernetes superpowers to everyone.”

K8sGPT is a CLI tool with a primary command, k8sgpt analyze, which is designed to reveal problems in your Kubernetes cluster. It does this by using "analyzers," which define the logic for each Kubernetes object and the possible problems it may be encountering. For example, an analyzer for Kubernetes Services will check whether a particular Service exists and has endpoints, and whether those endpoints are ready.
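
To make the analyzer idea concrete, here is a minimal sketch in Python of the kind of check a Service analyzer performs. The data shapes and function name are hypothetical simplifications for illustration, not K8sGPT's actual code:

```python
# Hypothetical sketch of a Service "analyzer": given simplified views of a
# Service and its Endpoints, report the kinds of problems such analyzers look for.
def analyze_service(service, endpoints):
    failures = []
    if endpoints is None:
        failures.append(f"Service {service['name']} has no matching Endpoints object")
        return failures
    ready = endpoints.get("ready_addresses", [])
    not_ready = endpoints.get("not_ready_addresses", [])
    if not ready and not not_ready:
        failures.append(f"Service {service['name']} has no endpoints (check its selector)")
    elif not ready:
        failures.append(f"Service {service['name']} has endpoints, but none are ready")
    return failures

svc = {"name": "web"}
# One backing pod exists but is not ready yet:
print(analyze_service(svc, {"ready_addresses": [], "not_ready_addresses": ["10.0.0.7"]}))
```

Each real analyzer encodes this sort of object-specific logic, which is what lets the tool surface actionable findings instead of raw events.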

Identifying such an issue is not a big deal in and of itself, but the magic happens when you ask K8sGPT to explain what you can do about it. To do this, you run the command k8sgpt analyze --explain. This asks the AI for instructions for your specific case and displays them for you. The instructions include the actions to take to troubleshoot the problem, including specific kubectl commands that you can execute by simply copying and pasting them. This works because the names of your actual Kubernetes resources are substituted into the suggested commands.
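
The --explain step can be pictured as simple prompt assembly: the analyzer's finding, including the real resource names, is embedded into the text sent to the AI backend. A rough sketch of the idea follows; the prompt wording and function name are illustrative assumptions, not K8sGPT's actual template:

```python
def build_explain_prompt(kind, name, namespace, error):
    # Embed the real resource coordinates so the AI can answer with
    # copy-pasteable kubectl commands that reference them directly.
    return (
        "Simplify the following Kubernetes error message and suggest a fix.\n"
        f"Resource: {kind} {namespace}/{name}\n"
        f"Error: {error}\n"
        "Answer with concrete kubectl commands where possible."
    )

prompt = build_explain_prompt("Deployment", "nginx-deployment", "demo",
                              "back-off restarting failed container")
print(prompt)
```

Because the prompt carries "demo/nginx-deployment" rather than a placeholder, the returned kubectl commands need no editing before being run.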

Installation:

There are several installation options available depending on your preferences and operating system. You can find the different options in the installation section of the documentation.

The prerequisite for the installation method shown below is Homebrew, installed either on macOS or within WSL on a Windows machine.

Next, you can run the following commands:

brew tap k8sgpt-ai/k8sgpt
brew install k8sgpt

To view all the commands provided by K8sGPT, use the --help flag:

k8sgpt --help

An overview of the different commands is also available in the documentation.

Next, we have to authorise an AI backend. In this example, we will use OpenAI.

Prerequisites

The prerequisites for the next sections are an OpenAI account and a running Kubernetes cluster; any cluster, such as microk8s or minikube, will be sufficient.

Once you have an OpenAI account, go to https://platform.openai.com/account/api-keys to generate a new API key.

Alternatively, you can run the following command and K8sGPT will open the same site in your default browser:

k8sgpt generate

This key is needed for K8sGPT to interact with OpenAI. Authorise K8sGPT with the newly created API key/token:

k8sgpt auth add openai
Enter openai Key:
openai added to the AI backend provider list

You can list your backends with the following command:

k8sgpt auth list
Default:
> openai
Active:
> openai
Unused:
> localai
> azureopenai
> noopai

The K8sGPT documentation provides further information on the different AI backends available.

Next, we will install a broken Deployment into our Kubernetes cluster. Its pods will go into CrashLoopBackOff: the security context forces a read-only root filesystem, and nginx needs to write to paths such as /var/cache/nginx at startup. Here is the YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        securityContext:
          readOnlyRootFilesystem: true

Next, we will create a dedicated namespace for our example application and install the Deployment:

kubectl create ns demo
namespace/demo created
kubectl apply -f ./deployment.yaml -n demo
deployment.apps/nginx-deployment created

Now you will see the pods in our demo namespace throwing errors (kubectl get pods -n demo will show them in CrashLoopBackOff).

If we look at the events for one of the pods, we don't learn much:

Warning  BackOff	3s (x8 over 87s)  kubelet        	Back-off restarting failed container

To access more details on why these pods are erroring, we can instead run a K8sGPT command:

k8sgpt analyze

This will show us the problems that K8sGPT found inside our cluster:

AI Provider: openai

To receive further information as well as recommendations on how to fix the issues, we can use the --explain flag:

k8sgpt analyze --explain
0 demo/nginx-deployment-5f4c7db77b-hq74n(Deployment/nginx-deployment)
- Error: back-off 1m20s restarting failed container=nginx pod=nginx-deployment-5f4c7db77b-hq74n_demo(7854b793-21b7-4f81-86e5-dbb4113f64f4)
1 demo/nginx-deployment-5f4c7db77b-phbq8(Deployment/nginx-deployment)
- Error: back-off 1m20s restarting failed container=nginx pod=nginx-deployment-5f4c7db77b-phbq8_demo(74038531-e362-45a6-a436-cf1a6ea46d8a)
2 demo/nginx-deployment-5f4c7db77b-shkw6(Deployment/nginx-deployment)
- Error: back-off 1m20s restarting failed container=nginx pod=nginx-deployment-5f4c7db77b-shkw6_demo(2603f332-3e1c-45da-8080-e34dd6d956ad)

✨ kubectl-ai project:

kubectl-ai is a kubectl plugin that generates and applies Kubernetes manifests using OpenAI GPT. To see it in action, let's walk through a few examples.

Installation:

Homebrew:

Add the tap and install with:

brew tap sozercan/kubectl-ai https://github.com/sozercan/kubectl-ai
brew install kubectl-ai

Krew:

Add the custom krew index and install with:

kubectl krew index add kubectl-ai https://github.com/sozercan/kubectl-ai
kubectl krew install kubectl-ai/kubectl-ai

Usage:

Prerequisites:

kubectl-ai requires a valid Kubernetes configuration and access to an OpenAI-compatible backend. For OpenAI, Azure OpenAI, or LocalAI, set the following environment variables:

export OPENAI_API_KEY=<your OpenAI key>
export OPENAI_DEPLOYMENT_NAME=<your OpenAI deployment/model name. defaults to "gpt-3.5-turbo-0301">
export OPENAI_ENDPOINT=<your OpenAI endpoint, like "https://my-aoi-endpoint.openai.azure.com" or "http://localhost:8080/v1">

If the OPENAI_ENDPOINT variable is set, that endpoint is used. Otherwise, the OpenAI API is used.

Azure OpenAI Service does not allow certain characters, such as ., in the deployment name. Consequently, kubectl-ai will automatically replace gpt-3.5-turbo with gpt-35-turbo for Azure. However, if you use an Azure OpenAI deployment name completely different from the model name, you can set the AZURE_OPENAI_MAP environment variable to map the model name to the Azure OpenAI deployment name. For example:

export AZURE_OPENAI_MAP="gpt-3.5-turbo=my-deployment"
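
The resolution logic described above can be sketched as follows. The function names are hypothetical, but the behaviour (explicit map first, then the dot-stripping fallback) mirrors what the README describes:

```python
def parse_map(env_value):
    # "gpt-3.5-turbo=my-deployment,gpt-4=other" -> {"gpt-3.5-turbo": "my-deployment", ...}
    pairs = [p for p in env_value.split(",") if p]
    return dict(p.split("=", 1) for p in pairs)

def resolve_deployment(model, mapping):
    # An explicit mapping wins; otherwise strip the dots Azure disallows.
    if model in mapping:
        return mapping[model]
    return model.replace(".", "")

mapping = parse_map("gpt-3.5-turbo=my-deployment")
print(resolve_deployment("gpt-3.5-turbo", mapping))  # -> my-deployment
print(resolve_deployment("gpt-3.5-turbo", {}))       # -> gpt-35-turbo
```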

Demo:

Nginx Pod:

command:

kubectl ai "create an nginx pod"
✨ Attempting to apply the following manifest:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
Use the arrow keys to navigate: ↓ ↑ → ←
? Would you like to apply this? [Reprompt/Apply/Don't Apply]:
+ Reprompt
▸ Apply
Don't Apply

Deployment:

Select “Reprompt” and type “make this into deployment”

Reprompt: make this into deployment
✨ Attempting to apply the following manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

Scaling replicas:

Reprompt: Scale to 3 replicas
✨ Attempting to apply the following manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

✨ KoPylot: AI-Powered Kubernetes Assistant 🤖

KoPylot is an open-source AI-powered Kubernetes assistant. Its goal is to help developers and DevOps engineers easily manage and monitor their Kubernetes clusters.

KoPylot features:

In its current version, KoPylot has four main features, each of which maps to a subcommand of the kopylot CLI: audit, chat, ctl, and diagnose. Let's dive into these commands now.

🔍 Audit:

Audit resources, such as pods, deployments, and services. KoPylot will take a single resource and look for vulnerabilities based on its manifest file.

💬 Chat:

Ask KoPylot in plain English to generate kubectl commands. You will be able to review the command before running it 😉.

🩺 Diagnose:

You can use the diagnose tool to help you debug the different components of your application, such as pods, deployments, and services. The diagnose command will list possible fixes for the broken resource.

☸️ Ctl:

A wrapper around kubectl. All the arguments passed to the ctl subcommand are interpreted by kubectl.

How does KoPylot work?

At the moment, KoPylot works by extracting information from the Kubernetes resource description (kubectl describe …) or manifest and feeding it into OpenAI’s Davinci model together with a prompt. The prompt tells the model what to do with the Kubernetes resource.

The prompt is also responsible for guiding how the model should structure the output. For example, the prompt used for the Audit command asks the model to output the results as a two-column JSON containing the vulnerabilities and their Severities.
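
As a rough illustration of that flow, here is how one might assemble an audit prompt and parse the model's two-column JSON answer. The prompt text and field names are assumptions for this sketch, not KoPylot's actual internals:

```python
import json

# Illustrative prompt: ask the model for machine-readable output so the
# CLI can render it as a table afterwards.
AUDIT_PROMPT = (
    "Audit the following Kubernetes manifest for security vulnerabilities. "
    "Respond with a JSON list of objects with keys 'vulnerability' and "
    "'severity'.\n\n{manifest}"
)

def parse_audit_response(raw):
    # The model was asked for JSON, so parse it and normalize into rows.
    findings = json.loads(raw)
    return [(f["vulnerability"], f["severity"]) for f in findings]

raw = '[{"vulnerability": "container runs as root", "severity": "HIGH"}]'
print(parse_audit_response(raw))  # -> [('container runs as root', 'HIGH')]
```

Constraining the output format in the prompt is what makes the model's answer parseable at all; free-form text would be much harder to turn into a table.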

One of the goals on the roadmap is to make it possible to replace the OpenAI models with in-house hosted models. This would address the issue of sending potentially sensitive data to OpenAI's servers.

How good are the outputs?

I tested KoPylot on a dummy cluster with some broken pods, some with vulnerabilities and others without. What I noticed is that the Davinci model can give good directions when diagnosing a broken pod. Sometimes the advice will be too short for one to understand, but by running the diagnose command 2–3 times, it is possible to pinpoint the issue.

For the Chat command, I compared the outputs from Davinci and GPT-4. GPT-4 gave much better results from vague user prompts. So far I have only used GPT-4 via the ChatGPT UI, but I will definitely write a comparison once I get access to the API.

How can I use KoPylot?

You can use KoPylot by following the steps below:

  1. Request an API key from OpenAI.
  2. Export the key using the following command:
    export KOPYLOT_AUTH_TOKEN=<your_openai_key>
  3. Install Kopylot using pip:
    pip install kopylot
  4. Run Kopylot:
    kopylot --help

Overall, KoPylot is a useful tool for diagnosing and troubleshooting Kubernetes workloads. Its chat feature and simple CLI make it easy to use and accessible to users of all levels.

Sadly, the project hasn’t seen any commits since the beginning of April, raising obvious concerns.

✨ Kopilot 🧑‍✈️: AI Kubernetes Expert:

Kopilot is written in Go and covers two functions: troubleshooting and auditing. So what do they do?

Installation

macOS:

brew install knight42/tap/kopilot

Krew:

kubectl krew install kopilot

Usage:

Currently, you need to set two environment variables to run Kopilot:

  • Set KOPILOT_TOKEN to specify your token.
  • Set KOPILOT_LANG to specify the language; it defaults to English. Valid options are Chinese, French, Spanish, etc.
  • KOPILOT_TOKEN_TYPE will be available soon to let you specify AI services other than ChatGPT. Please stay tuned.

Highlight

  • Diagnose any unhealthy workload in your cluster and tell you what might be the cause
  • Audit Kubernetes resources and find security misconfigurations

Imagine you have a Pod that is stuck in Pending or CrashLoopBackOff. This is when the kopilot diagnose command comes in handy. It will ask the AI for help and print its answer, with possible explanations as to why this has happened.

Not sure if your Deployment is good enough? The kopilot audit command, using a similar approach, will check it against the well-known best practices and possible security misconfigurations.
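
To get a feel for the kind of checks such an audit covers, here is a small hand-rolled sketch over a simplified container spec. The specific rules below are common Kubernetes best practices, not Kopilot's actual rule set:

```python
# Hypothetical best-practice checks over a simplified container spec (a dict
# mirroring a few fields of a Pod's container definition).
def audit_container(container):
    issues = []
    image = container.get("image", "")
    if ":" not in image or image.endswith(":latest"):
        issues.append("image is unpinned or uses the :latest tag")
    if "resources" not in container:
        issues.append("no CPU/memory requests or limits set")
    sec = container.get("securityContext", {})
    if sec.get("privileged"):
        issues.append("container runs privileged")
    if not sec.get("runAsNonRoot"):
        issues.append("runAsNonRoot is not enforced")
    return issues

print(audit_container({"image": "nginx"}))
```

An AI-backed audit goes beyond a fixed rule list like this, but the output shape (resource, finding, recommendation) is much the same.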

This tool will use your OpenAI API token and the human language of your choice for the answers. The README also hints that there will be an option to use other AI services in the future.

Sadly, the project hasn’t seen any commits since the beginning of April, raising obvious concerns.

✨ Kubectl-GPT:

Kubectl-GPT is a kubectl plugin that generates kubectl commands from natural language input using a GPT model.

This plugin introduces the kubectl gpt command, whose sole mission is to turn your wishes, i.e. requests stated in human language, into reality in your Kubernetes cluster. Examples of what you can expect from this plugin, as outlined in its documentation, are shown below.

Installation:

Homebrew:

# Install Homebrew: https://brew.sh/
brew tap devinjeon/kubectl-gpt https://github.com/devinjeon/kubectl-gpt
brew install kubectl-gpt

Krew:

# Install Krew: https://krew.sigs.k8s.io/docs/user-guide/setup/install/
kubectl krew index add devinjeon https://github.com/devinjeon/kubectl-gpt
kubectl krew install devinjeon/gpt

Usage:

Run the command line tool with your natural language input to generate a kubectl command.

kubectl gpt "<WHAT-YOU-WANT-TO-DO>"

Commands generated by the GPT model may not always be perfect, so please verify them before executing.
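
One lightweight way to act on that advice is to classify a generated command before running it: read-only verbs are relatively safe, while mutating verbs deserve a closer look. A hedged sketch follows; the verb lists and function are illustrative, not part of Kubectl-GPT:

```python
# Rough triage of a generated kubectl command by its verb.
READ_ONLY_VERBS = {"get", "describe", "logs", "top", "explain", "api-resources"}
MUTATING_VERBS = {"apply", "create", "delete", "patch", "scale", "edit", "replace", "rollout"}

def classify_kubectl(command):
    parts = command.split()
    if len(parts) < 2 or parts[0] != "kubectl":
        return "unknown"
    verb = parts[1]
    if verb in READ_ONLY_VERBS:
        return "read-only"
    if verb in MUTATING_VERBS:
        return "mutating"
    return "unknown"

print(classify_kubectl("kubectl get pods -A"))            # -> read-only
print(classify_kubectl("kubectl delete deployment web"))  # -> mutating
```

A wrapper script could auto-run read-only commands and require explicit confirmation for mutating ones.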

Prerequisites:

Before you start, make sure to set your OpenAI API key as an environment variable named OPENAI_API_KEY. You can get a key for using the OpenAI API at https://platform.openai.com/account/api-keys

You can add the following line to your .zshrc or .bashrc file:

export OPENAI_API_KEY=<your-key>

Examples:

Which input languages work depends on the languages supported by the OpenAI GPT API.

# English
kubectl gpt "Print the creation time and pod name of all pods in all namespaces."
kubectl gpt "Print the memory limit and request of all pods"
kubectl gpt "Increase the replica count of the coredns deployment to 2"
kubectl gpt "Switch context to the kube-system namespace"
# Korean
kubectl gpt "현재 namespace에서 각 pod 별 생성시간 출력"
kubectl gpt "coredns deployment의 replica를 2로 증가"

This project is developed by a solitary enthusiast and hasn’t seen any updates since the end of May.

✨ kube-copilot:

Kubernetes Copilot powered by OpenAI.

Features:

  • Automate Kubernetes cluster operations using ChatGPT (GPT-4 or GPT-3.5).
  • Diagnose and analyze the potential issues for Kubernetes workloads.
  • Generate the Kubernetes manifests based on the provided prompt instructions.
  • Utilize native kubectl and trivy commands for Kubernetes cluster access and security vulnerability scanning.
  • Access the web and perform Google searches without leaving the terminal.

Install:

Run in Kubernetes:

Option 1: Web UI with Helm (recommended)

# Option 1: OpenAI
export OPENAI_API_KEY="<replace-this>"
helm install kube-copilot kube-copilot \
  --repo https://feisky.xyz/kube-copilot \
  --set openai.apiModel=gpt-4 \
  --set openai.apiKey=$OPENAI_API_KEY

# Option 2: Azure OpenAI Service
export OPENAI_API_KEY="<replace-this>"
export OPENAI_API_BASE="<replace-this>"
helm install kube-copilot kube-copilot \
  --repo https://feisky.xyz/kube-copilot \
  --set openai.apiModel=gpt-4 \
  --set openai.apiKey=$OPENAI_API_KEY \
  --set openai.apiBase=$OPENAI_API_BASE

# Forwarding requests to the service
kubectl port-forward service/kube-copilot 8080:80
echo "Visit http://127.0.0.1:8080 to use the copilot"

Option 2: CLI with kubectl

kubectl run -it --rm copilot \
  --env="OPENAI_API_KEY=$OPENAI_API_KEY" \
  --restart=Never \
  --image=ghcr.io/feiskyer/kube-copilot \
  -- execute --verbose 'What Pods are using max memory in the cluster'

Local Install:

Install the copilot with pip command below:

pip install kube-copilot

Setup:

  • Ensure kubectl is installed on the local machine and the kubeconfig file is configured for Kubernetes cluster access.
  • Install trivy to assess container image security issues (for the audit command).
  • Set the OpenAI API key as the OPENAI_API_KEY environment variable to enable ChatGPT functionality.
  • For Azure OpenAI service, also set OPENAI_API_TYPE=azure and OPENAI_API_BASE=https://<replace-this>.openai.azure.com/.
  • Google search is disabled by default. To enable it, set GOOGLE_API_KEY and GOOGLE_CSE_ID (see the project README for where to obtain them).

How to use CLI:

Running directly in the terminal:

Usage: kube-copilot [OPTIONS] COMMAND [ARGS]...

  Kubernetes Copilot powered by OpenAI

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  analyze   analyze issues for a given resource
  audit     audit security issues for a Pod
  diagnose  diagnose problems for a Pod
  execute   execute operations based on prompt instructions
  generate  generate Kubernetes manifests

Audit Security Issues for Pod:

kube-copilot audit POD [NAMESPACE] will audit security issues for a Pod:

Usage: kube-copilot audit [OPTIONS] POD [NAMESPACE]

  audit security issues for a Pod

Options:
  --verbose      Enable verbose information of copilot execution steps
  --model MODEL  OpenAI model to use for copilot execution, default is gpt-4
  --help         Show this message and exit.

Diagnose Problems for Pod:

kube-copilot diagnose POD [NAMESPACE] will diagnose problems for a Pod:

Usage: kube-copilot diagnose [OPTIONS] POD [NAMESPACE]

  diagnose problems for a Pod

Options:
  --verbose      Enable verbose information of copilot execution steps
  --model MODEL  OpenAI model to use for copilot execution, default is gpt-4
  --help         Show this message and exit.

Analyze Potential Issues for k8s Object:

kube-copilot analyze RESOURCE NAME [NAMESPACE] will analyze potential issues for the given resource object:

Usage: kube-copilot analyze [OPTIONS] RESOURCE NAME [NAMESPACE]

  analyze issues for a given resource

Options:
  --verbose     Enable verbose information of copilot execution steps
  --model TEXT  OpenAI model to use for copilot execution, default is gpt-4
  --help        Show this message and exit.

Execute Operations Based on Prompt Instructions:

kube-copilot execute INSTRUCTIONS will execute operations based on prompt instructions. It can also be used to ask any question.

Usage: kube-copilot execute [OPTIONS] INSTRUCTIONS

  execute operations based on prompt instructions

Options:
  --verbose      Enable verbose information of copilot execution steps
  --model MODEL  OpenAI model to use for copilot execution, default is gpt-4
  --help         Show this message and exit.

Generate Kubernetes Manifests:

Use the kube-copilot generate command to create Kubernetes manifests based on the provided prompt instructions. After generating the manifests, you will be prompted to confirm whether you want to apply them.

Usage: kube-copilot generate [OPTIONS] INSTRUCTIONS

  generate Kubernetes manifests

Options:
  --verbose     Enable verbose information of copilot execution steps
  --model TEXT  OpenAI model to use for copilot execution, default is gpt-4
  --help        Show this message and exit.

✨ Kubernetes ChatGPT bot:

A ChatGPT bot for Kubernetes issues. Ask the AI how to solve your Prometheus alerts and get pithy responses.

No more solving alerts alone in the darkness — the internet has your back.

Please consider upvoting on Product Hunt or sending to your favorite newsletter. One day, Skynet will remember your kindness and spare you!

How it works:

Prometheus forwards alerts to the bot using a webhook receiver.

The bot sends a query to OpenAI, asking it how to fix your alerts.

You stockpile food in your pantry for the robot uprising.

The bot is implemented using Robusta.dev, an open source platform for responding to Kubernetes alerts. Robusta also offers a SaaS platform for multi-cluster Kubernetes observability.
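
Conceptually, the enrichment step boils down to turning the alert's labels and annotations into a question for the model. A simplified sketch follows, with a made-up payload shape loosely following Alertmanager's webhook format; this is not the bot's actual code:

```python
def alert_to_prompt(alert):
    # Lift the interesting fields out of the alert and phrase them as a question.
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    return (
        "A Prometheus alert fired in my Kubernetes cluster.\n"
        f"Alert: {labels.get('alertname', 'unknown')}\n"
        f"Namespace: {labels.get('namespace', 'unknown')}\n"
        f"Summary: {annotations.get('summary', 'n/a')}\n"
        "How do I investigate and fix this?"
    )

alert = {
    "labels": {"alertname": "KubePodCrashLooping", "namespace": "default"},
    "annotations": {"summary": "Pod example-pod is crash looping"},
}
print(alert_to_prompt(alert))
```

The bot then posts the model's answer back to the Slack thread where the alert landed, keeping the context in one place.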

Prerequisites:

  • A Slack workspace

Setup:

  1. Install Robusta with Helm.
  2. Load the ChatGPT playbook by adding the following to generated_values.yaml:

playbookRepos:
  chatgpt_robusta_actions:
    url: "https://github.com/robusta-dev/kubernetes-chatgpt-bot.git"

customPlaybooks:
  # Add the 'Ask ChatGPT' button to all Prometheus alerts
  - triggers:
      - on_prometheus_alert: {}
    actions:
      - chat_gpt_enricher: {}

  3. Add your OpenAI API key to generated_values.yaml. Make sure you edit the existing globalConfig section, don't add a duplicate section.

globalConfig:
  chat_gpt_token: YOUR KEY GOES HERE

  4. Do a Helm upgrade to apply the new values: helm upgrade robusta robusta/robusta --values=generated_values.yaml --set clusterName=<YOUR_CLUSTER_NAME>
  5. Send your Prometheus alerts to Robusta. Alternatively, just use Robusta’s bundled Prometheus stack.

Demo:

Instead of waiting around for a Prometheus alert, let's cause one.

  1. Deploy a broken pod that will be stuck in a Pending state:

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/pending_pods/pending_pod_node_selector.yaml

  2. Trigger a Prometheus alert immediately, skipping the normal delays:

robusta playbooks trigger prometheus_alert alert_name=KubePodCrashLooping namespace=default pod_name=example-pod

An alert will arrive in Slack with a button. Click the button to ask ChatGPT about the alert.

Resources:

  • K8sGPT website: https://k8sgpt.ai/
  • K8sGPT documentation: https://docs.k8sgpt.ai/
  • K8sGPT Slack
  • K8sGPT GitHub organisation: https://github.com/k8sgpt-ai
  • K8sGPT full tutorial: https://anaisurl.com/k8sgpt-full-tutorial/
  • kubectl-ai: https://github.com/sozercan/kubectl-ai
  • kubectl-ai write-up: https://collabnix.com/the-rise-of-kubernetes-and-ai-kubectl-openai-plugin/
  • KoPylot introduction: https://medium.com/@thiagoalves/introducing-kopylot-a-kubernetes-ai-assistant-264cff0e7846
  • Kopilot: https://github.com/knight42/kopilot

Until next time 🎉 🇵🇸

I hope this post gave you a better understanding of the AI-powered tools available for Kubernetes.

Thank you for Reading !! 🙌🏻😁📃, see you in the next blog.🤘🇵🇸

🚀 Thank you for sticking up till the end. If you have any questions/feedback regarding this blog feel free to connect with me :

♻️ 🇵🇸LinkedIn: https://www.linkedin.com/in/rajhi-saif/

♻️🇵🇸 Twitter : https://twitter.com/rajhisaifeddine

The end ✌🏻

🔰 Keep Learning !! Keep Sharing !! 🔰
