Argo: A Workflow Manager for Kubernetes

Gokul Chandra
May 23, 2019

Argo from Applatix is an open-source project that provides container-native workflows for Kubernetes, implementing each step in a workflow as a container. Argo lets users launch multi-step pipelines using a YAML-based custom DSL that feels familiar to anyone who writes Kubernetes manifests. The framework provides sophisticated looping, conditionals, and dependency management with DAGs, which gives considerable flexibility both in deploying application stacks and in configuring their dependencies. With Argo, users can define dependencies, programmatically construct complex workflows, link the output of any step to the input of subsequent steps through artifact management, and monitor scheduled jobs in an easy-to-read UI.

Argo V2 is implemented as a Kubernetes CRD (Custom Resource Definition). As a result, Argo workflows can be managed using kubectl and integrate natively with other Kubernetes services such as volumes, secrets, and RBAC. The new Argo software is lightweight and installs in under a minute, yet provides complete workflow features including parameter substitution, artifacts, fixtures, loops and recursive workflows.
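Because a workflow is just a custom resource, a complete "hello world" can be submitted with kubectl like any other manifest. A minimal sketch along the lines of the upstream hello-world example (names and image tags are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow                    # the CRD kind registered by Argo
metadata:
  generateName: hello-world-      # Argo appends a random suffix per run
spec:
  entrypoint: whalesay            # the template invoked first
  templates:
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["hello world"]
```

This can be launched with `kubectl create -f hello-world.yaml` or, equivalently, `argo submit hello-world.yaml` using the Argo CLI.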

Workflow automation in Argo is driven by YAML templates (easy to adopt, since Kubernetes uses the same format) designed using ADSL (Argo Domain Specific Language). Every instruction provided in ADSL is treated as a piece of code and hosted alongside the application code in SCM (Source Code Management). Argo provides/supports six different YAML constructs:
  • Container Templates: create a single container with parameters as required.
  • Workflow Templates: define a job, in other words a short-running app that runs to completion; a step in a workflow can be a container.
  • Policy Templates: rules for triggering/invoking a job or a notification.
  • Deployment Templates: create long-running applications.
  • Fixture Templates: glue in third-party resources outside Argo.
  • Project Templates: workflow definitions that can be accessed in the Argo catalog.

Argo supports several different ways in which Kubernetes manifests can be defined: Ksonnet applications, Helm charts, and simple directories of YAML/JSON manifests.

Basic Argo Workflow Template

The template below is a simple workflow that creates a pod with two containers: one container has curl (an Alpine-based image with just curl) and the other is an Nginx sidecar container (a sidecar runs alongside the service as a second process and provides "platform infrastructure features"). Here curl is the "main" container, which polls the nginx sidecar until it is ready to service requests.
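The pattern can be sketched as follows, along the lines of the upstream sidecar-nginx example (image tags and names are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: sidecar-nginx-
spec:
  entrypoint: sidecar-nginx-example
  templates:
  - name: sidecar-nginx-example
    container:
      image: appropriate/curl     # main container: Alpine with just curl
      command: [sh, -c]
      # Poll the sidecar over localhost until nginx starts serving
      args: ["until curl -G 'http://127.0.0.1/' > /tmp/out; do echo sleep; sleep 1; done; cat /tmp/out"]
    sidecars:
    - name: nginx                 # runs in the same pod, shares the network namespace
      image: nginx:1.13
```

Because both containers share the pod's network namespace, the main container reaches the sidecar on 127.0.0.1; when the main container exits, Argo tears down the sidecar and marks the pod Completed.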

Argo Configuration/Workflow Template
  • Events in the Workflow above:
Detailed Events in the Workflow above
Nginx Sidecar performing CURL operations

As shown above, the workflow creates a pod, executes the configuration defined in the spec, then terminates the containers, marking the pod state as Completed.

Sample Workflow Lifecycle
  • Argo Dashboard View
Argo Dashboard Workflow Execution View

Workflow Template with Conditionals

Argo supports conditional patterns in workflow execution. The coin-flip example provided by Argo shows how "when" can be used in a template to make execution depend on the output received from a parent step.

Coin Flip Example
Coin Flip Workflow Configuration

The workflow above runs a random-integer script with constants 0=heads and 1=tails, and the invocation of a particular template (heads or tails) depends on the output of the "flip-coin" task. As seen in the workflow diagram above, the flip-coin task yielded heads, so only the heads template is executed.
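The conditional wiring can be sketched as follows, modeled on the upstream coinflip example (trimmed for brevity; image tags are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: coinflip-
spec:
  entrypoint: coinflip
  templates:
  - name: coinflip
    steps:
    - - name: flip-coin
        template: flip-coin
    - - name: heads                  # runs only when the script printed "heads"
        template: heads
        when: "{{steps.flip-coin.outputs.result}} == heads"
      - name: tails
        template: tails
        when: "{{steps.flip-coin.outputs.result}} == tails"
  - name: flip-coin
    script:                          # a script template captures stdout as outputs.result
      image: python:alpine3.6
      command: [python]
      source: |
        import random
        print("heads" if random.randint(0, 1) == 0 else "tails")
  - name: heads
    container:
      image: alpine:3.6
      command: [sh, -c]
      args: ["echo it was heads"]
  - name: tails
    container:
      image: alpine:3.6
      command: [sh, -c]
      args: ["echo it was tails"]
```

The `when` clause evaluates the parent step's captured stdout, so exactly one of the two child templates runs per flip.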

Argo CLI Querying Workflow

The above workflow creates two containers: one to execute the randomint Python script and the other to execute the heads template based on the result of the script. As shown below, the heads template is executed because the script returned heads (result == heads).

Coin Flip Workflow Execution

Similarly, recursion (templates invoking themselves) can be added to the conditional spec. For example, the template output below shows the flip-coin template being executed repeatedly until the result is heads.
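Recursion only requires pointing the "tails" branch back at the parent template. A sketch of the relevant fragment (the `flip-coin` and `heads` templates are assumed to be defined as in the earlier example):

```yaml
  # The coinflip template invokes itself until the flip yields heads
  - name: coinflip
    steps:
    - - name: flip-coin
        template: flip-coin
    - - name: heads
        template: heads
        when: "{{steps.flip-coin.outputs.result}} == heads"
      - name: tails
        template: coinflip           # recurse on tails instead of ending
        when: "{{steps.flip-coin.outputs.result}} == tails"
```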

Coin Flip Recursion

Workflow Template with Loops and Parameters

Argo facilitates iterating over a set of inputs in a workflow template, and users can provide parameters (for example, input parameters). In the workflow below, a whalesay template is executed with two input parameters: "hello kubernetes" and "hello argo".

Looping and Parameter based Workflow Configuration
Argo CLI Querying Loops
Workflow — Loop Configuration

The spec above comprises a single template that consumes a list of items and runs one task per parameter value provided. Here this creates two pods: one executing the workflow with the parameter "hello kubernetes" and the other with "hello argo", as shown below:
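The loop-and-parameter pattern can be sketched as follows, along the lines of the upstream loops example (names are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: loops-
spec:
  entrypoint: loop-example
  templates:
  - name: loop-example
    steps:
    - - name: print-message
        template: whalesay
        arguments:
          parameters:
          - name: message
            value: "{{item}}"        # substituted once per list item
        withItems:                   # expands this step into one task per item
        - hello kubernetes
        - hello argo
  - name: whalesay
    inputs:
      parameters:
      - name: message                # declared input, filled by the caller
    container:
      image: docker/whalesay:latest
      command: [cowsay]
      args: ["{{inputs.parameters.message}}"]
```

`withItems` fans the single step out into parallel tasks, one pod per item, each receiving its own `message` parameter.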

Workflow Execution with Looping Configuration

Similarly, more complex looping is supported: iterating over sets of items and generating lists of items dynamically.

Multi-Step Workflow Template with Dependencies Defined by DAG (Directed Acyclic Graph) and Steps

Argo allows users to launch multi-step pipelines by declaring dependencies between tasks as a DAG (Directed Acyclic Graph), and also allows more than one template to be defined in a workflow spec (nested workflows).
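A DAG template declares each task and the tasks it depends on; Argo runs anything whose dependencies are satisfied in parallel. A sketch of the classic diamond dependency shape (names and image tags are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-diamond-
spec:
  entrypoint: diamond
  templates:
  - name: diamond
    dag:
      tasks:
      - name: A
        template: echo
      - name: B
        dependencies: [A]          # B and C both wait on A, then run in parallel
        template: echo
      - name: C
        dependencies: [A]
        template: echo
      - name: D
        dependencies: [B, C]       # D waits for both branches to finish
        template: echo
  - name: echo
    container:
      image: alpine:3.7
      command: [echo, hello]
```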

DAG Workflow Execution Pattern
DAG Dependency based Workflow Configuration
DAG Execution

Similarly, Steps can be used to define a multi-step workflow; the example below comprises two templates, hello-hello-hello and whalesay (a nested workflow).

  • Multi-Step Workflow using Steps
Sequencing based on configuration
Template Configuration

Here the flow is controlled by the steps, and the execution order is defined by the nesting of the dashes: each top-level list item runs sequentially, while entries grouped under the same item run in parallel.
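The dash convention can be sketched as a fragment of a steps template (names follow the hello-hello-hello example; the referenced whalesay template is assumed to be defined elsewhere in the spec):

```yaml
  - name: hello-hello-hello
    steps:
    - - name: hello1             # stage 1: runs first, alone
        template: whalesay
    - - name: hello2a            # stage 2: hello2a and hello2b share a
        template: whalesay       # top-level item, so they run in parallel
      - name: hello2b
        template: whalesay
```

Each `- -` starts a new sequential stage; adding further `-` entries within that stage fans it out in parallel.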

Artifact Management with Minio and Integration with Argo

Artifacts are integral components of any workflow (for example, CI/CD): steps in a workflow generate artifacts, which are then consumed by subsequent steps.

Here the artifact repository used is Minio, an open-source object storage server with an Amazon S3-compatible API.

Artifact Management
Artifact Management Configuration

The above workflow consists of two steps: 1. generate-artifact, which generates an artifact using the whalesay template, and 2. consume-artifact, which consumes the artifact created in step 1 and prints the message. As defined in the configuration, the two tasks run in sequence. The artifact generated in the first step is stored in Minio; an artifact repository can be integrated with Argo by providing the configuration below in the workflow ConfigMap.
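The artifact hand-off can be sketched as follows, along the lines of the upstream artifact-passing example (paths and names are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifact-passing-
spec:
  entrypoint: artifact-example
  templates:
  - name: artifact-example
    steps:
    - - name: generate-artifact
        template: whalesay
    - - name: consume-artifact
        template: print-message
        arguments:
          artifacts:
          - name: message          # wire step 1's output to step 2's input
            from: "{{steps.generate-artifact.outputs.artifacts.hello-art}}"
  - name: whalesay
    container:
      image: docker/whalesay:latest
      command: [sh, -c]
      args: ["cowsay hello world | tee /tmp/hello_world.txt"]
    outputs:
      artifacts:
      - name: hello-art            # uploaded to the artifact repository
        path: /tmp/hello_world.txt
  - name: print-message
    inputs:
      artifacts:
      - name: message              # downloaded into the container at this path
        path: /tmp/message
    container:
      image: alpine:latest
      command: [sh, -c]
      args: ["cat /tmp/message"]
```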

Workflow Configuration for Artifact Management
Minio

The artifact created above is stored in Minio in my-bucket. The consume-artifact task pulls the artifact based on the configuration provided. Minio resembles S3 and provides a shareable link to the artifact, as shown below.
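Wiring Minio in as the default artifact repository is done through the workflow controller's ConfigMap; a hedged sketch (bucket, endpoint and secret names here are examples, not the article's exact values):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
  namespace: argo
data:
  config: |
    artifactRepository:
      s3:                           # Minio speaks the S3 API
        bucket: my-bucket
        endpoint: minio-service.default:9000
        insecure: true              # plain HTTP for an in-cluster Minio
        accessKeySecret:            # references a Kubernetes Secret holding the keys
          name: argo-artifacts
          key: accesskey
        secretKeySecret:
          name: argo-artifacts
          key: secretkey
```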

Minio-Bucket View
Minio Artifact List

Deploying a Full Application Stack (App, DB, Monitoring, Logging and Tracing) with Long-Running Applications (Deployments and Services) Using Argo

As seen above, Argo supports various constructs that can be used to build streamlined workflows for deploying complex application stacks with a well-defined rule-set. In the example below, a full-fledged web application (the sample sock-shop app) is deployed with logging (Elasticsearch, Kibana and Fluentd), monitoring (Prometheus and Grafana) and tracing (Zipkin).

Argo can consume various Kubernetes configurations (Deployments, Services, ClusterRoles, etc.) provided as traditional YAML files. In this example, a workflows directory holds all the Argo workflow configurations, which consume the Kubernetes object configurations from the kubernetes-spec directory.
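A workflow step can create such Kubernetes objects through a resource template; a minimal hedged sketch (the inlined manifest here is a stand-in for the real spec files in kubernetes-spec):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: apply-manifest-
spec:
  entrypoint: apply-resource
  templates:
  - name: apply-resource
    resource:
      action: apply               # kubectl-style action: create/apply/delete
      manifest: |                 # any Kubernetes object can be inlined here
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: logging-demo
        data:
          note: created-by-argo-workflow
```

Chaining several such resource steps (or DAG tasks) is what lets one workflow stand up an entire stack of Deployments, Services and ConfigMaps in order.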

  • Kubernetes Spec Directory Contents:

As seen below, the application stack configurations are segregated into logging, monitoring, db (sock-shop-db), app (sock-shop) and zipkin (tracing) as separate sub-directories, which contain the service-specific deployments, services, configmaps, cluster roles, etc. (basically all the Kubernetes configurations required to deploy the app). For example, zipkin contains the MySQL deployment, MySQL service, Zipkin deployment and Zipkin service.

kubernetes-spec
├── logging
│ ├── elasticsearch.yml
│ ├── fluentd-crb.yml
│ ├── fluentd-daemon.yml
│ ├── fluentd-sa.yaml
│ └── kibana.yml
├── logging-cr
│ └── fluentd-cr.yml
├── monitoring
│ ├── grafana-dep.yaml
│ ├── grafana-svc.yaml
│ ├── prometheus-alertrules.yaml
│ ├── prometheus-configmap.yaml
│ ├── prometheus-crb.yml
│ ├── prometheus-dep.yaml
│ ├── prometheus-exporter-disk-usage-ds.yaml
│ ├── prometheus-exporter-kube-state-dep.yaml
│ ├── prometheus-exporter-kube-state-svc.yaml
│ ├── prometheus-sa.yml
│ └── prometheus-svc.yaml
├── monitoring-cfg
│ ├── grafana-configmap.yaml
│ └── grafana-import-dash-batch.yaml
├── monitoring-cr
│ └── prometheus-cr.yml
├── sock-shop-db
│ ├── carts-db-dep.yaml
│ ├── carts-db-svc.yaml
│ ├── catalogue-db-dep.yaml
│ ├── catalogue-db-svc.yaml
│ ├── orders-db-dep.yaml
│ ├── orders-db-svc.yaml
│ ├── session-db-dep.yaml
│ ├── session-db-svc.yaml
│ ├── user-db-dep.yaml
│ └── user-db-svc.yaml
├── sock-shop-persist-db
│ ├── carts-db-ss.yaml
│ ├── carts-db-svc.yaml
│ ├── catalogue-db-dep.yaml
│ ├── catalogue-db-svc.yaml
│ ├── orders-db-ss.yaml
│ ├── orders-db-svc.yaml
│ ├── session-db-dep.yaml
│ ├── session-db-svc.yaml
│ ├── user-db-ss.yaml
│ └── user-db-svc.yaml
├── sock-shop-usvc
│ ├── carts-dep.yaml
│ ├── carts-svc.yml
│ ├── front-end-dep.yaml
│ ├── front-end-svc.yaml
│ ├── orders-dep.yaml
│ ├── orders-svc.yaml
│ ├── queue-master-dep.yaml
│ ├── queue-master-svc.yaml
│ ├── rabbitmq-dep.yaml
│ ├── rabbitmq-svc.yaml
│ ├── shipping-dep.yaml
│ └── shipping-svc.yaml
├── sock-shop-zipkin-usvc
│ ├── catalogue-dep.yaml
│ ├── catalogue-svc.yaml
│ ├── payment-dep.yaml
│ ├── payment-svc.yaml
│ ├── user-dep.yaml
│ └── user-svc.yaml
└── zipkin
├── zipkin-cron-dep.yml
├── zipkin-dep.yaml
├── zipkin-mysql-dep.yaml
├── zipkin-mysql-svc.yaml
└── zipkin-svc.yaml
10 directories, 63 files
  • Workflows Directory Contents:

The workflows directory comprises all the Argo workflow configurations, which consume the Kubernetes spec files above.

workflows/
├── logging-workflow.yaml
├── monitoring-workflow.yaml
├── sock-shop-persist-workflow.yaml
└── sock-shop-workflow.yaml
0 directories, 4 files

Looking at the logging-workflow as an example:

Logging Workflow Configuration

Similarly, all the specific workflows use specific Kubernetes resources to deploy the application seamlessly as shown in the workflows below.

  • Logging
Logging WorkFlow
Argo CLI querying Logging WorkFlow

As shown above, the Logging workflow creates the Elasticsearch deployment and service, the Kibana deployment and service, and the Fluentd service account and DaemonSet.

  • Monitoring
Monitoring WorkFlow
Argo CLI querying Monitoring WorkFlow

As shown above, the Monitoring workflow creates the resources required for a running Prometheus and Grafana stack.

Kubernetes Resources deployed by Argo WorkFlow engine
  • Sock-Shop
Sock-Shop WorkFlow Execution

As seen above, this is a multi-application/template workflow in which Zipkin is deployed along with sock-shop, which has dependency requirements.

Sock-Shop Argo WorkFlow Execution
Sock-Shop deployed on Kubernetes
Application Tracing Components

With this, a full-fledged application is online along with logging, monitoring and tracing. All the required services are created based on the configuration provided.

Kubernetes Services for the Components created above

* Front-End (Sock-Shop Web)

Sock-Shop Frontend Web Application
Sock-Shop Frontend Web Application — Catalogue

* Logging (Kibana)

Kibana for Logging Visualization

* Monitoring (Prometheus and Grafana)

Prometheus for Monitoring
Visualizing Prometheus derived metrics on Grafana
Visualizing Prometheus derived metrics on Grafana

* Tracing (Zipkin)

Application tracing with Zipkin

By combining a rich workflow-engine feature-set with native artifact management, admission control, "fixtures", built-in support for DinD (Docker-in-Docker), and policies, Argo can be of tremendous value in scenarios such as traditional CI/CD pipelines, complex jobs with both sequential and parallel steps and dependencies, orchestrating deployments of distributed applications, and triggering time- or event-based workflows driven by policies.