Concurrency and Disposability — of 12-factor Microservices

Santosh Pai · Published in ITNEXT · Oct 20, 2023

This is an in-depth series on the 12-factor methodology for Kubernetes-based microservices. So far, we’ve meticulously covered seven pivotal factors, from Codebase to Port Binding, each contributing to well-oiled microservices machinery. If you recall from our discussion on port binding, efficient inter-service communication sets the stage for effective scaling, a subject that naturally brings us to our next two factors: Concurrency and Disposability. These two are key to ensuring your applications not only scale effectively but also maintain resilience and agility.

The Eighth Factor — Concurrency

Scale out via the process model.

Concurrency is a first-class citizen in the realm of distributed systems. In the context of 12-factor apps, it encapsulates the practice of breaking an application down into isolated, stateless, share-nothing units that can respond to incoming requests independently and scale on their own.

Kubernetes and Concurrency

Scaling Pods

In Kubernetes, concurrency is often achieved through ReplicaSets or Deployments. The share-nothing, horizontally partitionable nature of twelve-factor app processes means that adding more concurrency is a simple and reliable operation. With a single command or a minor change in your Kustomize-managed YAML configuration, you can scale your service replicas up or down.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: data-collector-depl
spec:
  replicas: 3

By setting the replicas field, you can control the level of concurrency directly. Each replica is an instance of your application running independently, thus embodying the 12-factor principle of concurrency.
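On a live cluster, the same scaling operation is a single imperative command:

kubectl scale deployment data-collector-depl --replicas=5

For anything beyond a quick experiment, though, prefer committing the change to your YAML so the desired state stays in version control.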

While replicas are the most straightforward and commonly discussed method for achieving concurrency in Kubernetes, they are far from the only way. Kubernetes offers several other constructs and patterns to achieve concurrency, some of which include:

  • Horizontal Pod Autoscaler (HPA): While setting a static number of replicas allows for a fixed level of concurrency, the HPA allows dynamic scaling based on observed metrics like CPU utilization or custom metrics. This provides a more reactive approach to concurrency (see the sketch after this list).
  • Jobs and CronJobs: For batch processing tasks, Kubernetes Jobs create one or more pods and ensure that a specified number of them terminate successfully. CronJobs take this a step further by running Jobs on a time-based schedule, as in the example below. Each Job instance is a concurrent operation.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: data-collector-job
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      completions: 5
      parallelism: 2
      backoffLimit: 1
      template:
        spec:
          restartPolicy: Never
          ...
  • Custom Resource Definitions (CRDs) and Operators: For more complex stateful applications, you can define your own custom resources and write an Operator to manage them, including handling concurrency logic tailored to your application’s needs.
  • Service Mesh: Using a service mesh like Istio or Linkerd, you can control the concurrency level at the service-to-service communication layer. For example, you can set up rate limiting or define circuit breakers to control the number of concurrent requests to a service.
  • Partitioning: For stateful applications like databases that support sharding or partitioning, you can achieve data-level concurrency by distributing data across multiple instances. When handling concurrency for databases, versioning of the modelled data is vital, and perhaps deserves a dedicated blog post. Will do if enough votes!
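As a concrete illustration of the first option, here is a minimal HPA sketch that targets the Deployment shown earlier and scales it between 3 and 10 replicas based on average CPU utilization (the HPA name and the 70% target are illustrative):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: data-collector-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: data-collector-depl
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70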

Each of these approaches has its own trade-offs and is suited to particular types of workloads or requirements. Often, a combination of these methods is used to achieve the desired level of concurrency and performance.

The Ninth Factor — Disposability

Maximize robustness — with fast startup and graceful shutdown.

Disposability focuses on the ability of an application to start fast and shut down gracefully, enabling robust and resilient systems. This facilitates fast elastic scaling, rapid deployment of code or config changes, and robustness of production deploys.

Graceful Lifecycle Management in Kubernetes

Kubernetes excels at managing the lifecycle of containerized applications. The platform uses liveness and readiness probes to ascertain the health and readiness of each service instance, thereby ensuring that traffic is appropriately routed and that failing instances are replaced.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-statefulset
  namespace: development
spec:
  serviceName: "postgres"
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:12
        envFrom:
        - configMapRef:
            name: postgres-configuration
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-credentials
              key: password
        ports:
        - containerPort: 5432
          name: postgresdb
        volumeMounts:
        - name: postgres-volume-mount
          mountPath: /var/lib/postgresql/data
        readinessProbe:
          exec:
            command:
            - bash
            - "-c"
            - "psql -U$POSTGRES_USER -d$POSTGRES_DB -c 'SELECT 1'"
          initialDelaySeconds: 15
          timeoutSeconds: 2
        livenessProbe:
          exec:
            command:
            - bash
            - "-c"
            - "psql -U$POSTGRES_USER -d$POSTGRES_DB -c 'SELECT 1'"
          initialDelaySeconds: 15
          timeoutSeconds: 2
      volumes:
      - name: postgres-volume-mount
        persistentVolumeClaim:
          claimName: postgres-pvc

Graceful Termination: The Art of a Dignified Exit

In microservices (or any application), termination isn’t merely an endpoint — it’s a process that must be managed with precision and tact. While Kubernetes orchestrates the lifecycle of pods with laudable consistency, the onus is on the application itself to handle any lingering operations, such as active connections or in-flight transactions.

Let’s consider a Node.js application using a hypothetical wrapper to handle stream connections. The application must recognize and respond to termination signals (SIGINT for a manual interrupt and SIGTERM for a system-level termination) to close any active streams gracefully.

Here is how you can handle these signals to ensure a graceful termination:

// Close active stream connections on a manual interrupt (SIGINT)
// or a Kubernetes-initiated shutdown (SIGTERM)
process.on('SIGINT', () => wrapper.connection.close())
process.on('SIGTERM', () => wrapper.connection.close())

By explicitly listening for these signals and invoking the connection.close() method, the application terminates all stream connections cleanly, preserving data integrity and allowing for a smoother transition during pod rescheduling or scaling operations.
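For a typical HTTP service, the same idea extends to draining in-flight requests before exiting. Here is a minimal Node.js sketch of the pattern (the server, port, and timeout are illustrative, not part of the original wrapper):

const http = require('http')

const server = http.createServer((req, res) => res.end('ok'))
server.listen(8080)

// On a termination signal, stop accepting new connections,
// let in-flight requests finish, then exit so Kubernetes can
// replace the pod.
function shutdown() {
  server.close(() => process.exit(0))
  // Force-exit if draining outlasts the pod's termination grace period.
  setTimeout(() => process.exit(1), 10000).unref()
}

process.on('SIGTERM', shutdown)
process.on('SIGINT', shutdown)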

Booting the application back up should be similarly progressive: start fast, initialize incrementally, and signal readiness only once the service can actually handle traffic.

The Role of Persistent Storage: A Crucial Element in the Disposability Equation

When we talk about disposability, we often envision stateless applications that are ephemeral by nature, easily replaced by a new instance with no loss of data. However, what happens when your application needs to maintain state? This is where the concept of persistent storage comes into play, offering a nuanced layer to the notion of disposability.

Kubernetes StatefulSets: Bridging the Gap

In Kubernetes, StatefulSets provide a solution for running stateful applications and databases, ensuring that the deployed pods are unique and can maintain their state. Unlike a Deployment, where pods are interchangeable, each pod in a StatefulSet has a stable, unique identity.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres-statefulset
  namespace: development
spec:
  serviceName: "postgres"
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:12
        ...

Reconciling Statefulness with Disposability

StatefulSets allow you to enjoy the benefits of disposability without sacrificing the need for persistent storage. When a pod in a StatefulSet is terminated, its Persistent Volume Claim (PVC) and the associated data remain intact. A new pod with the same identity can then be instantiated, mounting the existing PVC.

This allows for a level of “graceful disposability,” ensuring that while the compute resources (i.e., the pod) can be disposed of, the state is maintained, thereby reconciling the inherent tension between statefulness and disposability.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt1/postgres-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: development
  name: postgres-pvc
  labels:
    type: local
spec:
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
  volumeName: postgres-pv
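With the claim bound, you can watch graceful disposability in action on a live cluster: deleting the pod leaves the claim and its data intact, and the StatefulSet controller recreates a pod with the same identity that remounts it. A quick illustration, assuming the manifests above:

kubectl delete pod postgres-statefulset-0 -n development
# The controller recreates postgres-statefulset-0, which
# reattaches postgres-pvc and finds its data intact.
kubectl get pvc postgres-pvc -n development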

Achieving Concurrency and Disposability: A Practical Guide

Let’s explore how to implement these factors in a real-world application.

  1. Defining Replicas: Use the replicas field in your Deployment YAML to set the level of concurrency you need.
  2. Setting up Probes: Expose health endpoints in your application and reference them as liveness and readiness probes in your Kubernetes configurations.
  3. Graceful Termination: Handle SIGTERM signals in your application to shut down gracefully when Kubernetes decides to terminate a pod.
  4. Monitoring: Use tools like Prometheus and Grafana to monitor how well your application is handling concurrency and to ensure that instances are disposable.
  5. Resource Limits and Requests: Setting resource limits and requests in Kubernetes impacts both concurrency and disposability, and feeds directly into autoscaling as well (see the fragment below).
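Pulling several of these together, here is a sketch of a pod spec fragment (the container name, image, health endpoint, and values are illustrative) combining an HTTP readiness probe, a termination grace period for graceful shutdown, and resource requests and limits:

spec:
  terminationGracePeriodSeconds: 30
  containers:
  - name: data-collector
    image: example/data-collector:1.0  # illustrative image
    readinessProbe:
      httpGet:
        path: /healthz
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 10
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"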

Next up

We set the stage for a topic that has not only captivated the attention of developers but also spawned an entire sub-discipline: Dev/Prod Parity. This subject is so seminal that it has given birth to a plethora of tools and platforms, and has even catalyzed the emergence of Platform Engineering as a distinct field, a topic that promises to be both interesting and, dare we say, transformational. Stay tuned!

And here is a set of restless tracks: My Soul Jean Vayat, Evelynka
