Kubernetes Ugly Commands: Check for RWO Persistent Volume Node Attachment Issues

Shea Stewart
Published in ITNEXT
4 min read · Jul 26, 2023

Occasionally a Kubernetes pod with persistent storage (access mode ReadWriteOnce) won’t start. In some cases the cause is that the pod has moved to a different node but the PersistentVolume could not follow it. This command helps identify whether that is the case.

Ugly Command Name: Check for RWO Persistent Volume Node Attachment Issues

What does it do?

Searches through a namespace and finds pods with RWO persistent volumes attached. For each one, it outputs the node that the pod is scheduled on along with the node that the volume is attached to. The two should be the same, and the command prints a message saying whether they match or there is a mismatch.

When would you use it?

Rarely. Occasionally, though, a pod fails to be rescheduled because its persistent volume can’t detach from the old node and reattach to the new node (where the pod is now scheduled). This command quickly highlights any such mismatches within the namespace.
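
Before running the full check, the events in the namespace can hint at whether a stuck pod is hitting this kind of problem. Here is a minimal sketch, assuming the failure is reported as an event with the reason FailedAttachVolume (the reason typically attached to Multi-Attach errors). The function name attach_failures is hypothetical.

```shell
# Hypothetical helper: list volume attach failure events in a namespace.
# $1 = namespace, $2 = kubectl context
attach_failures() {
  kubectl get events -n "$1" --context="$2" \
    --field-selector reason=FailedAttachVolume \
    -o custom-columns=OBJECT:.involvedObject.name,MESSAGE:.message \
    --no-headers
}
```

Call it as attach_failures "$NAMESPACE" "$CONTEXT". An empty result does not rule out an attachment problem, since events expire after a while.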

What is the command?

  • In a single line (it’s a long one!):
NAMESPACE="${NAMESPACE}"; CONTEXT="${CONTEXT}"; PODS=$(kubectl get pods -n $NAMESPACE --context=$CONTEXT -o json); for pod in $(jq -r '.items[] | @base64' <<< "$PODS"); do _jq() { jq -r ${1} <<< "$(base64 --decode <<< ${pod})"; }; POD_NAME=$(_jq '.metadata.name'); POD_NODE_NAME=$(kubectl get pod $POD_NAME -n $NAMESPACE --context=$CONTEXT -o custom-columns=:.spec.nodeName --no-headers); PVC_NAMES=$(kubectl get pod $POD_NAME -n $NAMESPACE --context=$CONTEXT -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}'); for pvc_name in $PVC_NAMES; do PVC=$(kubectl get pvc $pvc_name -n $NAMESPACE --context=$CONTEXT -o json); ACCESS_MODE=$(jq -r '.spec.accessModes[0]' <<< "$PVC"); if [[ "$ACCESS_MODE" == "ReadWriteOnce" ]]; then PV_NAME=$(jq -r '.spec.volumeName' <<< "$PVC"); STORAGE_NODE_NAME=$(jq -r --arg pv "$PV_NAME" '.items[] | select(.status.volumesAttached != null) | select(.status.volumesInUse[] | contains($pv)) | .metadata.name' <<< "$(kubectl get nodes --context=$CONTEXT -o json)"); echo "-----"; if [[ "$POD_NODE_NAME" == "$STORAGE_NODE_NAME" ]]; then echo "OK: Pod and Storage Node Matched"; else echo "Error: Pod and Storage Node Mismatched - If the issue persists, the node requires attention."; fi; echo "Pod: $POD_NAME"; echo "PVC: $pvc_name"; echo "PV: $PV_NAME"; echo "Node with Pod: $POD_NODE_NAME"; echo "Node with Storage: $STORAGE_NODE_NAME"; echo; fi; done; done
  • In multiple lines with some comments:
# NAMESPACE and CONTEXT are placeholders: set them to your namespace and kubectl context before running
NAMESPACE="${NAMESPACE}";
CONTEXT="${CONTEXT}";

# Retrieve all pods in the namespace and store the JSON in the PODS variable
PODS=$(kubectl get pods -n $NAMESPACE --context=$CONTEXT -o json);

# Main loop: iterate over every pod (base64-encoded so each pod is a single word) and run the checks
for pod in $(jq -r '.items[] | @base64' <<< "$PODS"); do
  # Helper function that returns the value of the requested JSON field for the current pod
  _jq() {
    jq -r "${1}" <<< "$(base64 --decode <<< "${pod}")";
  };

  # Get the pod name from the stored JSON
  POD_NAME=$(_jq '.metadata.name');

  # Get the name of the node where that pod is scheduled
  POD_NODE_NAME=$(kubectl get pod $POD_NAME -n $NAMESPACE --context=$CONTEXT -o custom-columns=:.spec.nodeName --no-headers);

  # Retrieve the names of any PVCs associated with the pod
  PVC_NAMES=$(kubectl get pod $POD_NAME -n $NAMESPACE --context=$CONTEXT -o jsonpath='{.spec.volumes[*].persistentVolumeClaim.claimName}');

  # Iterate through the list of PVCs, if any
  for pvc_name in $PVC_NAMES; do

    # Get the details of the PVC in JSON format
    PVC=$(kubectl get pvc $pvc_name -n $NAMESPACE --context=$CONTEXT -o json);

    # Extract the (first) access mode from the PVC details
    ACCESS_MODE=$(jq -r '.spec.accessModes[0]' <<< "$PVC");

    # Only check PVCs whose access mode is RWO
    if [[ "$ACCESS_MODE" == "ReadWriteOnce" ]]; then

      # Extract the PV name from the retrieved PVC details
      PV_NAME=$(jq -r '.spec.volumeName' <<< "$PVC");

      # Look up the node whose status reports a volume in use whose name contains the PV name
      STORAGE_NODE_NAME=$(jq -r --arg pv "$PV_NAME" '.items[] | select(.status.volumesAttached != null) | select(.status.volumesInUse[] | contains($pv)) | .metadata.name' <<< "$(kubectl get nodes --context=$CONTEXT -o json)");

      # Print a record separator
      echo "-----";

      # Check whether the pod node and the storage node match
      if [[ "$POD_NODE_NAME" == "$STORAGE_NODE_NAME" ]]; then
        echo "OK: Pod and Storage Node Matched";
      else
        echo "Error: Pod and Storage Node Mismatched - If the issue persists, the node requires attention.";
      fi;

      # Print additional useful information
      echo "Pod: $POD_NAME";
      echo "PVC: $pvc_name";
      echo "PV: $PV_NAME";
      echo "Node with Pod: $POD_NODE_NAME";
      echo "Node with Storage: $STORAGE_NODE_NAME";
      echo;
    fi;
  done;
done
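
The node lookup in the command relies on the volume names reported in each node's status containing the PV name. As a cross-check, here is a minimal sketch that reads the cluster's VolumeAttachment objects instead. It assumes the volume is managed by a CSI driver (VolumeAttachment objects are used for CSI-managed volumes), and the function name pv_attached_nodes is hypothetical.

```shell
# Hypothetical helper: print the node(s) a PV is attached to, according to
# the cluster's VolumeAttachment objects (assumes a CSI-managed volume).
# $1 = PV name, $2 = kubectl context
pv_attached_nodes() {
  kubectl get volumeattachments --context="$2" --no-headers \
    -o custom-columns=PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName,ATTACHED:.status.attached \
    | awk -v pv="$1" '$1 == pv && $3 == "true" { print $2 }'
}
```

Call it as pv_attached_nodes "$PV_NAME" "$CONTEXT". For an RWO volume you would normally expect a single node in the output. A node other than the one the pod is scheduled on could point to a detach that has not completed.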

What is some sample output?

-----
OK: Pod and Storage Node Matched
Pod: jenkins-0
PVC: jenkins
PV: pvc-ba67e928-8e9b-4a9a-be69-4f286e2e6af5
Node with Pod: gke-sandbox-cluster--sandbox-cluster--744aa9b5-5mvt
Node with Storage: gke-sandbox-cluster--sandbox-cluster--744aa9b5-5mvt
-----
OK: Pod and Storage Node Matched
Pod: jenkins2-0
PVC: jenkins2
PV: pvc-0f49efcb-f71e-4722-b0f2-c13ed7d647ea
Node with Pod: gke-sandbox-cluster--sandbox-cluster--744aa9b5-4fbx
Node with Storage: gke-sandbox-cluster--sandbox-cluster--744aa9b5-4fbx
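
In a namespace with many pods, usually only the mismatches are of interest. Here is a minimal sketch of a filter that keeps just the records flagged with "Error:", assuming the output format shown above, where each record starts with a "-----" line. The function name only_mismatches is hypothetical.

```shell
# Hypothetical helper: read the command's output on stdin and print only
# the records (blocks starting with "-----") that contain "Error:".
only_mismatches() {
  awk '
    # A separator line starts a new record: flush the previous one first
    /^-----$/ { if (block ~ /Error:/) printf "%s", block; block = "" }
    # Accumulate every line into the current record
    { block = block $0 "\n" }
    # Flush the final record
    END { if (block ~ /Error:/) printf "%s", block }
  '
}
```

Pipe the output of the command into only_mismatches. No output means every RWO volume found was reported on the same node as its pod.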

What does it need?

  • kubectl
  • jq
  • base64
  • Permissions to list pod, pvc, pv, and node details
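
A quick preflight check can save a confusing failure halfway through the loop. Here is a minimal sketch that uses command -v to look for the tools and kubectl auth can-i to test the permissions. The function names missing_tools and missing_permissions are hypothetical.

```shell
# Hypothetical helper: print the name of each given tool that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Hypothetical helper: print each resource the current user may not list.
# $1 = namespace, $2 = kubectl context
missing_permissions() {
  for resource in pods persistentvolumeclaims persistentvolumes nodes; do
    kubectl auth can-i list "$resource" -n "$1" --context="$2" >/dev/null 2>&1 \
      || echo "$resource"
  done
}
```

Run missing_tools kubectl jq base64 and then missing_permissions "$NAMESPACE" "$CONTEXT". If both print nothing, the requirements above appear to be met.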

If you would like this command tailored for your environment and ready to copy & paste, this command has been added to the open source library of commands available through RunWhen Local. Check it out here: https://docs.runwhen.com/public/runwhen-local/getting-started/running-locally

Find an example of the command here: https://runwhen-local.sandbox.runwhen.com/jenkins/jenkins-PVC-Healthcheck/#check-for-rwo-persistent-volume-node-attachment-issues

Have an ugly command to share? Collaborate with us on GitHub through issues or discussions.

This is part of a series. Check out this article to see additional ugly commands posted in the series.

Technologist working in SRE|DevOps|Platform|QA|Customer Engineering Capacities @ RunWhen