K8S-Schedule: Node Assignment & Affinity

Intro

Kubernetes scheduling plays a crucial role in determining how workloads are assigned to Nodes within a cluster. When scheduling resources to a Node, there are two primary mechanisms:

  1. Node Assignment (Explicit Selection) – Directly assigning Pods to a specific Node using either a Node’s name or labels.

  2. Node Affinity (Preference-Based Selection) – Steering Pods toward Nodes through label rules that are either required (hard) or merely preferred (soft).

Understanding these mechanisms is essential for optimizing workload placement and ensuring efficient resource utilization.

Demo

Node Assignment

To assign a Pod to a Node, we can use either the Node's name (nodeName) or its labels (nodeSelector). kubectl explain describes both fields:

# nodeName
$ kubectl explain pod.spec.nodeName
KIND:     Pod
VERSION:  v1

FIELD:    nodeName <string>

DESCRIPTION:
     NodeName indicates in which node this pod is scheduled. If empty, this pod
     is a candidate for scheduling by the scheduler defined in schedulerName.
     Once this field is set, the kubelet for this node becomes responsible for
     the lifecycle of this pod. This field should not be used to express a
     desire for the pod to be scheduled on a specific node.
     https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodename

# nodeSelector
$ kubectl explain pod.spec.nodeSelector
KIND:     Pod
VERSION:  v1

FIELD:    nodeSelector <map[string]string>

DESCRIPTION:
     NodeSelector is a selector which must be true for the pod to fit on a node.
     Selector which must match a node's labels for the pod to be scheduled on
     that node. More info:
     https://kubernetes.io/docs/concepts/configuration/assign-pod-node/

Name

Let’s define a Pod as follows:

apiVersion: v1
kind: Pod
metadata:
  name: pod-spec-node-name
spec:
  nodeName: kind-worker # Directly assigning the Pod to this Node
  containers:
    - name: nginx
      image: nginx-k8s:latest
      imagePullPolicy: Never

# Apply the manifest to create the Pod
$ kubectl apply -f pod_spec_node_name.yaml
pod/pod-spec-node-name created

# It is on the specified node!
$ kubectl get pods -o wide
NAME                 READY   STATUS    RESTARTS        AGE   IP           NODE           NOMINATED NODE   READINESS GATES
pod-spec-node-name   1/1     Running   0               6s    10.244.2.3   kind-worker    <none>           <none>
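
The kubectl explain output above warns that nodeName should not be used to express a desire for a specific Node, because setting it skips the scheduler entirely. A minimal sketch of an alternative is to select the well-known kubernetes.io/hostname label instead, which keeps the scheduler in the loop. The Pod name pod-spec-node-hostname is hypothetical, and the label value kind-worker is an assumption; confirm it with kubectl get node kind-worker --show-labels before relying on it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-spec-node-hostname # hypothetical name, not from the original demo
spec:
  nodeSelector:
    # Assumed label value: kubelet normally sets this to the Node's hostname
    kubernetes.io/hostname: kind-worker
  containers:
    - name: nginx
      image: nginx-k8s:latest # same locally loaded image as the demo above
      imagePullPolicy: Never
```

Unlike nodeName, this Pod still passes through the scheduler, so it stays Pending with a FailedScheduling event if the Node cannot accept it.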

Label

One of my worker Nodes, kind-worker2, carries a custom label, custom-label=worker2:

$ kubectl get node kind-worker2 --show-labels
NAME           STATUS   ROLES    AGE    VERSION   LABELS
kind-worker2   Ready    <none>   3d1h   v1.32.0   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,custom-label=worker2,kubernetes.io/arch=amd64,kubernetes.io/hostname=kind-worker2,kubernetes.io/os=linux

The following Pod uses nodeSelector to target that label:

apiVersion: v1
kind: Pod
metadata:
  name: pod-spec-node-label
spec:
  nodeSelector:
    custom-label: worker2 # Label that the target Node must carry
  containers:
    - name: nginx
      image: nginx-k8s:latest
      imagePullPolicy: Never

# Create the Pod
$ kubectl apply -f pod_spec_node_label.yaml
pod/pod-spec-node-label created

# We can see that it is running on kind-worker2!
$ kubectl get pods -o wide
NAME                  READY   STATUS    RESTARTS      AGE     IP           NODE           NOMINATED NODE   READINESS GATES
pod-spec-node-label   1/1     Running   0             8s      10.244.1.3   kind-worker2   <none>           <none>
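
A nodeSelector may list more than one label, in which case a Node must carry all of them to qualify. The sketch below combines the custom label with kubernetes.io/os=linux, which appears in the label listing for kind-worker2 above. The Pod name pod-spec-node-multi-label is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-spec-node-multi-label # hypothetical name
spec:
  nodeSelector:
    # Every entry must match: the entries are combined with a logical AND
    custom-label: worker2
    kubernetes.io/os: linux
  containers:
    - name: nginx
      image: nginx-k8s:latest
      imagePullPolicy: Never
```

nodeSelector only supports exact key/value matches; anything richer, such as "one of several values" or "label is absent", needs Node Affinity.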

Node Affinity

Node Affinity provides more advanced scheduling capabilities than nodeSelector. It supports set-based operators such as In, NotIn, and Exists, and it lets a rule be either required or merely preferred. There are two types of Node Affinity:

Hard Node Affinity (Strict Requirement)

Hard affinity ensures that a Pod must be scheduled on a Node that meets specific conditions. If no Nodes match the criteria, the Pod remains unscheduled.

Configuration using requiredDuringSchedulingIgnoredDuringExecution:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-strict-affinity-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: env
                operator: In
                values:
                  - prod
  containers:
    - name: nginx-node-affinity-hard
      image: nginx-k8s:latest
      imagePullPolicy: Never

This ensures that the Pod is only scheduled on Nodes labeled with env=prod. No Node in this cluster carries that label, so the Pod stays Pending:

# List the Pods
$ kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS      AGE   IP           NODE           NOMINATED NODE   READINESS GATES
nginx-strict-affinity-pod   0/1     Pending   0             21s   <none>       <none>         <none>           <none> 

# The events at the bottom explain why the Pod cannot be scheduled
$ kubectl describe pod nginx-strict-affinity-pod
Name:             nginx-strict-affinity-pod
Namespace:        default
Priority:         0
Service Account:  default
Node:             <none>
Labels:           <none>
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Containers:
  nginx-node-affinity-hard:
    Image:        nginx-k8s:latest
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mrxd7 (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  kube-api-access-mrxd7:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  56s   default-scheduler  0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
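
The rule above uses a single term with a single expression. As a hedged sketch of how richer rules combine: expressions inside one matchExpressions list must all hold (AND), while separate entries under nodeSelectorTerms are alternatives (OR). The Pod name and the maintenance label key below are hypothetical, chosen only for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-multi-term-affinity-pod # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          # Term 1: env=prod AND no "maintenance" label on the Node
          - matchExpressions:
              - key: env
                operator: In
                values:
                  - prod
              - key: maintenance # hypothetical label key
                operator: DoesNotExist
          # Term 2 (alternative to term 1): custom-label=worker2
          - matchExpressions:
              - key: custom-label
                operator: In
                values:
                  - worker2
  containers:
    - name: nginx-node-affinity-hard
      image: nginx-k8s:latest
      imagePullPolicy: Never
```

With the labels shown earlier, the second term is the one kind-worker2 would satisfy, so a Pod like this should not be left Pending for affinity reasons even though no Node has env=prod.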

Soft Node Affinity (Preferred Scheduling)

Soft affinity expresses a preference rather than a strict requirement. Kubernetes will attempt to schedule the Pod on a preferred Node, but if no matching Nodes are available, it will schedule the Pod elsewhere.

Configuration using preferredDuringSchedulingIgnoredDuringExecution:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-soft-affinity-pod
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: region
                operator: In
                values:
                  - eu-west
  containers:
    - name: nginx-node-affinity-soft
      image: nginx-k8s:latest
      imagePullPolicy: Never

In this case, Kubernetes prefers a Node labeled region=eu-west. Since no Node in this cluster has that label, the Pod is still scheduled, here on kind-worker:

$ kubectl apply -f pod_soft_affinity.yaml
pod/nginx-soft-affinity-pod created

$ kubectl get pods -o wide
NAME                        READY   STATUS    RESTARTS      AGE     IP           NODE           NOMINATED NODE   READINESS GATES
nginx-soft-affinity-pod     1/1     Running   0             5s      10.244.2.4   kind-worker    <none>           <none>
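
Hard and soft rules can live in the same Pod, and several preferences can be ranked against each other through weight, which accepts values from 1 to 100. The sketch below is one possible combination, not a recommended setting: the Pod name and the specific weights are assumptions for illustration, while the labels are the ones already used in this post.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-mixed-affinity-pod # hypothetical name
spec:
  affinity:
    nodeAffinity:
      # Hard rule: only Linux Nodes are eligible at all
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                  - linux
      # Soft rules: each matching preference adds its weight to the Node's score
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 80 # illustrative value
          preference:
            matchExpressions:
              - key: region
                operator: In
                values:
                  - eu-west
        - weight: 20 # illustrative value
          preference:
            matchExpressions:
              - key: custom-label
                operator: In
                values:
                  - worker2
  containers:
    - name: nginx-node-affinity-mixed
      image: nginx-k8s:latest
      imagePullPolicy: Never
```

The affinity weights are only one input to the scheduler's scoring, so a higher total makes a Node more likely to be chosen but does not guarantee it.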
