Kubernetes
- The smallest unit is the pod.
- Usually one pod has one container.
Commands
kubectl get <type>
- `kubectl get all`: Show everything.
- `kubectl get pods`: Show pods.
- `kubectl get pods --watch`: Show pods and keep watching for updates.
- `kubectl get pods --selector app=frontend,env=prod`: Show pods with the labels `app=frontend` and `env=prod`.
- `kubectl get pod nginx -o yaml > nginx.yml`: Export the definition of a pod to manually edit it and recreate it.
- `kubectl get pods -o custom-columns="NAME:.metadata.name,PRIORITY:.spec.priorityClassName"`: Show custom columns.
- `kubectl get <type> -o wide`: Show more info.
- `kubectl get replicationcontroller`: Show the status of Replication Controllers (OLD).
- `kubectl get replicaset`: Show the status of ReplicaSets.
- `kubectl get deployments`: Show deployments.
- `kubectl get services`: Show services.
- `kubectl get nodes -o wide`: Show the available Kubernetes nodes with their IPs.
- `kubectl get events -o wide`: Show events.
- `kubectl get pods -n kube-system`: Show system pods.
- `kubectl get all --all-namespaces`: Show everything everywhere all at once.
- `kubectl get all -A`: Short form of the above.
kubectl describe <type> <name>
- `kubectl describe pod nginx`: Show detailed information for a pod.
- `kubectl describe service kubernetes`: Show detailed information for the default service.
- `kubectl describe daemonsets monitoring-daemon`: Show detailed information for a daemonset.
- `kubectl describe ds kube-proxy -n kube-system`: Show detailed information for the `kube-proxy` daemonset in the `kube-system` namespace.
kubectl create
- `kubectl create deployment nginx --image=nginx --replicas=3`: Create a deployment with 3 replicas.
- `kubectl create deployment nginx --image=nginx --dry-run=client -o yaml > deployment.yml`: Create a definition file for a deployment.
- `kubectl create -f file.yml`: Create a resource from a file.
- `kubectl create service loadbalancer webapp-service --tcp=30080:80`: Create a service from the command line.
- To create a DaemonSet you can generate a Deployment definition and edit it, as sketched below.
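A minimal sketch of that workflow (names are illustrative):

```sh
## Generate a Deployment manifest without creating anything
kubectl create deployment monitoring-daemon --image=monitoring-agent --dry-run=client -o yaml > daemonset.yml
## Edit daemonset.yml: change `kind: Deployment` to `kind: DaemonSet`
## and remove the `replicas`, `strategy` and `status` fields.
kubectl create -f daemonset.yml
```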
- `kubectl delete pod nginx`: Delete a pod.
- `kubectl apply -f file.yml`: Apply configuration changes to a pod from a file.
Edit
- `kubectl edit pod <name>`: Edit the definition of a pod.
- `kubectl edit replicaset <name>`: Edit the definition of a ReplicaSet.
- `kubectl delete <type> <name>`: Delete an object.
- `kubectl delete pod nginx`: Delete a pod.
- `kubectl apply -f file.yml`: Apply configuration changes from a file. Creates the objects if they don't exist.
- `kubectl apply -f /path/to/dir/`: Apply all the files in a directory at once.
- `kubectl replace -f file.yml`: Replace currently running objects. If they don't exist, it will give an error. Useful to edit pod specs.
Replicas
- `kubectl replace --force -f file.yml`: Delete and recreate objects based on a file.
- `kubectl scale --replicas=6 -f file.yml`: Scale a ReplicaSet. Does not modify the file.
- `kubectl scale --replicas=5 <type> <name>`: Scale a ReplicaSet.
- `kubectl scale --replicas=5 replicaset <name-of-replicaset>`: Scale a ReplicaSet.
- `kubectl delete replicaset <name>`: Delete a ReplicaSet. It will also delete its pods.
- `kubectl explain replicaset`: Explain what a ReplicaSet is (help).
- `kubectl get replicaset`: Show all ReplicaSets.
- `kubectl describe replicaset`: Show more information.
- `kubectl rollout status deployment/my-deployment`: Show the status of a rollout.
- `kubectl rollout history deployment/my-deployment`: Show the revision history of a deployment.
- `kubectl rollout undo deployment/my-deployment`: Roll back the changes made in a deployment.
Deployments
- `kubectl get deployments`: Show deployments.
Run
- `kubectl run nginx --image nginx:alpine`: Create and start a pod named nginx with the image nginx.
- `kubectl run busybox --image busybox --dry-run=client -o yaml --command -- sleep 1000`: Generate a pod manifest file but don't create the pod. The command option MUST be last.
- `kubectl run redis --image=redis:alpine --labels="app=redis,env=prod"`: Create and start a pod with labels.
- `kubectl run custom-nginx --image=nginx --port=8080`: Tell Kubernetes which port the pod will have open.
Create
- `kubectl create deployment --image=nginx nginx`
- `kubectl create deployment --image=nginx nginx --dry-run=client -o yaml`
- `kubectl create deployment --image=nginx nginx --dry-run=client -o yaml > nginx-deployment.yaml`
- `kubectl create deployment --image=nginx nginx --replicas=4 --dry-run=client -o yaml > nginx-deployment.yaml`
- `kubectl create deployment httpd-frontend --image=httpd:2.4-alpine --replicas=3`
Expose
- `kubectl expose pod nginx --port=80 --name=nginx-service`: Create a service for a pod.
- `kubectl expose deployment nginx --port=80 --target-port=8000`: Create a service for an nginx deployment.
Run
- `kubectl run nginx --image nginx`: Create and start a pod named nginx with the image nginx.
- `kubectl run nginx --image nginx --dry-run=client -o yaml`: Generate a pod manifest file but don't create the pod.
Namespaces
- `kubectl get namespaces`: Get all existing namespaces.
- `kubectl get ns`: Short form to get all existing namespaces.
- `kubectl create namespace dev`: Create a new namespace.
- `kubectl get all --namespace=kube-system`: Show everything in another namespace.
- `kubectl get pod -n=dev`: Short form to show pods in another namespace.
- `kubectl get all --all-namespaces`: Show everything.
- `kubectl create -f file.yml --namespace=dev`: Create a resource in another namespace.
- `kubectl config set-context $(kubectl config current-context) --namespace=dev`: Change the current namespace you are working in.
Expose
- `kubectl expose deployment nginx --port 80`: Create a service to expose a deployment.
Set
- `kubectl set image deployment nginx nginx=nginx:1.9.1`: Change the image used in a deployment.
- `kubectl set image deployment/my-deployment nginx=nginx:1.9.1 --record`: Change the image and record the command in the rollout history.
Metrics
- `kubectl top node`: Show resource consumption for each node.
- `kubectl top pod`: Show resource consumption for each pod.
Logs
- `kubectl logs -f <pod>`: Show logs for a pod with a single container.
- `kubectl logs -f <pod> <container>`: Show logs for a container running in a pod that has multiple containers.
Help
- `kubectl api-resources`: Show all the resource names available.
- `kubectl explain <resource-name>`: Show generic documentation about that resource.
- `kubectl explain <resource-name>.<field>`: Show specific documentation about that resource's field.
- `kubectl explain <resource> --recursive`: Show a comprehensive list of all the fields for a given resource.
- `kubectl explain pod.spec.containers`
Kubernetes Objects
Services
Allow internal and external communication between pods.
The Endpoints are the pods to which the service redirects traffic.
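To check which pods a service is sending traffic to, you can inspect its endpoints (a minimal sketch; `myapp-service` is the NodePort example defined below):

```sh
kubectl get endpoints myapp-service
kubectl describe service myapp-service   ## the Endpoints field lists the pod IPs
```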
NodePort
```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  type: NodePort
  ports:
    - targetPort: 80
      port: 80
      nodePort: 30008
  selector:
    app: myapp
    type: frontend
```
- The only mandatory field is `port`.
- `port`: The port of the service itself.
- `targetPort`: Port of the pod. By default, if not provided, it is the same as `port`.
- `nodePort`: Port mapped on the host. If not defined, it will be mapped to a port between 30000 and 32767.
- `selector` is the way the service attaches to the pods (via their labels).
- If there are multiple pods, the service acts as a load balancer with random assignment but with session affinity.
- If the pods are on different nodes, Kubernetes handles everything itself and you will be able to access the pods from the IP of any node, as shown below.
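For example, with the `nodePort: 30008` from the manifest above, the service is reachable on that port of every node (the address is illustrative):

```sh
curl http://192.168.1.10:30008   ## any node's IP works, even if the pod runs elsewhere
```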
ClusterIP
For allowing inter-pod communication. It creates a common service for multiple pods.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  type: ClusterIP
  ports:
    - targetPort: 80
      port: 80
  selector:
    app: myapp
    type: backend
```
- ClusterIP is the default type of Service.
- `targetPort`: the port of the pod.
- `port`: the port of the service.
- Other pods can access the service using the cluster IP or the service name.
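A quick sketch of how another pod reaches the `backend` service above (assuming it lives in the `default` namespace):

```sh
## From inside any pod in the same namespace:
curl http://backend                               ## by service name
curl http://backend.default.svc.cluster.local     ## fully qualified DNS name
```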
LoadBalancer
- Needs a load balancer (typically provided by the cloud platform) to work.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  type: LoadBalancer
  ports:
    - targetPort: 80
      port: 80
  selector:
    app: myapp
    type: backend
```
Namespaces
- 'default': the namespace used when no namespace is specified. Created by default.
- 'kube-system': namespace for internal services needed by Kubernetes.
- 'kube-public': namespace for resources that should be available to all users.
Each namespace can have its own rules, quotas, and resources.
You can connect to the service `backend` in another namespace `dev` through its DNS name: `backend.dev.svc.cluster.local`.
Create a new namespace called dev
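Either imperatively:

```sh
kubectl create namespace dev
```

Or declaratively, with a manifest:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dev
```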
Simple pod in dev namespace
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: dev
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
```
ResourceQuota
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: 5Gi
    limits.cpu: "10"
    limits.memory: 10Gi
```
Files
apiVersion
| Kind | Version |
|---|---|
| Pod | v1 |
| Service | v1 |
| ReplicaSet | apps/v1 |
| Deployment | apps/v1 |
Examples
Simple pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
```
Complex pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels: ## As many as you want.
    app: myapp
    type: frontend
  annotations:
    buildversion: "1.34" ## Annotation values must be strings.
spec:
  containers:
    - name: frontend-nginx
      image: nginx:1.14.2
      ports:
        - containerPort: 80
  tolerations:
    - key: "app"
      operator: "Equal"
      value: "blue"
      effect: "NoSchedule"
  priorityClassName: high-priority
  schedulerName: my-custom-scheduler
```
ReplicationController (OLD)
```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: myapp-rc
  labels:
    app: myapp
    type: frontend
spec:
  template: ## Same as the Pod definition
    metadata:
      name: nginx
      labels:
        app: myapp
        type: frontend
    spec:
      containers:
        - name: frontend-nginx
          image: nginx
          ports:
            - containerPort: 80
  replicas: 3 ## In YAML, this is a child of spec.
```
ReplicaSet
```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-replicaset
  labels:
    app: myapp
    type: frontend
spec:
  template: ## Same as the Pod definition
    metadata:
      name: nginx
      labels:
        app: myapp
        type: frontend
    spec:
      containers:
        - name: frontend-nginx
          image: nginx
          ports:
            - containerPort: 80
  replicas: 3
  selector: ## Main difference with ReplicationController:
            ## a ReplicaSet can manage containers not created by the ReplicaSet.
    matchLabels:
      type: frontend ## Labels of the Pods, not the labels of the ReplicaSet.
```
The labels under `metadata` are the labels of the ReplicaSet. The labels under `spec.template.metadata` are the labels of the pods that will be created. And the labels under `spec.selector` are the labels the ReplicaSet will look for to monitor the pods. You must be careful and use only the required labels in the selector.
Deployments
A higher-level abstraction than ReplicaSets: multiple instances, downloading updated images, rolling updates, rollbacks and cohesive updates. The definition is the same as a ReplicaSet, but changing the kind. A Deployment creates a ReplicaSet.
```yaml
apiVersion: apps/v1
kind: Deployment ## Case sensitive!
metadata:
  name: myapp-replicaset
  labels:
    app: myapp
    type: frontend
spec:
  template: ## Same as the Pod definition
    metadata:
      name: nginx
      labels:
        app: myapp
        type: frontend
    spec:
      containers:
        - name: frontend-nginx
          image: nginx
          ports:
            - containerPort: 80
  replicas: 3
  selector:
    matchLabels:
      type: frontend
```
Services
Allow internal and external communication.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: frontend
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8376
```
DaemonSets
Runs a copy of the pod on each node of the cluster. Whenever a new node is added to the cluster, a replica of the pod is added to the new node, so one copy of the pod is always present on all nodes. Useful for monitoring, logs, kube-proxy or networking (Calico). The kube-scheduler has no effect on these pods.
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: monitoring-daemon
spec:
  selector:
    matchLabels:
      app: monitoring-agent
  template:
    metadata:
      labels:
        app: monitoring-agent
    spec:
      containers:
        - name: monitoring-agent
          image: monitoring-agent
```
Scheduling
The way a pod is assigned to a node. If there is no scheduler, the pod stays in the "Pending" state.
Manual
Manually set the Kubernetes node you want the pod to run on with the `nodeName` field and recreate the pod with `kubectl replace --force -f file.yml`.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeName: node02
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
```
Move a pod WIP
Create a binding definition with the destination node, as sketched below.
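A minimal sketch of such a Binding object (reusing the pod and node names from the manual scheduling example above):

```yaml
apiVersion: v1
kind: Binding
metadata:
  name: nginx
target:
  apiVersion: v1
  kind: Node
  name: node02
```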
And POST it, converted to JSON, to the pod's binding subresource:

```sh
curl --header "Content-Type: application/json" --request POST \
  --data '{ HERE_GOES_THE_SAME_DATA_BUT_IN_JSON_FORMAT_ }' \
  http://$SERVER/api/v1/namespaces/default/pods/$POD_NAME/binding/
```
Multiple schedulers
You can write your own scheduler with custom checks.
https://kubernetes.io/docs/tasks/extend-kubernetes/configure-multiple-schedulers/.
You define the scheduler in the pod definition.
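A minimal sketch (assuming a custom scheduler deployed with the name used below):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  schedulerName: my-custom-scheduler ## Instead of the default scheduler
  containers:
    - name: nginx
      image: nginx
```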
You can check which scheduler was used to deploy a container with `kubectl get events -o wide`.
You can also check the logs of the scheduler with `kubectl logs my-custom-scheduler --namespace=kube-system`.
Scheduler profiles
When a pod is created, it goes into a scheduling queue, where pods are sorted by priority. Then nodes that cannot run the pod are filtered out. Then the remaining nodes are scored based on different weights, such as the free space left on the node after placing the pod. Finally, the pod is bound to the node with the highest score.
This is achieved with the following plugins:
- Sorting: PrioritySort
- Filtering: NodeResourcesFit, NodeName, NodeUnschedulable, TaintToleration, NodePorts, NodeAffinity
- Scoring: NodeResourcesFit, ImageLocality, TaintToleration, NodeAffinity
- Binding: DefaultBinder
There are also preFilter, postFilter, preScore, reserve, permit, preBind and postBind extension points.
You can configure multiple profiles with a single scheduler, as sketched below. For more info see Scheduler Configuration in the Kubernetes docs.
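A minimal sketch following the Kubernetes docs (the second profile name is illustrative; it disables all scoring plugins):

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
  - schedulerName: no-scoring-scheduler
    plugins:
      score:
        disabled:
          - name: '*'
```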
Taint
Kubernetes docs - taint and toleration
A taint restricts which pods can be scheduled on a node. A toleration allows a pod to be scheduled on a tainted node, as long as the toleration matches the taint. A tolerant pod is not tied to the tainted node: it can be scheduled on any untainted node as well as on the tainted node it tolerates.
- `kubectl taint nodes node01 app=myapp:NoSchedule`
- `kubectl taint nodes node-name key=value:taint-effect`, where `taint-effect` is `NoSchedule` to not place new pods on the node, `PreferNoSchedule` to prefer not to place them on the node, or `NoExecute` to also evict (kill) existing pods on the node.
- `kubectl taint nodes controlplane node-role.kubernetes.io/control-plane-`: remove the taint on the controlplane node (the trailing `-` removes a taint).
You can see the taints on a node with `kubectl describe node node1`.
Node selectors
`kubectl label nodes node-name key=value`, like `kubectl label nodes node01 size=Large`. Node selectors do not support advanced logic like "place the pod on a large or medium node but never on a small one". That is achieved with node affinity.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  nodeSelector:
    size: Large
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
```
Node affinity
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: size
                operator: In
                values:
                  - Large
                  - Medium
```
Alternatively, the operator can be `NotIn` (e.g. to keep the pod off `Small` nodes) or `Exists` (which only checks that the label key is present and takes no `values`), as sketched below.
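A sketch of those alternatives; only the `matchExpressions` entries change with respect to the manifest above:

```yaml
## Exclude small nodes instead:
- key: size
  operator: NotIn
  values:
    - Small
## Or only require the label key to be present:
- key: size
  operator: Exists
```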
Node affinity types:
- requiredDuringSchedulingIgnoredDuringExecution: when first created will only be placed on the specified nodes. If a matching node does not exist, the pod will not be scheduled.
- preferredDuringSchedulingIgnoredDuringExecution: when first created will try to place the pod on the specified nodes.
You can dry-run the creation of an object and then manually change the labels: `k create deployment --dry-run=client red --image=nginx --replicas=2 -o yaml > red.yml`.
Affinity is applied to a pod: it is defined under the pod's `spec.affinity`.
Resource limits
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
      resources:
        requests:
          memory: "1Gi"
          cpu: 2
        limits:
          memory: "2Gi"
          cpu: 4
```
Kubernetes does not have any request or limit by default. Requests are the minimum required specs for the container. Limits are the maximum allowed. They are defined per container inside the pod.
There are four ways to combine requests and limits:
- No requests, no limits. Bad because one pod can suffocate the rest.
- Limits but no requests. Bad because there will be unused resources, and overprovisioning can suffocate a container.
- Limits and requests. Bad because there will be unused resources.
- Requests only. This is the best option because it guarantees a certain amount of resources for each container while letting containers use whatever is free. See the sketch below.
This does not mean that limits are not useful.
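A minimal sketch of the requests-only pattern (image and values are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests: ## Guaranteed minimum; with no limits the container may use spare capacity.
          cpu: "500m"
          memory: "256Mi"
```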
1 CPU is 1 vCPU (thread). You can specify 1 CPU as 1000m and go as low as 1m.
Memory can be 256Mi, 268M, 1G, 1Gi or 512K.
If the container inside the pod exceeds its CPU limit, the process is throttled. If it exceeds its memory limit, the OOM killer kicks in and terminates the container.
LimitRange
```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-resource-constrain
spec:
  limits:
    - default:
        cpu: 500m
      defaultRequest:
        cpu: 500m
      max:
        cpu: "1"
      min:
        cpu: 100m
      type: Container
```
Resource Quotas
They are applied to namespaces.
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: my-resource-quota
spec:
  hard:
    requests.cpu: 4
    requests.memory: 4Gi
    limits.cpu: 10
    limits.memory: 10Gi
```
Static Pods
The kubelet can manage pods on a single node without a master node. It can create static pods from definition files stored at /etc/kubernetes/manifests/ (set with the `--pod-manifest-path=` option or, in the kubelet configuration file, `staticPodPath`).
Useful for deploying master nodes; it is what kubeadm uses. The kube-scheduler has no effect on these pods. The name of the node is appended to the name of the pod, `kube-scheduler-node1` for example.
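A quick way to locate the static pod directory on a node (paths assumed from a typical kubeadm setup):

```sh
ps aux | grep kubelet                              ## find the --config flag of the kubelet
grep staticPodPath /var/lib/kubelet/config.yaml    ## usually /etc/kubernetes/manifests
```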
Priority classes
Priority for different workloads, expressed as a range of numbers: the larger the number, the greater the priority. System workloads have a higher priority than the rest. If a higher-priority pod cannot be scheduled, the scheduler can terminate (preempt) a lower-priority workload. Priority classes are not attached to a namespace. The range from 2 billion down to 1 billion is reserved for the system; from 1 billion down to negative 2 billion is for everything else. By default the priority of a pod is 0, and there may not be any priority classes defined.
- `kubectl get priorityclass`: Show the existing priority classes.
```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000000
description: "Priority class for mission critical pods"
## Use this priority by default for future pods.
# globalDefault: true
## What to do when there are no resources left and a higher priority job gets scheduled.
## The default is PreemptLowerPriority.
# preemptionPolicy: Never
```
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
  priorityClassName: high-priority
```
Logging
The kubelet has a component called cAdvisor, which is responsible for retrieving performance metrics from pods and exposing them over the Kubernetes API.
Application Lifecycle Management
etcd
Key-value store. Handles all of the data for the cluster. Can be set up as a distributed system.
- `etcdctl get / --prefix --keys-only`: Get all the keys in the database.
- `etcdctl snapshot save`
- `etcdctl endpoint health`
- `etcdctl get`
- `etcdctl put`
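A fuller sketch of saving a snapshot (endpoint and certificate paths assumed from a kubeadm setup; adjust to your cluster):

```sh
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
```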
Tools
- Kompose conversion tool for Docker Compose to container orchestrators
- ContainerSSH launches a new container for each SSH connection
Links
- https://kubernetes.io/docs/concepts/
- Rancher K3s orchestrator.
- Harbor Registry for images and Helm charts
- Talos Linux