AlmaLinux Container Orchestration & Kubernetes Complete Guide
Ready to command a fleet of containers like a digital admiral? This comprehensive guide will transform you into a Kubernetes master, covering everything from basic container orchestration to advanced cluster management that powers the world's largest applications!
Container orchestration isn't just about running containers; it's about creating an intelligent, self-healing infrastructure that automatically manages application deployment, scaling, and operations across multiple servers. Let's build a container empire that scales!
Why is Container Orchestration Important?
Imagine trying to manage thousands of containers manually; it's like conducting a symphony blindfolded! Here's why Kubernetes orchestration is revolutionary:
- Automatic Scaling: Your applications grow and shrink based on demand automatically
- Self-Healing: Failed containers are automatically replaced and restarted
- Load Distribution: Traffic is intelligently distributed across healthy containers
- Simplified Deployment: Deploy complex applications with simple YAML files
- Storage Orchestration: Automatically mount and manage persistent storage
- Rolling Updates: Update applications with zero downtime
- Secret Management: Securely manage passwords, tokens, and certificates
- Resource Optimization: Maximize hardware utilization across your cluster
What You Need
Before we embark on this Kubernetes adventure, let's make sure you have everything ready:
- AlmaLinux servers (minimum 3 nodes for a proper cluster!)
- 2+ CPU cores per node (Kubernetes needs some computing power)
- 4GB+ RAM per node (more is better for container workloads)
- 20GB+ disk space per node (for system, containers, and data)
- Network connectivity between all nodes (they need to talk to each other!)
- Root or sudo access (needed for cluster setup and management)
- Basic container knowledge (we'll build on Docker concepts)
- Adventure spirit (we're building something incredible!)
Step 1: Preparing the Kubernetes Environment
Let's prepare our AlmaLinux servers for Kubernetes greatness! Think of this as preparing the stage before the grand performance.
# Disable swap (Kubernetes requirement)
sudo swapoff -a
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# Kubernetes requires swap to be disabled for proper operation
# Configure system settings
cat << EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
# Load kernel modules
sudo modprobe overlay
sudo modprobe br_netfilter
# Set networking parameters
cat << EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply sysctl settings
sudo sysctl --system
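To confirm the modules and sysctl settings actually took effect before continuing, a quick verification sketch (nothing here is AlmaLinux-specific):
# Verify the kernel modules are loaded
lsmod | grep -E 'overlay|br_netfilter'
# Verify the networking parameters (all three should print 1)
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward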
Install container runtime (containerd):
# Install containerd
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf install -y containerd.io
# Configure containerd
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
# Enable systemd cgroup driver
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# Start and enable containerd
sudo systemctl enable --now containerd
# Verify containerd is running
sudo systemctl status containerd
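It is also worth confirming that the cgroup driver change landed in the config file; this is just a sanity-check sketch:
# Confirm the systemd cgroup driver is enabled in containerd's config
grep SystemdCgroup /etc/containerd/config.toml   # should show: SystemdCgroup = true
# Restart containerd if you edited the config after it was already running
sudo systemctl restart containerd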
Create a cluster preparation script:
# Create cluster preparation script
sudo nano /usr/local/bin/k8s-prep.sh
#!/bin/bash
echo "๐ข KUBERNETES CLUSTER PREPARATION"
echo "================================="
# Function to check prerequisites
check_prerequisites() {
echo "๐ Checking prerequisites..."
# Check CPU cores
CORES=$(nproc)
if [ $CORES -lt 2 ]; then
echo "โ ๏ธ Warning: Only $CORES CPU cores detected. Kubernetes recommends 2+"
else
echo "โ
CPU cores: $CORES"
fi
# Check memory
MEMORY_GB=$(free -g | awk 'NR==2{print $2}')
if [ $MEMORY_GB -lt 4 ]; then
echo "โ ๏ธ Warning: Only ${MEMORY_GB}GB RAM detected. Kubernetes recommends 4GB+"
else
echo "โ
Memory: ${MEMORY_GB}GB"
fi
# Check swap status
if [ $(swapon --show | wc -l) -eq 0 ]; then
echo "โ
Swap is disabled"
else
echo "โ Swap is enabled - Kubernetes requires swap to be disabled"
return 1
fi
# Check containerd
if systemctl is-active --quiet containerd; then
echo "โ
Containerd is running"
else
echo "โ Containerd is not running"
return 1
fi
}
# Function to install Kubernetes components
install_kubernetes() {
echo "๐ฆ Installing Kubernetes components..."
# Add Kubernetes repository
cat << EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
# Install Kubernetes components
sudo dnf install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# Enable kubelet service
sudo systemctl enable kubelet
echo "โ
Kubernetes components installed"
}
# Function to configure firewall
configure_firewall() {
echo "๐ฅ Configuring firewall..."
# Kubernetes master ports
sudo firewall-cmd --permanent --add-port=6443/tcp # API server
sudo firewall-cmd --permanent --add-port=2379-2380/tcp # etcd
sudo firewall-cmd --permanent --add-port=10250/tcp # kubelet
sudo firewall-cmd --permanent --add-port=10259/tcp # kube-scheduler
sudo firewall-cmd --permanent --add-port=10257/tcp # kube-controller-manager
# Worker node ports
sudo firewall-cmd --permanent --add-port=10250/tcp # kubelet
sudo firewall-cmd --permanent --add-port=30000-32767/tcp # NodePort services
# Container networking
sudo firewall-cmd --permanent --add-masquerade
# Reload firewall
sudo firewall-cmd --reload
echo "โ
Firewall configured for Kubernetes"
}
# Run all preparation steps
check_prerequisites
if [ $? -eq 0 ]; then
install_kubernetes
configure_firewall
echo ""
echo "๐ Kubernetes preparation complete!"
echo "๐ Next steps:"
echo " 1. Run this script on all nodes"
echo " 2. Initialize cluster on master: kubeadm init"
echo " 3. Join worker nodes to cluster"
else
echo "โ Prerequisites not met. Please fix issues and try again."
exit 1
fi
# Make the script executable and run it
sudo chmod +x /usr/local/bin/k8s-prep.sh
sudo /usr/local/bin/k8s-prep.sh
# Run this script on ALL nodes in your cluster!
Step 2: Initializing the Kubernetes Cluster
Time to birth your Kubernetes cluster! We'll create the control-plane (master) node and then add worker nodes.
# Initialize the Kubernetes cluster (run on master node only)
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --apiserver-advertise-address=$(hostname -I | awk '{print $1}')
# Replace with your actual master node IP if needed
# The init command will output a join command - SAVE IT!
# It looks like: kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
Configure kubectl for regular user:
# Set up kubectl for regular user (run as non-root user)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Test cluster access
kubectl get nodes
# Should show your master node in "NotReady" status (normal until CNI is installed)
# Check cluster info
kubectl cluster-info
# Shows cluster endpoints and services
Install Calico CNI (Container Network Interface):
# Install Calico for pod networking
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/tigera-operator.yaml
# Download and apply Calico custom resources
curl https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/custom-resources.yaml -O
# The default Calico CIDR (192.168.0.0/16) already matches the --pod-network-cidr we used,
# so no change is needed. If you chose a different pod CIDR, edit custom-resources.yaml to match:
# sed -i 's|192.168.0.0/16|<your-pod-cidr>|' custom-resources.yaml
# Apply Calico configuration
kubectl create -f custom-resources.yaml
# Wait for Calico pods to be ready
kubectl get pods -n calico-system --watch
# Press Ctrl+C when all pods are Running
Create cluster management script:
# Create cluster management helper
sudo nano /usr/local/bin/k8s-cluster.sh
#!/bin/bash
show_cluster_status() {
echo "๐ข KUBERNETES CLUSTER STATUS"
echo "============================="
echo ""
echo "๐ Nodes:"
kubectl get nodes -o wide
echo ""
echo "๐ฆ System Pods:"
kubectl get pods -n kube-system
echo ""
echo "๐ Cluster Info:"
kubectl cluster-info
echo ""
echo "๐ Resource Usage:"
kubectl top nodes 2>/dev/null || echo "Metrics server not installed"
}
join_worker_node() {
echo "๐ WORKER NODE JOIN COMMAND"
echo "==========================="
echo ""
# Generate new join token if needed
TOKEN=$(kubeadm token list | awk 'NR==2{print $1}')
if [ -z "$TOKEN" ]; then
echo "Generating new join token..."
TOKEN=$(kubeadm token generate)
kubeadm token create $TOKEN --print-join-command --ttl=24h
else
echo "Current join command:"
kubeadm token create --print-join-command --ttl=24h
fi
}
get_cluster_config() {
echo "โ๏ธ CLUSTER CONFIGURATION"
echo "========================"
echo ""
echo "Cluster configuration for external access:"
echo "Copy this to ~/.kube/config on your local machine:"
echo ""
kubectl config view --flatten --minify
}
install_dashboard() {
echo "๐ INSTALLING KUBERNETES DASHBOARD"
echo "==================================="
# Install metrics server first
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
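# Note: on kubeadm clusters the metrics server frequently fails kubelet TLS verification;
# if "kubectl top" never returns data, add the --kubelet-insecure-tls arg to the
# metrics-server deployment (kubectl -n kube-system edit deployment metrics-server).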
# Install Kubernetes dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.7.0/aio/deploy/recommended.yaml
# Create dashboard admin user
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin-user
namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: admin-user
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: admin-user
namespace: kubernetes-dashboard
EOF
echo "โ
Dashboard installed!"
echo ""
echo "To access dashboard:"
echo "1. Run: kubectl proxy"
echo "2. Visit: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/"
echo "3. Get token: kubectl -n kubernetes-dashboard create token admin-user"
}
case "$1" in
status)
show_cluster_status
;;
join)
join_worker_node
;;
config)
get_cluster_config
;;
dashboard)
install_dashboard
;;
*)
echo "Usage: $0 {status|join|config|dashboard}"
echo " status - Show cluster status"
echo " join - Generate worker node join command"
echo " config - Show cluster config for external access"
echo " dashboard - Install Kubernetes dashboard"
;;
esac
# Make executable
sudo chmod +x /usr/local/bin/k8s-cluster.sh
# Check cluster status (run as the user whose ~/.kube/config was set up earlier, not via sudo)
/usr/local/bin/k8s-cluster.sh status
Step 3: Adding Worker Nodes
Let's expand your cluster empire! Worker nodes will run your application workloads.
On each worker node, run the join command from the master:
# Get the join command from master node
# Run this on master node:
sudo /usr/local/bin/k8s-cluster.sh join
# Copy the output and run on each worker node
# Example (your command will be different):
sudo kubeadm join 192.168.1.100:6443 --token abc123.xyz789 \
--discovery-token-ca-cert-hash sha256:1234567890abcdef...
# Verify nodes joined (run on master)
kubectl get nodes
# Should show all nodes in Ready status
Label worker nodes for better organization:
# Label worker nodes (run on master)
kubectl label node worker-node-1 node-role.kubernetes.io/worker=worker
kubectl label node worker-node-2 node-role.kubernetes.io/worker=worker
# Add custom labels for workload placement
kubectl label node worker-node-1 workload-type=web
kubectl label node worker-node-2 workload-type=database
# Verify labels
kubectl get nodes --show-labels
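Those custom labels only pay off when workloads reference them. As a sketch (the workload-type=web value matches the label applied above; the pod name is arbitrary), a pod or deployment can pin itself to matching nodes with a nodeSelector:
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: web-placement-test
spec:
  nodeSelector:
    workload-type: web        # schedules only onto nodes carrying this label
  containers:
  - name: nginx
    image: nginx:alpine
EOF
# Confirm the pod landed on the web-labeled node, then clean up
kubectl get pod web-placement-test -o wide
kubectl delete pod web-placement-test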
Create node management script:
# Create node management script
sudo nano /usr/local/bin/k8s-nodes.sh
#!/bin/bash
show_node_details() {
echo "๐ฅ๏ธ NODE DETAILS"
echo "==============="
for node in $(kubectl get nodes -o name | cut -d/ -f2); do
echo ""
echo "๐ Node: $node"
echo "================================="
# Node info
kubectl describe node $node | grep -E "(Name:|Role|Internal IP|Kernel Version|Operating System|Container Runtime|Allocatable:|Allocated resources:)" -A 20
echo ""
echo "๐ Resource Usage:"
kubectl top node $node 2>/dev/null || echo "Metrics not available"
echo ""
echo "๐ฆ Pods on this node:"
kubectl get pods --all-namespaces --field-selector spec.nodeName=$node -o wide
done
}
drain_node() {
local node_name="$1"
if [ -z "$node_name" ]; then
echo "Usage: $0 drain <node-name>"
return 1
fi
echo "๐ฐ DRAINING NODE: $node_name"
echo "============================="
# Cordon the node (mark as unschedulable)
kubectl cordon $node_name
echo "โ
Node cordoned"
# Drain the node (evict all pods)
kubectl drain $node_name --ignore-daemonsets --delete-emptydir-data --force
echo "โ
Node drained"
echo ""
echo "Node $node_name is now ready for maintenance"
echo "To bring it back online: kubectl uncordon $node_name"
}
uncordon_node() {
local node_name="$1"
if [ -z "$node_name" ]; then
echo "Usage: $0 uncordon <node-name>"
return 1
fi
echo "๐ UNCORDONING NODE: $node_name"
echo "================================"
kubectl uncordon $node_name
echo "โ
Node $node_name is now schedulable again"
}
check_node_health() {
echo "๐ฅ NODE HEALTH CHECK"
echo "===================="
kubectl get nodes -o custom-columns='NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,REASON:.status.conditions[?(@.type=="Ready")].reason'
echo ""
echo "๐ Node conditions:"
for node in $(kubectl get nodes -o name | cut -d/ -f2); do
echo ""
echo "Node: $node"
kubectl get node $node -o jsonpath='{.status.conditions[*].type}{"\n"}{.status.conditions[*].status}{"\n"}' | paste - -
done
}
case "$1" in
details)
show_node_details
;;
drain)
drain_node "$2"
;;
uncordon)
uncordon_node "$2"
;;
health)
check_node_health
;;
*)
echo "Usage: $0 {details|drain|uncordon|health} [node-name]"
echo " details - Show detailed node information"
echo " drain - Drain a node for maintenance"
echo " uncordon - Make a node schedulable again"
echo " health - Check node health status"
;;
esac
# Make executable
sudo chmod +x /usr/local/bin/k8s-nodes.sh
# Check node health (run as the user with kubectl access, not via sudo)
/usr/local/bin/k8s-nodes.sh health
Step 4: Deploying Applications
Time to deploy your first applications! We'll create deployments, services, and ingress controllers.
Create a sample web application:
# Create namespace for our application
kubectl create namespace demo-app
# Create deployment YAML
cat << EOF > web-app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: web-app
namespace: demo-app
labels:
app: web-app
spec:
replicas: 3
selector:
matchLabels:
app: web-app
template:
metadata:
labels:
app: web-app
spec:
containers:
- name: nginx
image: nginx:1.21
ports:
- containerPort: 80
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
volumeMounts:
- name: html-volume
mountPath: /usr/share/nginx/html
volumes:
- name: html-volume
configMap:
name: web-app-html
---
apiVersion: v1
kind: ConfigMap
metadata:
name: web-app-html
namespace: demo-app
data:
index.html: |
<!DOCTYPE html>
<html>
<head>
<title>Kubernetes Demo App</title>
<style>
body { font-family: Arial; text-align: center; margin-top: 50px; }
.container { max-width: 600px; margin: 0 auto; }
.success { color: #28a745; }
</style>
</head>
<body>
<div class="container">
<h1 class="success">๐ Kubernetes Deployment Successful!</h1>
<p>This page is served by a Kubernetes pod</p>
<p>Pod hostname: <span id="hostname"></span></p>
<script>
fetch('/api/hostname')
.then(r => r.text())
.then(h => document.getElementById('hostname').innerText = h)
.catch(() => document.getElementById('hostname').innerText = 'Unknown');
</script>
</div>
</body>
</html>
---
apiVersion: v1
kind: Service
metadata:
name: web-app-service
namespace: demo-app
spec:
selector:
app: web-app
ports:
- protocol: TCP
port: 80
targetPort: 80
nodePort: 30080
type: NodePort
EOF
# Deploy the application
kubectl apply -f web-app-deployment.yaml
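To make sure the rollout actually succeeded, a few quick checks (the NodePort 30080 comes from the Service above; substitute any node's IP for the placeholder):
# Watch the deployment finish rolling out
kubectl rollout status deployment/web-app -n demo-app
# Confirm the replicas, service, and ConfigMap exist
kubectl get pods,svc,configmap -n demo-app
# Fetch the demo page through the NodePort (any node IP works)
curl http://<node-ip>:30080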
Create a database deployment:
# Create database deployment
cat << EOF > database-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: mysql-db
namespace: demo-app
spec:
replicas: 1
selector:
matchLabels:
app: mysql-db
template:
metadata:
labels:
app: mysql-db
spec:
containers:
- name: mysql
image: mysql:8.0
env:
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-secret
key: password
- name: MYSQL_DATABASE
value: "demoapp"
ports:
- containerPort: 3306
volumeMounts:
- name: mysql-storage
mountPath: /var/lib/mysql
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "1Gi"
cpu: "1000m"
volumes:
- name: mysql-storage
persistentVolumeClaim:
claimName: mysql-pvc
---
apiVersion: v1
kind: Secret
metadata:
name: mysql-secret
namespace: demo-app
type: Opaque
data:
password: UGFzc3dvcmQxMjMh # Password123! base64 encoded
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-pvc
namespace: demo-app
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
name: mysql-service
namespace: demo-app
spec:
selector:
app: mysql-db
ports:
- protocol: TCP
port: 3306
targetPort: 3306
type: ClusterIP
EOF
# Deploy the database
kubectl apply -f database-deployment.yaml
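One caveat: the PersistentVolumeClaim above only binds if the cluster has a default StorageClass, which a bare kubeadm cluster does not. If the MySQL pod sits in Pending, one option is a lightweight dynamic provisioner; this sketch uses the community local-path provisioner (the pinned version is illustrative; check the project for the current release, or use whatever storage you actually run):
# Install a simple dynamic provisioner and make it the default StorageClass
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.24/deploy/local-path-storage.yaml
kubectl patch storageclass local-path -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
# Re-check the claim; it should move from Pending to Bound once the pod schedules
kubectl get pvc -n demo-app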
Create application management script:
# Create application management script
sudo nano /usr/local/bin/k8s-apps.sh
#!/bin/bash
show_applications() {
echo "๐ฆ KUBERNETES APPLICATIONS"
echo "=========================="
echo ""
echo "๐ Deployments across all namespaces:"
kubectl get deployments --all-namespaces -o wide
echo ""
echo "๐ Services across all namespaces:"
kubectl get services --all-namespaces -o wide
echo ""
echo "๐ Pods across all namespaces:"
kubectl get pods --all-namespaces -o wide
echo ""
echo "๐พ Persistent Volume Claims:"
kubectl get pvc --all-namespaces
}
scale_deployment() {
local deployment="$1"
local namespace="$2"
local replicas="$3"
if [ -z "$deployment" ] || [ -z "$namespace" ] || [ -z "$replicas" ]; then
echo "Usage: $0 scale <deployment> <namespace> <replicas>"
return 1
fi
echo "๐ SCALING DEPLOYMENT"
echo "===================="
echo "Deployment: $deployment"
echo "Namespace: $namespace"
echo "Target replicas: $replicas"
echo ""
kubectl scale deployment $deployment -n $namespace --replicas=$replicas
echo "โ
Scaling initiated"
echo "๐ Watching rollout status..."
kubectl rollout status deployment/$deployment -n $namespace
}
rolling_update() {
local deployment="$1"
local namespace="$2"
local image="$3"
if [ -z "$deployment" ] || [ -z "$namespace" ] || [ -z "$image" ]; then
echo "Usage: $0 update <deployment> <namespace> <new-image>"
return 1
fi
echo "๐ ROLLING UPDATE"
echo "================="
echo "Deployment: $deployment"
echo "Namespace: $namespace"
echo "New image: $image"
echo ""
kubectl set image deployment/$deployment -n $namespace *=$image
echo "โ
Update initiated"
echo "๐ Watching rollout status..."
kubectl rollout status deployment/$deployment -n $namespace
}
get_app_logs() {
local app_label="$1"
local namespace="$2"
if [ -z "$app_label" ] || [ -z "$namespace" ]; then
echo "Usage: $0 logs <app-label> <namespace>"
return 1
fi
echo "๐ APPLICATION LOGS"
echo "=================="
echo "App: $app_label"
echo "Namespace: $namespace"
echo ""
# Get logs from all pods with the app label
kubectl logs -l app=$app_label -n $namespace --tail=50
}
troubleshoot_app() {
local namespace="$1"
if [ -z "$namespace" ]; then
echo "Usage: $0 troubleshoot <namespace>"
return 1
fi
echo "๐ TROUBLESHOOTING NAMESPACE: $namespace"
echo "========================================"
echo ""
echo "๐ Pod Status:"
kubectl get pods -n $namespace -o wide
echo ""
echo "๐ Events:"
kubectl get events -n $namespace --sort-by='.lastTimestamp'
echo ""
echo "๐จ Failed Pods:"
kubectl get pods -n $namespace --field-selector=status.phase=Failed
echo ""
echo "โณ Pending Pods:"
kubectl get pods -n $namespace --field-selector=status.phase=Pending
}
case "$1" in
list)
show_applications
;;
scale)
scale_deployment "$2" "$3" "$4"
;;
update)
rolling_update "$2" "$3" "$4"
;;
logs)
get_app_logs "$2" "$3"
;;
troubleshoot)
troubleshoot_app "$2"
;;
*)
echo "Usage: $0 {list|scale|update|logs|troubleshoot}"
echo " list - Show all applications"
echo " scale - Scale deployment: scale <deployment> <namespace> <replicas>"
echo " update - Rolling update: update <deployment> <namespace> <image>"
echo " logs - Get app logs: logs <app-label> <namespace>"
echo " troubleshoot - Troubleshoot namespace: troubleshoot <namespace>"
;;
esac
# Make executable and test
sudo chmod +x /usr/local/bin/k8s-apps.sh
# View all applications (run as the user with kubectl access, not via sudo)
/usr/local/bin/k8s-apps.sh list
# Test scaling
/usr/local/bin/k8s-apps.sh scale web-app demo-app 5
Step 5: Advanced Kubernetes Features
Let's explore advanced Kubernetes capabilities! We'll set up ingress, monitoring, and automation.
Install and configure Ingress controller:
# Install NGINX Ingress Controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml
# Wait for ingress controller to be ready
kubectl wait --namespace ingress-nginx \
--for=condition=ready pod \
--selector=app.kubernetes.io/component=controller \
--timeout=120s
# Create ingress for our demo app
cat << EOF > web-app-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: web-app-ingress
namespace: demo-app
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
ingressClassName: nginx
rules:
- host: demo.local
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: web-app-service
port:
number: 80
EOF
kubectl apply -f web-app-ingress.yaml
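Because the Ingress rule matches the host demo.local, you need to send that host name when testing. A minimal check (the HTTP NodePort and node IP are placeholders you read from the service listing):
# Find the NodePort the ingress controller exposes for HTTP
kubectl get svc -n ingress-nginx ingress-nginx-controller
# Request the app through the ingress, supplying the expected Host header
curl -H "Host: demo.local" http://<node-ip>:<http-nodeport>/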
Set up Helm package manager:
# Install Helm
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Add popular Helm repositories
helm repo add stable https://charts.helm.sh/stable
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Install Prometheus and Grafana for monitoring
kubectl create namespace monitoring
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set grafana.adminPassword=admin123 \
--set grafana.service.type=NodePort \
--set grafana.service.nodePort=30300
# Wait for monitoring stack to be ready
kubectl wait --namespace monitoring \
--for=condition=ready pod \
--selector=app.kubernetes.io/name=grafana \
--timeout=300s
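A quick sanity check that the chart deployed cleanly (release and namespace names match the helm install above):
# List Helm releases in the monitoring namespace
helm list -n monitoring
# All pods should reach Running/Ready; Grafana is exposed on NodePort 30300
kubectl get pods,svc -n monitoring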
Create Kubernetes automation script:
# Create automation and monitoring script
sudo nano /usr/local/bin/k8s-advanced.sh
#!/bin/bash
install_monitoring() {
echo "๐ INSTALLING MONITORING STACK"
echo "==============================="
# Create monitoring namespace
kubectl create namespace monitoring 2>/dev/null || true
# Install Prometheus operator
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install prometheus-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set grafana.adminPassword=admin123 \
--set grafana.service.type=NodePort \
--set grafana.service.nodePort=30300 \
--set prometheus.service.type=NodePort \
--set prometheus.service.nodePort=30900
echo "โ
Monitoring stack installed"
echo "๐ Grafana: http://node-ip:30300 (admin/admin123)"
echo "๐ Prometheus: http://node-ip:30900"
}
setup_autoscaling() {
echo "๐ SETTING UP HORIZONTAL POD AUTOSCALER"
echo "========================================"
# Create HPA for web app
cat << EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: web-app-hpa
namespace: demo-app
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: web-app
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
EOF
echo "โ
Horizontal Pod Autoscaler configured"
echo "๐ Web app will scale between 2-10 replicas based on CPU/memory usage"
}
backup_cluster() {
echo "๐พ BACKING UP CLUSTER CONFIGURATION"
echo "==================================="
BACKUP_DIR="/storage/k8s-backups/$(date +%Y%m%d_%H%M%S)"
mkdir -p $BACKUP_DIR
# Backup cluster resources
echo "Backing up cluster resources..."
# Get all resources
kubectl get all --all-namespaces -o yaml > $BACKUP_DIR/all-resources.yaml
# Backup specific resource types
kubectl get configmaps --all-namespaces -o yaml > $BACKUP_DIR/configmaps.yaml
kubectl get secrets --all-namespaces -o yaml > $BACKUP_DIR/secrets.yaml
kubectl get persistentvolumes -o yaml > $BACKUP_DIR/persistent-volumes.yaml
kubectl get persistentvolumeclaims --all-namespaces -o yaml > $BACKUP_DIR/persistent-volume-claims.yaml
# Backup etcd (if accessible)
if command -v etcdctl &> /dev/null; then
ETCDCTL_API=3 etcdctl snapshot save $BACKUP_DIR/etcd-snapshot.db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key
fi
# Compress backup
tar -czf $BACKUP_DIR.tar.gz -C $(dirname $BACKUP_DIR) $(basename $BACKUP_DIR)
rm -rf $BACKUP_DIR
echo "โ
Backup completed: $BACKUP_DIR.tar.gz"
}
stress_test() {
echo "๐งช RUNNING CLUSTER STRESS TEST"
echo "=============================="
# Deploy stress test application
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
name: stress-test
namespace: default
spec:
replicas: 5
selector:
matchLabels:
app: stress-test
template:
metadata:
labels:
app: stress-test
spec:
containers:
- name: stress
image: busybox
command: ["sh", "-c", "while true; do echo 'Stress test running'; sleep 1; done"]
resources:
requests:
cpu: "100m"
memory: "128Mi"
limits:
cpu: "200m"
memory: "256Mi"
EOF
echo "โ
Stress test deployed"
echo "๐ Monitor with: kubectl top pods"
echo "๐งน Cleanup with: kubectl delete deployment stress-test"
}
show_cluster_metrics() {
echo "๐ CLUSTER METRICS"
echo "=================="
echo ""
echo "๐ฅ๏ธ Node Resource Usage:"
kubectl top nodes
echo ""
echo "๐ฆ Pod Resource Usage (Top 10):"
kubectl top pods --all-namespaces | head -11
echo ""
echo "๐ Cluster Resource Summary:"
kubectl describe nodes | grep -E "(Name:|Allocatable:|Allocated resources:)" -A 5
}
case "$1" in
monitoring)
install_monitoring
;;
autoscale)
setup_autoscaling
;;
backup)
backup_cluster
;;
stress)
stress_test
;;
metrics)
show_cluster_metrics
;;
*)
echo "Usage: $0 {monitoring|autoscale|backup|stress|metrics}"
echo " monitoring - Install Prometheus and Grafana"
echo " autoscale - Set up horizontal pod autoscaling"
echo " backup - Backup cluster configuration"
echo " stress - Run cluster stress test"
echo " metrics - Show cluster resource metrics"
;;
esac
# Make executable
sudo chmod +x /usr/local/bin/k8s-advanced.sh
# Install monitoring (run as the user with kubectl and helm access, not via sudo)
/usr/local/bin/k8s-advanced.sh monitoring
# Set up autoscaling
/usr/local/bin/k8s-advanced.sh autoscale
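To watch the autoscaler react, you can generate some artificial load against the web app (a throwaway sketch; the service DNS name matches the demo-app Service created earlier, and how far replicas climb depends on your hardware):
# Run a temporary load generator that hammers the web app service
kubectl run load-generator --image=busybox -n demo-app --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://web-app-service.demo-app.svc.cluster.local; done"
# Watch the HPA metrics and replica count, then clean up
kubectl get hpa -n demo-app -w
kubectl delete pod load-generator -n demo-app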
Quick Examples
Let's see your Kubernetes cluster in action with practical examples!
Example 1: Microservices Application
# Create a complete microservices application
cat << EOF > microservices-app.yaml
# Frontend Service
apiVersion: apps/v1
kind: Deployment
metadata:
name: frontend
namespace: demo-app
spec:
replicas: 3
selector:
matchLabels:
app: frontend
template:
metadata:
labels:
app: frontend
spec:
containers:
- name: nginx
image: nginx:alpine
ports:
- containerPort: 80
env:
- name: BACKEND_URL
value: "http://backend-service:8080"
---
apiVersion: v1
kind: Service
metadata:
name: frontend-service
namespace: demo-app
spec:
selector:
app: frontend
ports:
- port: 80
targetPort: 80
type: LoadBalancer
---
# Backend API Service
apiVersion: apps/v1
kind: Deployment
metadata:
name: backend
namespace: demo-app
spec:
replicas: 2
selector:
matchLabels:
app: backend
template:
metadata:
labels:
app: backend
spec:
containers:
- name: api
image: node:16-alpine
command: ["sh", "-c", "echo 'console.log(\"API Server running on port 8080\"); require(\"http\").createServer((req, res) => { res.writeHead(200, {\"Content-Type\": \"application/json\"}); res.end(JSON.stringify({status: \"OK\", timestamp: new Date().toISOString(), pod: process.env.HOSTNAME})); }).listen(8080);' > server.js && node server.js"]
ports:
- containerPort: 8080
env:
- name: DATABASE_URL
value: "mysql://mysql-service:3306/demoapp"
---
apiVersion: v1
kind: Service
metadata:
name: backend-service
namespace: demo-app
spec:
selector:
app: backend
ports:
- port: 8080
targetPort: 8080
type: ClusterIP
EOF
kubectl apply -f microservices-app.yaml
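To confirm the frontend can actually reach the backend over the cluster network, exec into a frontend pod and call the backend service (names match the manifests above):
# Pick one frontend pod and call the backend API from inside it
FRONTEND_POD=$(kubectl get pods -n demo-app -l app=frontend -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n demo-app $FRONTEND_POD -- wget -qO- http://backend-service:8080
# Expect a JSON response containing status, timestamp, and the backend pod name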
Example 2: Job and CronJob Examples
# Create batch processing jobs
cat << EOF > batch-jobs.yaml
# One-time Job
apiVersion: batch/v1
kind: Job
metadata:
name: data-processor
namespace: demo-app
spec:
completions: 3
parallelism: 2
template:
spec:
containers:
- name: processor
image: busybox
command: ["sh", "-c", "echo 'Processing data batch'; sleep 30; echo 'Batch complete'"]
restartPolicy: Never
backoffLimit: 3
---
# Scheduled CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
name: daily-backup
namespace: demo-app
spec:
schedule: "0 2 * * *" # Daily at 2 AM
jobTemplate:
spec:
template:
spec:
containers:
- name: backup
image: busybox
command: ["sh", "-c", "echo 'Running daily backup at $(date)'; sleep 10; echo 'Backup complete'"]
restartPolicy: OnFailure
EOF
kubectl apply -f batch-jobs.yaml
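A few commands to confirm the batch workloads behave as expected (the --from trick creates an ad-hoc run of the CronJob instead of waiting for 2 AM):
# Watch the one-time Job run its 3 completions (2 at a time)
kubectl get jobs -n demo-app -w
kubectl logs -n demo-app -l job-name=data-processor
# Trigger the CronJob manually to test it immediately
kubectl create job manual-backup-test --from=cronjob/daily-backup -n demo-app
kubectl logs -n demo-app -l job-name=manual-backup-test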
Example 3: Resource Monitoring Script
# Create resource monitoring script
sudo nano /usr/local/bin/k8s-monitor-resources.sh
#!/bin/bash
monitor_continuously() {
echo "๐ CONTINUOUS RESOURCE MONITORING"
echo "================================="
echo "Press Ctrl+C to stop"
echo ""
while true; do
clear
echo "๐ $(date)"
echo "===================="
echo ""
echo "๐ฅ๏ธ Node Resources:"
kubectl top nodes
echo ""
echo "๐ฆ Top Pod Resource Usage:"
kubectl top pods --all-namespaces --sort-by=cpu | head -10
echo ""
echo "๐ Pod Status Summary:"
kubectl get pods --all-namespaces | awk 'NR>1{print $4}' | sort | uniq -c
echo ""
echo "๐ Service Status:"
kubectl get services --all-namespaces | grep -v "ClusterIP.*<none>" | wc -l
echo "Total exposed services: $(kubectl get services --all-namespaces | grep -v "ClusterIP.*<none>" | wc -l)"
sleep 10
done
}
resource_alerts() {
echo "๐จ RESOURCE ALERTS"
echo "=================="
# Check for high CPU usage pods
echo "High CPU usage pods (>80%):"
kubectl top pods --all-namespaces | awk 'NR>1 && $3+0 > 80 {print $1, $2, $3}'
echo ""
echo "High memory usage pods (>1Gi):"
kubectl top pods --all-namespaces | awk 'NR>1 && $4 ~ /[0-9]+Gi/ {print $1, $2, $4}'
echo ""
echo "Failed or pending pods:"
kubectl get pods --all-namespaces | grep -E "(Failed|Pending|Error|CrashLoopBackOff)"
echo ""
echo "Nodes under pressure:"
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.conditions[?(@.type=="MemoryPressure")].status}{" "}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}' | grep True
}
case "$1" in
continuous)
monitor_continuously
;;
alerts)
resource_alerts
;;
*)
echo "Usage: $0 {continuous|alerts}"
echo " continuous - Monitor resources continuously"
echo " alerts - Check for resource alerts"
;;
esac
# Make executable
sudo chmod +x /usr/local/bin/k8s-monitor-resources.sh
# Check for resource alerts (run as the user with kubectl access, not via sudo)
/usr/local/bin/k8s-monitor-resources.sh alerts
Fix Common Problems
Don't worry when Kubernetes issues arise; here are solutions to common container orchestration problems!
Problem 1: Pods Stuck in Pending State
Symptoms: Pods remain in "Pending" status and never start running
# Check why pods are pending
kubectl describe pod <pod-name> -n <namespace>
# Common causes and solutions:
# 1. Insufficient resources
kubectl describe nodes | grep -A5 "Allocated resources"
# 2. Node selector issues
kubectl get nodes --show-labels
# 3. Persistent volume issues
kubectl get pv,pvc --all-namespaces
# 4. Add more resources or remove resource constraints
kubectl patch deployment <deployment-name> -p '{"spec":{"template":{"spec":{"containers":[{"name":"container-name","resources":{"requests":{"cpu":"100m","memory":"128Mi"}}}]}}}'
Problem 2: Service Not Accessible
Symptoms: Cannot reach services, connection refused errors
# Check service configuration
kubectl get services -o wide
# Verify endpoints
kubectl get endpoints
# Check if pods are ready
kubectl get pods -o wide
# Test service connectivity from within cluster
kubectl run debug --image=busybox --rm -it -- /bin/sh
# Inside the pod: wget -qO- http://service-name.namespace.svc.cluster.local
# Check network policies
kubectl get networkpolicies --all-namespaces
Problem 3: Node Not Ready
Symptoms: Nodes showing "NotReady" status
# Check node conditions
kubectl describe node <node-name>
# Common fixes:
# 1. Restart kubelet
sudo systemctl restart kubelet
# 2. Check container runtime
sudo systemctl status containerd
# 3. Check network connectivity
ping <master-node-ip>
# 4. Rejoin node to cluster
sudo kubeadm reset
# Then run the join command again
Problem 4: High Resource Usage
Symptoms: Cluster running slowly, resource exhaustion
# Identify resource-heavy workloads
kubectl top pods --all-namespaces --sort-by=memory
kubectl top pods --all-namespaces --sort-by=cpu
# Set resource limits
kubectl patch deployment <deployment> -p '{"spec":{"template":{"spec":{"containers":[{"name":"container","resources":{"limits":{"cpu":"500m","memory":"512Mi"}}}]}}}'
# Scale down deployments
kubectl scale deployment <deployment> --replicas=1
# Clean up unused resources
kubectl delete pods --field-selector=status.phase=Succeeded
kubectl delete pods --field-selector=status.phase=Failed
Simple Commands Summary
Here's your Kubernetes orchestration quick reference guide!
| Task | Command | Purpose |
|---|---|---|
| Cluster Status | kubectl cluster-info | Show cluster information |
| Get Nodes | kubectl get nodes | List all nodes |
| Get Pods | kubectl get pods --all-namespaces | List all pods |
| Describe Resource | kubectl describe <resource> <name> | Get detailed resource info |
| Apply Config | kubectl apply -f <file.yaml> | Deploy application |
| Scale Deployment | kubectl scale deployment <name> --replicas=<number> | Scale application |
| Get Logs | kubectl logs <pod-name> | View pod logs |
| Execute Command | kubectl exec -it <pod> -- /bin/bash | Access pod shell |
| Port Forward | kubectl port-forward <pod> <local-port>:<pod-port> | Forward ports |
| Delete Resource | kubectl delete <resource> <name> | Remove resource |
| Top Resources | kubectl top nodes / kubectl top pods | Show resource usage |
| Cluster Scripts | /usr/local/bin/k8s-cluster.sh status | Custom cluster management |
Tips for Success
Follow these expert strategies to master Kubernetes orchestration!
Cluster Architecture Best Practices
- Plan your cluster size: Start with 3 control-plane nodes for high availability
- Use proper resource limits: Prevent resource starvation with requests and limits
- Implement monitoring: Prometheus and Grafana are essential for production
- Regular backups: Back up etcd and application configurations regularly
Application Deployment Strategies
- Use declarative configuration: YAML files are better than imperative commands
- Implement health checks: Liveness and readiness probes prevent issues (see the sketch after this list)
- Rolling deployments: Update applications without downtime
- Environment separation: Use namespaces to isolate environments
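Here is a minimal sketch of what those liveness and readiness probes could look like for the demo web app's nginx container (paths and timings are illustrative; tune them to your application and add this under the container entry in the deployment spec):
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 10
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 5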
Security and Networking
- Network policies: Control traffic between pods and namespaces
- RBAC configuration: Implement role-based access control
- Secret management: Never put passwords in container images
- Pod security standards: Enforce security baselines across the cluster
Advanced Operations
- Horizontal pod autoscaling: Automatically scale based on metrics
- Cluster autoscaling: Add/remove nodes based on demand
- Service mesh: Consider Istio for advanced networking features
- GitOps workflows: Use tools like ArgoCD for automated deployments
What You Learned
Congratulations! You've mastered Kubernetes container orchestration on AlmaLinux! Here's your incredible achievement:
- Built a production-ready Kubernetes cluster from scratch with multiple nodes
- Mastered container deployment with pods, services, and ingress controllers
- Implemented high availability with replica sets and load balancing
- Set up advanced monitoring with Prometheus and Grafana dashboards
- Configured autoscaling for both horizontal pod and cluster scaling
- Created comprehensive management scripts for cluster operations
- Deployed complex applications with microservices architecture
- Implemented backup and recovery procedures for cluster resilience
- Mastered troubleshooting techniques for common Kubernetes issues
- Built monitoring and alerting systems for proactive cluster management
Why This Matters
Kubernetes expertise is pure gold in today's containerized world!
Every major company is moving to container orchestration for scalability, reliability, and efficiency. From Netflix running thousands of microservices to banks processing millions of transactions, Kubernetes powers the infrastructure that keeps the digital world running.
These skills open doors to the highest-paying DevOps, Cloud Engineer, and Site Reliability Engineer positions. Companies desperately need Kubernetes experts who can design, deploy, and manage container orchestration platforms that scale from startup to enterprise.
Remember, you've not just learned a technology; you've mastered the future of application deployment and management. Kubernetes is the foundation that enables everything from AI/ML workloads to global-scale web applications.
Keep orchestrating, keep scaling, and keep pushing the boundaries of what's possible with containers! Your expertise will power the next generation of cloud-native applications!