Rook Ceph Storage on AlmaLinux 9: Complete Guide
Ready to build rock-solid distributed storage? Today we'll deploy Rook Ceph on AlmaLinux 9, creating self-healing, self-scaling storage that keeps serving data even when hardware fails. Let's orchestrate some amazing storage!
Why is Rook Ceph Important?
Imagine storage that manages itself! That's Rook's superpower! Here's why it's revolutionary:
- Self-Healing Storage - Automatically recovers from failures!
- Multi-Protocol - Block, object, and file storage from one system
- Auto-Scaling - Grows with your needs automatically
- Data Protection - Multiple replicas ensure no data loss
- Kubernetes Native - Storage as Kubernetes resources
- Built-in Monitoring - Dashboard and Prometheus metrics
- Production Ready - CNCF graduated project
- No Manual Management - Operator handles everything
What You Need
Before we orchestrate storage magic, gather these:
- AlmaLinux 9 servers (3+ nodes, 8GB RAM each minimum)
- Kubernetes cluster 1.28+ (K3s, K8s, or any flavor)
- Raw disks or partitions (no filesystem)
- kubectl configured and working
- 10GB+ free disk per node for OSDs
- Network connectivity between nodes
- Root or sudo access
- Readiness for a storage revolution!
Step 1: Prepare AlmaLinux Nodes
Let's prepare your systems for Ceph!
Install Prerequisites
# Update all nodes
sudo dnf update -y # Keep everything current
# Install required packages
sudo dnf install -y \
lvm2 \
python3 \
python3-pip \
kernel-modules-extra
# Load RBD kernel module
sudo modprobe rbd
echo "rbd" | sudo tee /etc/modules-load.d/rbd.conf
# Verify kernel modules
lsmod | grep rbd # Should show rbd module
# Check available disks
lsblk # Identify raw disks for Ceph
sudo fdisk -l # Detailed disk information
# Ensure disks are clean (WARNING: destroys data!)
# Replace /dev/sdb with your disk
sudo dd if=/dev/zero of=/dev/sdb bs=1M count=100 status=progress
sudo sgdisk --zap-all /dev/sdb
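If the disk was used before, it may also carry leftover filesystem or RAID signatures that Rook will skip during OSD provisioning. An extra cleanup pass helps (again assuming /dev/sdb is the target disk):
# Remove any remaining filesystem/RAID signatures
sudo wipefs --all /dev/sdb
# Re-read the partition table so the kernel sees a clean disk
sudo partprobe /dev/sdb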
Prepare Kubernetes Cluster
# Label nodes for Ceph roles
kubectl label nodes node1 node2 node3 node-role.rook-ceph/cluster=true
# Create rook-ceph namespace
kubectl create namespace rook-ceph
# Verify nodes are ready
kubectl get nodes --show-labels
kubectl top nodes # Check resources
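The cluster.yaml we create in Step 3 uses useAllNodes: true, so the label above is not required for scheduling. If you want Ceph pods confined to the labeled nodes, a placement block roughly like this (a sketch using the label we just applied) can be added under the CephCluster spec:
placement:
  all:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-role.rook-ceph/cluster
            operator: In
            values: ["true"]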
Step 2: Deploy Rook Operator
Time to install the Rook operator!
Clone Rook Repository
# Clone specific version
git clone --single-branch --branch v1.14.0 https://github.com/rook/rook.git
cd rook/deploy/examples
# Or download manifests directly
ROOK_VERSION="v1.14.0"
wget https://raw.githubusercontent.com/rook/rook/${ROOK_VERSION}/deploy/examples/crds.yaml
wget https://raw.githubusercontent.com/rook/rook/${ROOK_VERSION}/deploy/examples/common.yaml
wget https://raw.githubusercontent.com/rook/rook/${ROOK_VERSION}/deploy/examples/operator.yaml
Deploy Rook Operator
# Deploy CRDs and common resources
kubectl create -f crds.yaml
kubectl create -f common.yaml
# (Optional) Customize the operator deployment; merge these settings into operator.yaml if you want them applied
cat <<EOF > operator-custom.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-operator
  namespace: rook-ceph
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rook-ceph-operator
  template:
    metadata:
      labels:
        app: rook-ceph-operator
    spec:
      containers:
      - name: rook-ceph-operator
        image: rook/ceph:v1.14.0
        env:
        # Enable discovery daemon
        - name: ROOK_ENABLE_DISCOVERY_DAEMON
          value: "true"
        # Set log level
        - name: ROOK_LOG_LEVEL
          value: "INFO"
        # Resource limits
        resources:
          limits:
            memory: 512Mi
            cpu: 500m
          requests:
            memory: 256Mi
            cpu: 100m
EOF
# Deploy the operator
kubectl create -f operator.yaml
# Wait for operator to be ready
kubectl -n rook-ceph rollout status deployment/rook-ceph-operator
kubectl -n rook-ceph get pods # Should be Running
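Before creating a cluster, it is worth a quick sanity check that the CRDs registered and the operator log looks clean:
# Confirm the Ceph CRDs are installed
kubectl get crd | grep ceph.rook.io
# Skim the operator log for errors
kubectl -n rook-ceph logs deploy/rook-ceph-operator --tail=20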
Step 3: Create Ceph Cluster
Let's deploy the Ceph storage cluster!
Create Cluster Configuration
# Create production cluster config
cat <<EOF > cluster.yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v18.2.2 # Ceph Reef release
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  mon:
    count: 3 # Production should have 3 monitors
    allowMultiplePerNode: false
  mgr:
    count: 2 # Active and standby
    modules:
    - name: rook
      enabled: true
  dashboard:
    enabled: true
    ssl: true
  network:
    provider: host # Use host network for performance
  crashCollector:
    disable: false
  cleanupPolicy:
    confirmation: ""
    sanitizeDisks:
      method: quick
      dataSource: zero
      iteration: 1
  storage:
    useAllNodes: true
    useAllDevices: true
    # deviceFilter: "^sd[b-z]" # Use specific disks
    config:
      osdsPerDevice: "1"
      storeType: bluestore
  monitoring:
    enabled: true
    metricsDisabled: false
  resources:
    mgr:
      limits:
        memory: "1Gi"
      requests:
        cpu: "500m"
        memory: "512Mi"
    mon:
      limits:
        memory: "2Gi"
      requests:
        cpu: "1000m"
        memory: "1Gi"
    osd:
      limits:
        memory: "4Gi"
      requests:
        cpu: "1000m"
        memory: "2Gi"
EOF
# Deploy the cluster
kubectl create -f cluster.yaml
# Monitor cluster creation (this takes 5-10 minutes)
watch kubectl -n rook-ceph get cephcluster
kubectl -n rook-ceph get pods -w # Watch pods being created
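While you wait, you can watch each component come online; the osd-prepare jobs run once per node before the OSD daemons themselves appear:
# Monitors, OSD prepare jobs, then OSDs
kubectl -n rook-ceph get pods -l app=rook-ceph-mon
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare
kubectl -n rook-ceph get pods -l app=rook-ceph-osd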
Verify Cluster Health
# Deploy toolbox for Ceph commands
# (download it first if you fetched manifests with wget instead of cloning the repo)
wget https://raw.githubusercontent.com/rook/rook/${ROOK_VERSION}/deploy/examples/toolbox.yaml
kubectl create -f toolbox.yaml
# Wait for toolbox
kubectl -n rook-ceph rollout status deployment/rook-ceph-tools
# Check cluster status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph df
Step 4: Configure Storage Classes
Let's create storage for applications!
Create Block Storage
# Create RBD storage pool
cat <<EOF | kubectl apply -f -
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3 # Number of replicas
    requireSafeReplicaSize: true
EOF
# Create StorageClass
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
allowVolumeExpansion: true
reclaimPolicy: Delete
EOF
# Test with a PVC
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 5Gi
EOF
# Check PVC is bound
kubectl get pvc test-pvc # Should be Bound
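To confirm the volume is actually writable, you can mount the claim from a throwaway pod (the pod and file names here are just illustrative):
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: pvc-test
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "echo hello-ceph > /data/hello && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test-pvc
EOF
# The file should be readable from the Ceph-backed volume
kubectl exec pvc-test -- cat /data/hello
# Clean up
kubectl delete pod pvc-test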
Create Object Storage
# Create object store
cat <<EOF | kubectl apply -f -
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store
  namespace: rook-ceph
spec:
  metadataPool:
    failureDomain: host
    replicated:
      size: 3
  dataPool:
    failureDomain: host
    replicated:
      size: 3
  preservePoolsOnDelete: false
  gateway:
    port: 80
    instances: 2
    resources:
      limits:
        memory: "2Gi"
      requests:
        cpu: "1000m"
        memory: "1Gi"
EOF
# Create StorageClass for object storage
cat <<EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-object
provisioner: rook-ceph.ceph.rook.io/bucket
parameters:
  objectStoreName: my-store
  objectStoreNamespace: rook-ceph
reclaimPolicy: Delete
EOF
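With this StorageClass, applications request buckets through an ObjectBucketClaim instead of a PVC. A minimal sketch (the claim and bucket names are illustrative); Rook should answer the claim with a ConfigMap and Secret of the same name holding the endpoint and credentials:
# Request a bucket
cat <<EOF | kubectl apply -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ceph-bucket
spec:
  generateBucketName: ceph-bkt
  storageClassName: rook-ceph-object
EOF
# Check the claim and the generated access details
kubectl get objectbucketclaim ceph-bucket
kubectl get configmap ceph-bucket
kubectl get secret ceph-bucket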
Quick Examples
Let's explore Rook's amazing features!
Example 1: Deploy Application with Storage
# Deploy PostgreSQL with Ceph storage
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: rook-ceph-block
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        env:
        - name: POSTGRES_PASSWORD
          value: "secretpassword" # Use a Kubernetes Secret in production
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: postgres-storage
        persistentVolumeClaim:
          claimName: postgres-pvc
EOF
# Check deployment
kubectl get pods | grep postgres # Should be Running
kubectl exec -it deploy/postgres -- df -h # See Ceph mount
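A quick way to convince yourself the data really lives on Ceph: write a row, delete the pod, and check that the row survives the restart (names match the manifest above):
# Write a row
kubectl exec deploy/postgres -- psql -U postgres -c "CREATE TABLE demo (msg text); INSERT INTO demo VALUES ('stored on ceph');"
# Delete the pod; the Deployment recreates it against the same PVC
kubectl delete pod -l app=postgres
kubectl rollout status deploy/postgres
# The row should still be there
kubectl exec deploy/postgres -- psql -U postgres -c "SELECT * FROM demo;"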
Example 2: Access Ceph Dashboard
# Get dashboard password
kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode
# Port-forward dashboard
kubectl port-forward -n rook-ceph svc/rook-ceph-mgr-dashboard 8443:8443 &
# Access dashboard
echo "๐จ Dashboard: https://localhost:8443"
echo "Username: admin"
echo "Password: (from above command)"
Example 3: Create S3 User
# Create object store user
cat <<EOF | kubectl apply -f -
apiVersion: ceph.rook.io/v1
kind: CephObjectStoreUser
metadata:
  name: my-user
  namespace: rook-ceph
spec:
  store: my-store
  displayName: "S3 User"
  quotas:
    maxBuckets: 100
    maxSize: "10G"
EOF
# Get S3 credentials
kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user -o yaml
# Access S3 endpoint
kubectl -n rook-ceph get svc rook-ceph-rgw-my-store
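The secret stores the S3 keys under the AccessKey and SecretKey data fields. With a port-forward to the RGW service you can exercise the endpoint, for example with the AWS CLI if you have it installed (the bucket name is illustrative):
# Extract the credentials
export AWS_ACCESS_KEY_ID=$(kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user \
  -o jsonpath='{.data.AccessKey}' | base64 --decode)
export AWS_SECRET_ACCESS_KEY=$(kubectl -n rook-ceph get secret rook-ceph-object-user-my-store-my-user \
  -o jsonpath='{.data.SecretKey}' | base64 --decode)
# Reach the gateway locally
kubectl -n rook-ceph port-forward svc/rook-ceph-rgw-my-store 8080:80 &
# Create and list a test bucket
aws --endpoint-url http://localhost:8080 s3 mb s3://demo-bucket
aws --endpoint-url http://localhost:8080 s3 ls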
Fix Common Problems
Don't panic! Here are solutions!
Problem 1: Cluster Not Healthy
# Check cluster status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph health detail
# Check OSD status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd tree
# Restart unhealthy OSD pods (the operator recreates them; in a degraded cluster, restart one at a time)
kubectl -n rook-ceph delete pod -l app=rook-ceph-osd
# Check mon quorum
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph mon stat
Problem 2: PVC Not Binding
# Check CSI pods
kubectl -n rook-ceph get pods | grep csi
# Check provisioner logs
kubectl -n rook-ceph logs -l app=csi-rbdplugin-provisioner
# Verify storage pool
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd pool ls
# Check StorageClass
kubectl describe storageclass rook-ceph-block
Problem 3: Slow Performance
# Check cluster performance
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd perf
# Make sure the PG autoscaler is balancing placement groups
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- \
  ceph osd pool set replicapool pg_autoscale_mode on
# Check for network latency (look for OSD_SLOW_PING_TIME warnings)
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- \
  ceph health detail
Simple Commands Summary
Your Rook command toolkit!
Command | What It Does | When to Use |
---|---|---|
kubectl create -f cluster.yaml | Deploy Ceph cluster | Initial setup |
ceph status | Check cluster health | Monitor health |
ceph osd tree | Show OSD topology | Check storage |
ceph df | Show storage usage | Monitor capacity |
kubectl get cephcluster -n rook-ceph | Cluster status | Quick check |
ceph osd pool ls | List storage pools | View pools |
ceph health detail | Detailed health | Troubleshoot |
kubectl get pvc | List volume claims | Check volumes |
ceph mgr services | Show dashboard URL | Visual management |
ceph osd perf | Performance stats | Monitor speed |
Tips for Success
Master Rook with these pro tips!
Storage Planning
- Use dedicated disks for OSDs
- Plan for 3x replication overhead
- Monitor disk usage regularly
- Separate metadata and data pools
- Use SSDs for metadata pools (see the sketch after this list)
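One way to separate metadata from bulk data is a CephFilesystem whose metadata pool targets SSD-backed OSDs while the data pool targets HDDs. A minimal sketch, assuming your OSDs report the ssd and hdd device classes:
cat <<EOF | kubectl apply -f -
apiVersion: ceph.rook.io/v1
kind: CephFilesystem
metadata:
  name: myfs
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3
    deviceClass: ssd   # metadata on fast media
  dataPools:
  - name: data0
    replicated:
      size: 3
    deviceClass: hdd   # bulk data on spinning disks
  metadataServer:
    activeCount: 1
    activeStandby: true
EOF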
Performance Optimization
- Use host networking for speed
- Enable RBD caching
- Tune OSD memory limits (example below)
- Balance PGs across OSDs
- Use placement groups wisely
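For OSD memory, the knob behind the scenes is osd_memory_target, which you can set cluster-wide from the toolbox (4 GiB below is only an example; size it to your nodes and the OSD resource limits above):
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- \
  ceph config set osd osd_memory_target 4294967296
# Verify the setting
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- \
  ceph config get osd osd_memory_target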
Best Practices
- Always maintain quorum (odd number of mons)
- Document your storage layout
- Monitor cluster metrics
- Set up alerts for health issues
- Back up important data regularly
- Enable encryption at rest
- Use the dashboard for visibility
What You Learned
Outstanding work! You're now a Rook expert! You can:
- Deploy the Rook operator on AlmaLinux 9
- Create production Ceph clusters
- Configure block and object storage
- Deploy applications with persistent storage
- Access the Ceph dashboard
- Monitor cluster health
- Troubleshoot common issues
- Optimize storage performance
Why This Matters
You've built enterprise-grade distributed storage! With Rook:
- Self-Managing - No more manual storage administration
- Highly Available - Survives node and disk failures
- Infinitely Scalable - Add nodes to grow storage
- Multi-Protocol - Block, object, and file from one system
- Cloud Native - Perfect for Kubernetes workloads
- Production Ready - Used by enterprises worldwide
- Cost Effective - Use commodity hardware
Your storage is now as resilient and scalable as a cloud provider's! No more storage bottlenecks, no more data-loss fears. Everything is automated and self-healing.
Keep exploring features like CephFS for shared filesystems, RGW for S3 compatibility, and multi-site replication. You're running the same storage tech as the major clouds!
Remember: great apps need great storage, and Rook delivers excellence! Happy storing!
P.S. - Join the Rook community, contribute to the project, and share your storage journey. Together we're revolutionizing Kubernetes storage!