🆘 AlmaLinux Disaster Recovery & Backup: Complete Business Continuity Guide
Welcome to the ultimate guide for protecting your AlmaLinux systems from disasters! 🛡️ Whether you’re safeguarding against hardware failures, natural disasters, or cyber attacks, this comprehensive guide will transform you into a disaster recovery expert who can ensure your business never misses a beat! 🎯
Disasters don’t schedule appointments, but with proper backup and recovery strategies, you’ll sleep soundly knowing your systems can bounce back from anything. Let’s build an unbreakable safety net for your data and services! 💪
🤔 Why is Disaster Recovery & Backup Critical?
Imagine losing years of work in seconds – that’s the nightmare disaster recovery prevents! 💥 Here’s why mastering backup and DR on AlmaLinux is absolutely essential:
- 🔥 Data Protection - Safeguard against ransomware and corruption
- ⚡ Business Continuity - Keep operations running during disasters
- 💰 Cost Prevention - Avoid millions in downtime losses
- 🌪️ Natural Disaster Recovery - Survive floods, fires, and earthquakes
- 🚀 Rapid Recovery - Get back online in minutes, not days
- 📊 Compliance Requirements - Meet industry data retention standards
- 🔄 Version Control - Restore to any point in time
- 🛡️ Multi-Layer Protection - Defense against every type of failure
🎯 What You Need
Let’s prepare your comprehensive disaster recovery arsenal! ✅
System Requirements:
- ✅ AlmaLinux 8.x or 9.x production systems
- ✅ Separate backup storage (local and offsite)
- ✅ Network connectivity for remote backups
- ✅ Root access for system-level backups
- ✅ Sufficient storage for multiple restore points
Backup Infrastructure:
- ✅ Local backup storage (NAS/SAN)
- ✅ Cloud storage accounts (AWS S3, Azure, Google Cloud)
- ✅ Network bandwidth for large transfers
- ✅ Backup scheduling and automation tools
- ✅ Monitoring and alerting systems
📝 Comprehensive Backup Strategy
Let’s build a multi-layered backup system that protects everything! 🔧
File-Level Backup with Rsync
# Install backup tools
sudo dnf install -y rsync rclone duplicity borgbackup
# Create backup directories
sudo mkdir -p /backup/{daily,weekly,monthly,system}
sudo mkdir -p /backup/logs
# Basic rsync backup script
cat > /usr/local/bin/backup-files.sh << 'EOF'
#!/bin/bash
# Comprehensive file backup script
# Configuration
BACKUP_ROOT="/backup"
SOURCE_DIRS=("/etc" "/home" "/var/www" "/opt" "/usr/local")
EXCLUDE_FILE="/etc/backup-exclude.txt"
LOG_FILE="/backup/logs/backup-$(date +%Y%m%d_%H%M%S).log"
RETENTION_DAYS=30
# Logging function
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}
# Create exclude file
cat > "$EXCLUDE_FILE" << 'EXCLUDE'
/tmp/*
/var/tmp/*
/var/cache/*
/var/log/*
/proc/*
/sys/*
/dev/*
/run/*
/mnt/*
/media/*
*.tmp
*.log
*.cache
EXCLUDE
# Main backup function
backup_files() {
local backup_date=$(date +%Y%m%d_%H%M%S)
local backup_dir="$BACKUP_ROOT/daily/$backup_date"
log "Starting file backup to $backup_dir"
mkdir -p "$backup_dir"
for dir in "${SOURCE_DIRS[@]}"; do
if [ -d "$dir" ]; then
log "Backing up $dir"
rsync -avz --delete \
--exclude-from="$EXCLUDE_FILE" \
"$dir/" "$backup_dir/$(basename $dir)/" \
2>&1 | tee -a "$LOG_FILE"
fi
done
# Create backup manifest
find "$backup_dir" -type f | wc -l > "$backup_dir/file_count.txt"
du -sh "$backup_dir" > "$backup_dir/size.txt"
log "Backup completed successfully"
}
# Cleanup old backups
cleanup_old_backups() {
log "Cleaning up backups older than $RETENTION_DAYS days"
find "$BACKUP_ROOT/daily" -type d -mtime +$RETENTION_DAYS -exec rm -rf {} \; 2>/dev/null
}
# Main execution
log "=== Starting backup process ==="
backup_files
cleanup_old_backups
log "=== Backup process completed ==="
# Calculate backup statistics
total_backups=$(ls -1 "$BACKUP_ROOT/daily" | wc -l)
total_size=$(du -sh "$BACKUP_ROOT/daily" | cut -f1)
log "Statistics: $total_backups backups, Total size: $total_size"
EOF
chmod +x /usr/local/bin/backup-files.sh
# Test the backup
sudo /usr/local/bin/backup-files.sh
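The rsync copies above are easy to browse, but every run stores the data in full. Since BorgBackup was installed alongside rsync, here is a minimal deduplicated-and-encrypted alternative as a sketch; the repository path, passphrase, and retention values are assumptions you should adapt to your own policy.
# One-time: create an encrypted, deduplicating Borg repository
export BORG_PASSPHRASE='ChangeThisPassphrase'   # placeholder; load from a root-only file in production
borg init --encryption=repokey /backup/borg-repo
# Create a dated archive of the same source directories
borg create --stats --compression lz4 \
/backup/borg-repo::'{hostname}-{now:%Y%m%d_%H%M%S}' \
/etc /home /var/www /opt /usr/local
# Apply a retention policy and list the remaining archives
borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 /backup/borg-repo
borg list /backup/borg-repo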
System Image Backup with dd and CloneZilla
# Create system image backup script
cat > /usr/local/bin/backup-system-image.sh << 'EOF'
#!/bin/bash
# System image backup using dd
BACKUP_DIR="/backup/system"
IMAGE_NAME="almalinux-system-$(date +%Y%m%d_%H%M%S).img"
COMPRESSION="gzip"
LOG_FILE="/backup/logs/system-backup-$(date +%Y%m%d_%H%M%S).log"
# Logging function
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}
# Create system image
create_system_image() {
log "Starting system image backup"
mkdir -p "$BACKUP_DIR"
# NOTE: imaging a mounted, in-use filesystem with dd can yield an inconsistent
# image; run this during a quiet window or from rescue media when possible
log "Root device: $ROOT_DEVICE"
# Create compressed image
log "Creating compressed system image: $IMAGE_NAME"
if command -v pv >/dev/null 2>&1; then
# With progress bar
dd if="$ROOT_DEVICE" bs=4M | pv | gzip > "$BACKUP_DIR/$IMAGE_NAME.gz"
else
# Without progress bar
dd if="$ROOT_DEVICE" bs=4M | gzip > "$BACKUP_DIR/$IMAGE_NAME.gz"
fi
# Verify image
if [ -f "$BACKUP_DIR/$IMAGE_NAME.gz" ]; then
image_size=$(du -h "$BACKUP_DIR/$IMAGE_NAME.gz" | cut -f1)
log "System image created successfully: $image_size"
# Create checksum
md5sum "$BACKUP_DIR/$IMAGE_NAME.gz" > "$BACKUP_DIR/$IMAGE_NAME.gz.md5"
log "Checksum created for verification"
else
log "ERROR: System image creation failed"
exit 1
fi
}
# Partition table backup
backup_partition_table() {
log "Backing up partition table"
sfdisk -d "$ROOT_DEVICE" > "$BACKUP_DIR/partition-table-$(date +%Y%m%d).txt"
blkid > "$BACKUP_DIR/blkid-$(date +%Y%m%d).txt"
}
# Main execution
log "=== Starting system image backup ==="
backup_partition_table
create_system_image
log "=== System image backup completed ==="
EOF
chmod +x /usr/local/bin/backup-system-image.sh
# Install additional tools for system imaging
sudo dnf install -y pv gzip pigz
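With pigz installed, you can optionally swap it into the dd pipeline so compression runs on every CPU core. This is a sketch of the same pipeline the script uses, not a change to the script itself; the device detection and output naming follow the script's conventions.
# Parallel-compressed system image using pigz instead of gzip
ROOT_DEVICE=$(df / | tail -1 | awk '{print $1}' | sed 's/[0-9]*$//')
sudo dd if="$ROOT_DEVICE" bs=4M status=progress | pigz > /backup/system/almalinux-system-$(date +%Y%m%d_%H%M%S).img.gz
# Restore side: decompress with pigz as well
# pigz -dc /backup/system/almalinux-system-YYYYMMDD_HHMMSS.img.gz | sudo dd of=/dev/sda bs=4M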
🔧 Database Backup Automation
Let’s create automated database backup strategies! 🗄️
MySQL/MariaDB Backup
# Create database backup script
cat > /usr/local/bin/backup-mysql.sh << 'EOF'
#!/bin/bash
# MySQL/MariaDB backup script
# Configuration
DB_USER="backup_user"
DB_PASS="BackupPassword123"
BACKUP_DIR="/backup/mysql"
RETENTION_DAYS=7
LOG_FILE="/backup/logs/mysql-backup-$(date +%Y%m%d_%H%M%S).log"
# Logging function
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}
# Create backup user (run once)
create_backup_user() {
mysql -u root -p << 'SQL'
CREATE USER 'backup_user'@'localhost' IDENTIFIED BY 'BackupPassword123';
GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER, RELOAD ON *.* TO 'backup_user'@'localhost';
FLUSH PRIVILEGES;
SQL
}
# Backup all databases
backup_all_databases() {
local backup_date=$(date +%Y%m%d_%H%M%S)
local backup_file="$BACKUP_DIR/all-databases-$backup_date.sql.gz"
log "Starting MySQL backup"
mkdir -p "$BACKUP_DIR"
# Create compressed backup (--master-data requires binary logging on the server;
# drop it and --flush-logs if binlogs are not enabled)
mysqldump -u "$DB_USER" -p"$DB_PASS" \
--all-databases \
--single-transaction \
--routines \
--triggers \
--events \
--master-data=2 \
--flush-logs \
--hex-blob \
--default-character-set=utf8mb4 | gzip > "$backup_file"
if [ ${PIPESTATUS[0]} -eq 0 ]; then
log "MySQL backup completed: $(basename $backup_file)"
# Create verification info
backup_size=$(du -h "$backup_file" | cut -f1)
db_count=$(gunzip -c "$backup_file" | grep "CREATE DATABASE" | wc -l)
log "Backup size: $backup_size, Databases: $db_count"
# Verify the compressed dump is readable end to end
gunzip -t "$backup_file" && log "Backup archive integrity: OK"
else
log "ERROR: MySQL backup failed"
return 1
fi
}
# Backup individual databases
backup_individual_databases() {
local backup_date=$(date +%Y%m%d_%H%M%S)
local individual_dir="$BACKUP_DIR/individual/$backup_date"
mkdir -p "$individual_dir"
# Get list of databases
databases=$(mysql -u "$DB_USER" -p"$DB_PASS" -e "SHOW DATABASES;" | grep -v "Database\|information_schema\|performance_schema\|mysql\|sys")
for db in $databases; do
log "Backing up database: $db"
mysqldump -u "$DB_USER" -p"$DB_PASS" \
--single-transaction \
--routines \
--triggers \
--events \
"$db" | gzip > "$individual_dir/$db.sql.gz"
done
}
# Cleanup old backups
cleanup_old_backups() {
log "Cleaning up MySQL backups older than $RETENTION_DAYS days"
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +$RETENTION_DAYS -delete
find "$BACKUP_DIR/individual" -type d -mtime +$RETENTION_DAYS -exec rm -rf {} \; 2>/dev/null
}
# Main execution
log "=== Starting MySQL backup process ==="
backup_all_databases
backup_individual_databases
cleanup_old_backups
log "=== MySQL backup process completed ==="
EOF
chmod +x /usr/local/bin/backup-mysql.sh
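Hardcoding the password in the script and passing it with -p exposes it in the process list and triggers a mysqldump warning. A safer variant, sketched below under the assumption of a root-only option file at /root/.backup-my.cnf, lets you drop the -u/-p arguments entirely; note that --defaults-extra-file must be the first option on the command line.
# Store the backup user's credentials in a root-only option file
sudo tee /root/.backup-my.cnf > /dev/null << 'EOF'
[client]
user=backup_user
password=BackupPassword123
EOF
sudo chmod 600 /root/.backup-my.cnf
# Example invocation without a password on the command line
mysqldump --defaults-extra-file=/root/.backup-my.cnf --all-databases --single-transaction | gzip > /backup/mysql/all-databases-manual.sql.gz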
PostgreSQL Backup
# Create PostgreSQL backup script
cat > /usr/local/bin/backup-postgresql.sh << 'EOF'
#!/bin/bash
# PostgreSQL backup script
# Configuration
PG_USER="postgres"
BACKUP_DIR="/backup/postgresql"
RETENTION_DAYS=7
LOG_FILE="/backup/logs/postgresql-backup-$(date +%Y%m%d_%H%M%S).log"
# Logging function
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}
# Backup all databases
backup_all_databases() {
local backup_date=$(date +%Y%m%d_%H%M%S)
local backup_file="$BACKUP_DIR/all-databases-$backup_date.sql.gz"
log "Starting PostgreSQL backup"
mkdir -p "$BACKUP_DIR"
# Create compressed backup of all databases
sudo -u "$PG_USER" pg_dumpall | gzip > "$backup_file"
if [ ${PIPESTATUS[0]} -eq 0 ]; then
log "PostgreSQL backup completed: $(basename $backup_file)"
# Create backup info
backup_size=$(du -h "$backup_file" | cut -f1)
log "Backup size: $backup_size"
else
log "ERROR: PostgreSQL backup failed"
return 1
fi
}
# Backup individual databases
backup_individual_databases() {
local backup_date=$(date +%Y%m%d_%H%M%S)
local individual_dir="$BACKUP_DIR/individual/$backup_date"
mkdir -p "$individual_dir"
# Get list of databases
databases=$(sudo -u "$PG_USER" psql -t -c "SELECT datname FROM pg_database WHERE datistemplate = false;" | grep -v "^$")
for db in $databases; do
db=$(echo $db | xargs) # Trim whitespace
log "Backing up database: $db"
sudo -u "$PG_USER" pg_dump "$db" | gzip > "$individual_dir/$db.sql.gz"
done
}
# Main execution
log "=== Starting PostgreSQL backup process ==="
backup_all_databases
backup_individual_databases
log "=== PostgreSQL backup process completed ==="
EOF
chmod +x /usr/local/bin/backup-postgresql.sh
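pg_dumpall only captures the data as often as it runs. If you also need point-in-time recovery between dumps, PostgreSQL's continuous WAL archiving can fill the gap. This is a sketch assuming the default data directory and service name used by the AlmaLinux postgresql packages; verify both on your install before applying it.
# Create a WAL archive directory owned by postgres
sudo mkdir -p /backup/postgresql/wal-archive
sudo chown postgres:postgres /backup/postgresql/wal-archive
# Enable archiving (the data directory may be versioned, e.g. /var/lib/pgsql/15/data)
sudo tee -a /var/lib/pgsql/data/postgresql.conf > /dev/null << 'EOF'
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /backup/postgresql/wal-archive/%f && cp %p /backup/postgresql/wal-archive/%f'
EOF
# Restart to apply (the service name may also be versioned, e.g. postgresql-15)
sudo systemctl restart postgresql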
🌟 Cloud Backup Integration
Let’s integrate with cloud storage for offsite protection! ☁️
AWS S3 Backup with Rclone
# Install and configure rclone
curl https://rclone.org/install.sh | sudo bash
# Configure AWS S3 (interactive)
rclone config
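The interactive menu works fine, but if you are scripting the setup, the remote can also be created non-interactively. The remote name below matches the aws-s3 remote used by the scripts that follow, while the keys, region, and bucket are placeholders.
# Non-interactive creation of the S3 remote (credentials are placeholders)
rclone config create aws-s3 s3 \
provider=AWS \
access_key_id=YOUR_ACCESS_KEY \
secret_access_key=YOUR_SECRET_KEY \
region=us-east-1
# Verify access and create the target bucket if needed
rclone lsd aws-s3:
rclone mkdir aws-s3:my-backup-bucket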
# Create cloud backup script
cat > /usr/local/bin/backup-to-cloud.sh << 'EOF'
#!/bin/bash
# Cloud backup using rclone
# Configuration
LOCAL_BACKUP="/backup"
CLOUD_REMOTE="aws-s3:my-backup-bucket"
LOG_FILE="/backup/logs/cloud-backup-$(date +%Y%m%d_%H%M%S).log"
ENCRYPTION_PASSWORD="MySecureBackupPassword123"
# Logging function
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}
# Encrypt and upload to cloud
backup_to_cloud() {
log "Starting cloud backup"
# Create encrypted archive of recent backups
local archive_name="almalinux-backup-$(date +%Y%m%d_%H%M%S).tar.gz.enc"
local temp_archive="/tmp/backup-temp.tar.gz"
# Create compressed archive
log "Creating backup archive"
tar -czf "$temp_archive" -C "$LOCAL_BACKUP" daily mysql postgresql
# Encrypt archive (-pbkdf2 gives a proper key derivation; use the same flags when decrypting)
log "Encrypting backup archive"
openssl enc -aes-256-cbc -salt -pbkdf2 -in "$temp_archive" -out "/tmp/$archive_name" -pass pass:"$ENCRYPTION_PASSWORD"
# Upload to cloud
log "Uploading to cloud storage"
rclone copy "/tmp/$archive_name" "$CLOUD_REMOTE/$(date +%Y%m)/"
if [ $? -eq 0 ]; then
log "Cloud backup completed successfully"
# Verify upload
cloud_size=$(rclone size "$CLOUD_REMOTE/$(date +%Y%m)/$archive_name" | grep "Total size" | awk '{print $3,$4}')
log "Cloud backup size: $cloud_size"
else
log "ERROR: Cloud backup failed"
return 1
fi
# Cleanup temp files
rm -f "$temp_archive" "/tmp/$archive_name"
}
# Sync latest backups
sync_to_cloud() {
log "Syncing latest backups to cloud"
# Sync recent daily backups
rclone sync "$LOCAL_BACKUP/daily" "$CLOUD_REMOTE/daily" \
--filter "- *.tmp" \
--filter "- *.log" \
--max-age 7d
# Sync database backups
rclone sync "$LOCAL_BACKUP/mysql" "$CLOUD_REMOTE/mysql" --max-age 3d
rclone sync "$LOCAL_BACKUP/postgresql" "$CLOUD_REMOTE/postgresql" --max-age 3d
}
# Cleanup old cloud backups
cleanup_cloud_backups() {
log "Cleaning up old cloud backups"
# Keep only last 3 months of monthly archives
rclone delete "$CLOUD_REMOTE" --filter "+ *.tar.gz.enc" --filter "- *" --min-age 90d
}
# Main execution
log "=== Starting cloud backup process ==="
backup_to_cloud
sync_to_cloud
cleanup_cloud_backups
log "=== Cloud backup process completed ==="
EOF
chmod +x /usr/local/bin/backup-to-cloud.sh
✅ Disaster Recovery Planning
Let’s create comprehensive recovery procedures! 🚨
Recovery Documentation
# Create disaster recovery playbook
cat > /opt/disaster-recovery-playbook.md << 'EOF'
# AlmaLinux Disaster Recovery Playbook
## Emergency Contacts
- System Administrator: [Your Phone]
- Network Team: [Phone]
- Management: [Phone]
- Vendor Support: [Phone]
## Recovery Scenarios
### Scenario 1: Single Disk Failure
1. Boot from rescue media
2. Replace failed disk
3. Restore from latest system image
4. Verify system functionality
### Scenario 2: Complete System Loss
1. Provision new hardware
2. Install base AlmaLinux
3. Restore system image backup
4. Restore database backups
5. Verify application functionality
### Scenario 3: Data Corruption
1. Stop affected services
2. Identify corruption scope
3. Restore from last known good backup
4. Verify data integrity
5. Restart services
## Recovery Steps Detail
### System Image Restore
```bash
# Boot from rescue media
# Partition new disk to match original
sfdisk /dev/sda < /backup/system/partition-table-YYYYMMDD.txt
# Restore system image
gunzip -c /backup/system/almalinux-system-YYYYMMDD_HHMMSS.img.gz | dd of=/dev/sda bs=4M
# Verify the backup image before restoring
md5sum -c /backup/system/almalinux-system-YYYYMMDD_HHMMSS.img.gz.md5
```

### Database Restore
```bash
# MySQL restore
gunzip -c /backup/mysql/all-databases-YYYYMMDD_HHMMSS.sql.gz | mysql -u root -p

# PostgreSQL restore
gunzip -c /backup/postgresql/all-databases-YYYYMMDD_HHMMSS.sql.gz | sudo -u postgres psql
```

## Testing Schedule
- Monthly: Backup verification
- Quarterly: Partial recovery test
- Annually: Full disaster recovery test
EOF
# Create recovery testing script
cat > /usr/local/bin/test-recovery.sh << 'EOF'
#!/bin/bash
# Disaster recovery testing script
LOG_FILE="/backup/logs/recovery-test-$(date +%Y%m%d_%H%M%S).log"
# Logging function
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}
# Test backup integrity
test_backup_integrity() {
log "Testing backup integrity"
# Test latest file backup
latest_backup=$(ls -t /backup/daily | head -1)
if [ -d "/backup/daily/$latest_backup" ]; then
file_count=$(find "/backup/daily/$latest_backup" -type f | wc -l)
log "Latest backup contains $file_count files"
fi
# Test database backup (pick today's newest dump, if any)
latest_mysql=$(ls -t /backup/mysql/all-databases-$(date +%Y%m%d)*.sql.gz 2>/dev/null | head -1)
if [ -n "$latest_mysql" ]; then
log "MySQL backup found for today"
# Test if backup can be read
if gunzip -t "$latest_mysql"; then
log "MySQL backup integrity: PASS"
else
log "MySQL backup integrity: FAIL"
fi
fi
}
# Test cloud connectivity
test_cloud_connectivity() {
log "Testing cloud connectivity"
if rclone lsd aws-s3:my-backup-bucket >/dev/null 2>&1; then
log "Cloud connectivity: PASS"
else
log "Cloud connectivity: FAIL"
fi
}
# Main execution
log "=== Starting recovery testing ==="
test_backup_integrity
test_cloud_connectivity
log "=== Recovery testing completed ==="
EOF
chmod +x /usr/local/bin/test-recovery.sh
🎮 Quick Examples
Example 1: Automated Backup Orchestration
# Create master backup orchestration script
cat > /usr/local/bin/master-backup.sh << 'EOF'
#!/bin/bash
# Master backup orchestration
LOG_FILE="/backup/logs/master-backup-$(date +%Y%m%d_%H%M%S).log"
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}
# Pre-backup checks
pre_backup_checks() {
log "Performing pre-backup checks"
# Check disk space
backup_space=$(df /backup | tail -1 | awk '{print $4}')
if [ $backup_space -lt 10485760 ]; then # 10GB in KB
log "WARNING: Low disk space on backup volume"
fi
# Check system load
load_avg=$(uptime | awk -F'load average:' '{print $2}' | awk -F',' '{print $1}' | tr -d ' ')
log "Current system load: $load_avg"
}
# Execute backup sequence
run_backup_sequence() {
log "Starting backup sequence"
# Stop non-critical services to ensure consistency
log "Stopping non-critical services"
systemctl stop httpd nginx || true
# File system backup
log "Running file system backup"
/usr/local/bin/backup-files.sh
# Database backups
log "Running database backups"
/usr/local/bin/backup-mysql.sh
/usr/local/bin/backup-postgresql.sh
# Cloud backup (every 6 hours)
hour=$(date +%-H)   # no leading zero, so the arithmetic below is not parsed as octal
if [ $((hour % 6)) -eq 0 ]; then
log "Running cloud backup"
/usr/local/bin/backup-to-cloud.sh
fi
# Restart services
log "Restarting services"
systemctl start httpd nginx || true
}
# Post-backup verification
post_backup_verification() {
log "Performing post-backup verification"
# Verify backup sizes
daily_size=$(du -sh /backup/daily | cut -f1)
mysql_size=$(du -sh /backup/mysql | cut -f1)
log "Backup sizes - Daily: $daily_size, MySQL: $mysql_size"
# Send notification
echo "Backup completed on $(hostname) at $(date). Daily: $daily_size, MySQL: $mysql_size" | \
mail -s "Backup Report - $(hostname)" [email protected]
}
# Main execution
log "=== Starting master backup process ==="
pre_backup_checks
run_backup_sequence
post_backup_verification
log "=== Master backup process completed ==="
EOF
chmod +x /usr/local/bin/master-backup.sh
# Schedule with cron
cat > /tmp/backup-cron << 'EOF'
# Master backup every 6 hours
0 */6 * * * /usr/local/bin/master-backup.sh
# System image backup weekly (Sundays at 2 AM)
0 2 * * 0 /usr/local/bin/backup-system-image.sh
# Recovery testing monthly
0 3 1 * * /usr/local/bin/test-recovery.sh
EOF
sudo crontab /tmp/backup-cron
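If you prefer systemd timers over cron (runs show up in the journal, and Persistent=true catches up after downtime), a rough equivalent for the 6-hourly master job might look like this; the unit names are illustrative.
# Service and timer units for the master backup (illustrative names)
sudo tee /etc/systemd/system/master-backup.service > /dev/null << 'EOF'
[Unit]
Description=Master backup orchestration

[Service]
Type=oneshot
ExecStart=/usr/local/bin/master-backup.sh
EOF
sudo tee /etc/systemd/system/master-backup.timer > /dev/null << 'EOF'
[Unit]
Description=Run master backup every 6 hours

[Timer]
OnCalendar=*-*-* 00/6:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now master-backup.timer
systemctl list-timers master-backup.timer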
Example 2: Rapid Recovery Script
# Create rapid recovery script for emergencies
cat > /usr/local/bin/rapid-recovery.sh << 'EOF'
#!/bin/bash
# Rapid recovery script for emergency situations
RECOVERY_TYPE="$1"
BACKUP_DATE="$2"
usage() {
echo "Usage: $0 [files|mysql|postgresql|system] [YYYYMMDD_HHMMSS]"
echo "Examples:"
echo " $0 files 20231218_143000"
echo " $0 mysql 20231218_143000"
echo " $0 system 20231218_143000"
exit 1
}
log() {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1"
}
# Recover files
recover_files() {
local backup_dir="/backup/daily/$BACKUP_DATE"
if [ ! -d "$backup_dir" ]; then
log "ERROR: Backup directory not found: $backup_dir"
exit 1
fi
log "Starting file recovery from $backup_dir"
# Create a restore-point directory (copy anything you are about to overwrite into it first)
mkdir -p /recovery/pre-restore-$(date +%Y%m%d_%H%M%S)
# Restore critical directories (the backup stores each source by its basename, e.g. /var/www -> www)
declare -A restore_map=( [etc]="/etc" [home]="/home" [www]="/var/www" [opt]="/opt" [local]="/usr/local" )
for src in "${!restore_map[@]}"; do
if [ -d "$backup_dir/$src" ]; then
log "Restoring ${restore_map[$src]}"
rsync -av "$backup_dir/$src/" "${restore_map[$src]}/"
fi
done
log "File recovery completed"
}
# Recover MySQL
recover_mysql() {
local backup_file="/backup/mysql/all-databases-${BACKUP_DATE}.sql.gz"
if [ ! -f "$backup_file" ]; then
# Try to find closest backup
backup_file=$(ls -t /backup/mysql/all-databases-*.sql.gz | head -1)
log "Using latest available backup: $(basename $backup_file)"
fi
log "Starting MySQL recovery"
# Stop MySQL service
systemctl stop mariadb mysql
# Backup current data
mv /var/lib/mysql /var/lib/mysql.backup.$(date +%Y%m%d_%H%M%S)
# Initialize a fresh data directory (MariaDB; on MySQL 8 use 'mysqld --initialize' instead)
mysql_install_db --user=mysql
# Start MySQL
systemctl start mariadb mysql
# Restore data
gunzip -c "$backup_file" | mysql -u root -p
log "MySQL recovery completed"
}
# Main execution
if [ $# -lt 1 ]; then
usage
fi
case "$RECOVERY_TYPE" in
files)
recover_files
;;
mysql)
recover_mysql
;;
postgresql)
log "PostgreSQL recovery not implemented yet"
;;
system)
log "System recovery requires manual intervention"
log "Please refer to disaster recovery playbook"
;;
*)
usage
;;
esac
EOF
chmod +x /usr/local/bin/rapid-recovery.sh
Example 3: Backup Monitoring Dashboard
# Create backup monitoring script
cat > /usr/local/bin/backup-monitor.sh << 'EOF'
#!/bin/bash
# Backup monitoring dashboard
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
print_status() {
local status="$1"
local message="$2"
case "$status" in
"OK")
echo -e "${GREEN}✓${NC} $message"
;;
"WARNING")
echo -e "${YELLOW}⚠${NC} $message"
;;
"ERROR")
echo -e "${RED}✗${NC} $message"
;;
esac
}
# Check backup freshness
check_backup_freshness() {
echo "=== Backup Freshness Status ==="
# Check daily backups
latest_daily=$(ls -t /backup/daily 2>/dev/null | head -1)
if [ -n "$latest_daily" ]; then
daily_age=$((($(date +%s) - $(stat -c %Y "/backup/daily/$latest_daily")) / 3600))
if [ $daily_age -lt 24 ]; then
print_status "OK" "Daily backup: $daily_age hours old"
else
print_status "WARNING" "Daily backup: $daily_age hours old (stale)"
fi
else
print_status "ERROR" "No daily backups found"
fi
# Check MySQL backups
latest_mysql=$(ls -t /backup/mysql/*.sql.gz 2>/dev/null | head -1)
if [ -n "$latest_mysql" ]; then
mysql_age=$((($(date +%s) - $(stat -c %Y "$latest_mysql")) / 3600))
if [ $mysql_age -lt 24 ]; then
print_status "OK" "MySQL backup: $mysql_age hours old"
else
print_status "WARNING" "MySQL backup: $mysql_age hours old (stale)"
fi
else
print_status "ERROR" "No MySQL backups found"
fi
}
# Check backup sizes
check_backup_sizes() {
echo -e "\n=== Backup Size Information ==="
if [ -d "/backup/daily" ]; then
daily_size=$(du -sh /backup/daily 2>/dev/null | cut -f1)
daily_count=$(ls -1 /backup/daily | wc -l)
echo "Daily backups: $daily_size ($daily_count sets)"
fi
if [ -d "/backup/mysql" ]; then
mysql_size=$(du -sh /backup/mysql 2>/dev/null | cut -f1)
mysql_count=$(ls -1 /backup/mysql/*.sql.gz 2>/dev/null | wc -l)
echo "MySQL backups: $mysql_size ($mysql_count files)"
fi
if [ -d "/backup/system" ]; then
system_size=$(du -sh /backup/system 2>/dev/null | cut -f1)
system_count=$(ls -1 /backup/system/*.img.gz 2>/dev/null | wc -l)
echo "System images: $system_size ($system_count images)"
fi
}
# Check disk space
check_disk_space() {
echo -e "\n=== Backup Storage Status ==="
backup_usage=$(df /backup | tail -1 | awk '{print $5}' | sed 's/%//')
backup_avail=$(df -h /backup | tail -1 | awk '{print $4}')
if [ $backup_usage -lt 80 ]; then
print_status "OK" "Backup storage: ${backup_usage}% used, ${backup_avail} available"
elif [ $backup_usage -lt 90 ]; then
print_status "WARNING" "Backup storage: ${backup_usage}% used, ${backup_avail} available"
else
print_status "ERROR" "Backup storage: ${backup_usage}% used, ${backup_avail} available (critical)"
fi
}
# Check cloud connectivity
check_cloud_status() {
echo -e "\n=== Cloud Backup Status ==="
if command -v rclone >/dev/null 2>&1; then
if rclone lsd aws-s3:my-backup-bucket >/dev/null 2>&1; then
cloud_size=$(rclone size aws-s3:my-backup-bucket 2>/dev/null | grep "Total size" | awk '{print $3,$4}')
print_status "OK" "Cloud storage accessible, size: $cloud_size"
else
print_status "ERROR" "Cloud storage not accessible"
fi
else
print_status "WARNING" "rclone not installed - no cloud backup"
fi
}
# Main dashboard
echo "🛡️ AlmaLinux Backup Status Dashboard"
echo "Generated: $(date)"
echo "Host: $(hostname)"
echo ""
check_backup_freshness
check_backup_sizes
check_disk_space
check_cloud_status
echo -e "\n=== Recent Backup Logs ==="
if [ -d "/backup/logs" ]; then
latest_log=$(ls -t /backup/logs/*.log 2>/dev/null | head -1)
if [ -n "$latest_log" ]; then
echo "Latest log: $(basename $latest_log)"
tail -5 "$latest_log"
fi
fi
EOF
chmod +x /usr/local/bin/backup-monitor.sh
# Add a daily monitoring report to root's crontab (preserving the existing backup entries)
( sudo crontab -l 2>/dev/null; echo "0 8 * * * /usr/local/bin/backup-monitor.sh | mail -s 'Daily Backup Report - $(hostname)' [email protected]" ) | sudo crontab -
🚨 Fix Common Backup Problems
Let’s solve frequent backup and recovery issues! 🛠️
Problem 1: Backup Jobs Failing
Symptoms: Backups not completing, error logs showing failures Solution:
# Check disk space
df -h /backup
# Check permissions
ls -la /backup
sudo chown -R root:root /backup
# Backups include copies of /etc and database dumps - keep them root-only
sudo chmod -R 700 /backup
# Check services
systemctl status crond
sudo systemctl restart crond
# Test backup scripts manually
sudo /usr/local/bin/backup-files.sh
Problem 2: Database Backup Corruption
Symptoms: Database backups cannot be restored Solution:
# Test backup integrity
gunzip -t /backup/mysql/backup.sql.gz
# Use single-transaction for consistency
mysqldump --single-transaction --routines --triggers --all-databases > backup.sql
# Verify restoration in test environment
mysql -u root -p test_db < backup.sql
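A low-risk way to run that verification is against a throwaway database instance instead of the production server. Here is a sketch using podman (available from the AlmaLinux repositories) and a scratch MariaDB container; the image tag and password are placeholders.
# Start a disposable MariaDB instance for restore testing
podman run -d --name restore-test -e MARIADB_ROOT_PASSWORD=ScratchPassword123 docker.io/library/mariadb:10.11
sleep 30   # give the server time to initialize
# Load the dump into the scratch instance and inspect it
gunzip -c /backup/mysql/all-databases-YYYYMMDD_HHMMSS.sql.gz | podman exec -i restore-test mysql -uroot -pScratchPassword123
podman exec restore-test mysql -uroot -pScratchPassword123 -e "SHOW DATABASES;"
# Tear it down when finished
podman rm -f restore-test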
Problem 3: Cloud Sync Failures
Symptoms: Files not uploading to cloud storage Solution:
# Test rclone connectivity
rclone lsd aws-s3:bucket-name
# Check credentials
rclone config show
# Test with verbose output
rclone copy /local/file aws-s3:bucket-name -v
# Limit bandwidth if transfers saturate the link
rclone copy /local/file aws-s3:bucket-name --bwlimit 10M
Problem 4: Recovery Taking Too Long
Symptoms: Restore operations are extremely slow Solution:
# Use pigz instead of gzip for parallel (multi-core) compression
tar -cf - /data | pigz > backup.tar.gz
# Optimize rsync
rsync -avz --compress-level=1 source/ destination/
# Use dd with larger block size
dd if=backup.img of=/dev/sda bs=16M
# Parallel restoration (GNU parallel, from the 'parallel' package)
parallel --jobs 4 'tar -xzf {} -C /restore' ::: part1.tar.gz part2.tar.gz part3.tar.gz part4.tar.gz
📋 Backup Commands Quick Reference
Essential backup and recovery commands! ⚡
Command | Purpose |
---|---|
rsync -avz --delete src/ dst/ | Incremental file backup |
mysqldump --all-databases | Complete MySQL backup |
pg_dumpall | Complete PostgreSQL backup |
dd if=/dev/sda bs=4M | System image backup |
tar -czf backup.tar.gz /data | Archive creation |
rclone sync local remote | Cloud synchronization |
gunzip -c backup.gz | mysql | Database restoration |
sfdisk -d /dev/sda > table.txt | Partition table backup |
💡 Disaster Recovery Best Practices
Master these DR best practices! 🎯
- 🔄 3-2-1 Rule - 3 copies, 2 different media, 1 offsite
- ⏰ Regular Testing - Test recovery procedures monthly
- 📝 Document Everything - Maintain updated recovery procedures
- 🚀 Automate Backups - Reduce human error with automation
- 🔐 Encrypt Sensitive Data - Protect backups with encryption
- 📊 Monitor Continuously - Alert on backup failures immediately
- 🎯 Define RPO/RTO - Set recovery objectives and measure them (a quick measurement sketch follows this list)
- 🌍 Geographic Distribution - Protect against regional disasters
- 🔍 Verify Integrity - Regularly test backup restoration
- 📚 Train Your Team - Ensure multiple people can execute recovery
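On the RPO side, the age of your newest backup is exactly how much data you would lose if the disk died this instant. A minimal measurement sketch, assuming the /backup/daily layout used throughout this guide:
# Achieved RPO = age of the newest daily backup set, in minutes
latest=$(ls -t /backup/daily 2>/dev/null | head -1)
if [ -n "$latest" ]; then
age_min=$(( ($(date +%s) - $(stat -c %Y "/backup/daily/$latest")) / 60 ))
echo "Newest backup: $latest (${age_min} minutes old)"
else
echo "No backups found - effective RPO is unbounded!"
fi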
🏆 What You’ve Accomplished
Congratulations on mastering disaster recovery on AlmaLinux! 🎉 You’ve achieved:
- ✅ Comprehensive backup strategy with multiple layers
- ✅ Automated file system backups with retention
- ✅ Database backup automation for MySQL and PostgreSQL
- ✅ Cloud integration for offsite protection
- ✅ System image backups for complete recovery
- ✅ Recovery testing procedures to verify readiness
- ✅ Monitoring and alerting for backup health
- ✅ Rapid recovery scripts for emergency situations
- ✅ Documentation and playbooks for team knowledge
- ✅ Best practices implementation for enterprise-grade DR
🎯 Why These Skills Matter
Your disaster recovery expertise protects business continuity! 🌟 With these skills, you can:
Immediate Benefits:
- 🛡️ Protect against the vast majority of data loss scenarios
- ⚡ Recover from disasters in minutes, not days
- 💰 Save thousands in potential downtime costs
- 🔄 Ensure compliance with data protection regulations
Long-term Value:
- 🏆 Become the reliability expert organizations depend on
- 💼 Design enterprise disaster recovery strategies
- 🌍 Build resilient systems that survive any disaster
- 🚀 Enable businesses to operate fearlessly
You’re now equipped to protect organizations from their worst nightmares and turn disasters into minor inconveniences! Your backup and recovery systems are the invisible shields that keep businesses running no matter what! 🌟
Sleep well knowing your systems are bulletproof! 🙌