AlmaLinux Database Replication & Clustering: High Availability Guide
Welcome to the world of bulletproof database systems! Whether you're ensuring zero downtime, scaling for millions of users, or protecting critical data, this comprehensive guide will transform you into a database high availability expert who can build systems that never sleep!
Database downtime can cost thousands per minute, but with proper replication and clustering your databases will survive anything from hardware failures to entire datacenter outages. Let's build unbreakable database infrastructure!
Why is Database Replication & Clustering Important?
Imagine your database staying online even if half your servers explode; that's the power of clustering! Here's why mastering database HA on AlmaLinux is absolutely essential:
- Zero Downtime - Keep services running 24/7/365
- Data Protection - Multiple copies prevent data loss
- Load Balancing - Distribute reads across multiple servers
- Geographic Distribution - Serve users from nearby locations
- Horizontal Scaling - Add nodes to handle more traffic
- Automatic Failover - Instant recovery from failures
- Business Continuity - Avoid revenue loss from outages
- Disaster Recovery - Survive datacenter failures
What You Need
Let's prepare your environment for database high availability!
Hardware Requirements (Per Node):
- Minimum 3 AlmaLinux servers for true HA
- 4GB+ RAM per database node
- 50GB+ fast storage (SSD recommended)
- Gigabit network between nodes
- Static IP addresses for all nodes
Software We'll Configure:
- MariaDB with Galera Cluster
- MySQL master-slave replication
- PostgreSQL streaming replication
- HAProxy for load balancing
- Keepalived for virtual IP
Setting Up MariaDB Galera Cluster
Let's build a multi-master database cluster that's virtually indestructible!
Installing MariaDB and Galera
# On all nodes - Install MariaDB with Galera support
sudo dnf install -y mariadb-server mariadb mariadb-server-galera galera rsync
# Configure MariaDB for Galera (Node 1)
sudo tee /etc/my.cnf.d/galera.cnf << 'EOF'
[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
wsrep_on=ON
# Cluster configuration
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_name="almalinux_cluster"
wsrep_cluster_address="gcomm://192.168.1.10,192.168.1.11,192.168.1.12"
wsrep_node_address="192.168.1.10"
wsrep_node_name="galera1"
# SST method
wsrep_sst_method=rsync
# Tuning
wsrep_slave_threads=4
innodb_flush_log_at_trx_commit=0
EOF
# Configure firewall on all nodes
sudo firewall-cmd --permanent --add-service=mysql
sudo firewall-cmd --permanent --add-port=4567/tcp # Galera replication
sudo firewall-cmd --permanent --add-port=4568/tcp # IST port
sudo firewall-cmd --permanent --add-port=4444/tcp # SST port
sudo firewall-cmd --reload
# Bootstrap the cluster (Node 1 only)
sudo galera_new_cluster
# Start MariaDB on the other nodes once their galera.cnf is in place (see the next section)
# On Node 2 and 3:
sudo systemctl start mariadb
# Secure MariaDB installation
sudo mysql_secure_installation
Configuring Galera Nodes
# On Node 2 - Configure galera.cnf
sudo tee /etc/my.cnf.d/galera.cnf << 'EOF'
[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_name="almalinux_cluster"
wsrep_cluster_address="gcomm://192.168.1.10,192.168.1.11,192.168.1.12"
wsrep_node_address="192.168.1.11"
wsrep_node_name="galera2"
wsrep_sst_method=rsync
wsrep_slave_threads=4
innodb_flush_log_at_trx_commit=0
EOF
# On Node 3 - Configure galera.cnf
sudo tee /etc/my.cnf.d/galera.cnf << 'EOF'
[mysqld]
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
wsrep_on=ON
wsrep_provider=/usr/lib64/galera-4/libgalera_smm.so
wsrep_cluster_name="almalinux_cluster"
wsrep_cluster_address="gcomm://192.168.1.10,192.168.1.11,192.168.1.12"
wsrep_node_address="192.168.1.12"
wsrep_node_name="galera3"
wsrep_sst_method=rsync
wsrep_slave_threads=4
innodb_flush_log_at_trx_commit=0
EOF
# Verify cluster status
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep%';"
# Check cluster size (should be 3)
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
Testing Galera Replication
# Create test database on any node
mysql -u root -p << 'EOF'
CREATE DATABASE test_replication;
USE test_replication;
CREATE TABLE users (
id INT AUTO_INCREMENT PRIMARY KEY,
username VARCHAR(50),
email VARCHAR(100),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO users (username, email) VALUES
('alice', '[email protected]'),
('bob', '[email protected]');
EOF
# Verify on other nodes
mysql -u root -p -e "SELECT * FROM test_replication.users;"
# Create monitoring user
mysql -u root -p << 'EOF'
CREATE USER 'monitor'@'%' IDENTIFIED BY 'MonitorPass123';
GRANT USAGE, REPLICATION CLIENT ON *.* TO 'monitor'@'%';
FLUSH PRIVILEGES;
EOF
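With the monitor account in place, a quick loop confirms that every node is ready and agrees on the member count. A minimal sketch, run from any machine that can reach the nodes (addresses taken from the cluster configuration above):
# Every node should report wsrep_ready = ON and wsrep_cluster_size = 3
for node in 192.168.1.10 192.168.1.11 192.168.1.12; do
  echo "--- $node ---"
  mysql -h "$node" -u monitor -pMonitorPass123 -e \
    "SHOW STATUS WHERE Variable_name IN ('wsrep_ready','wsrep_cluster_size');"
done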
MySQL Master-Slave Replication
Let's set up traditional MySQL replication for read scaling!
Configuring Master Server
# On Master server
# Create the binary-log directory first (it does not exist by default on AlmaLinux)
sudo mkdir -p /var/log/mysql
sudo chown mysql:mysql /var/log/mysql
sudo tee /etc/my.cnf.d/master.cnf << 'EOF'
[mysqld]
server-id = 1
log_bin = /var/log/mysql/mysql-bin
binlog_format = ROW
binlog_do_db = production_db
max_binlog_size = 100M
expire_logs_days = 7
EOF
# Apply the configuration (use mysqld instead if you run MySQL Community packages)
sudo systemctl restart mariadb
# Create replication user
mysql -u root -p << 'EOF'
CREATE USER 'replicator'@'%' IDENTIFIED BY 'ReplicaPass123';
GRANT REPLICATION SLAVE ON *.* TO 'replicator'@'%';
FLUSH PRIVILEGES;
EOF
# Get master status
mysql -u root -p -e "SHOW MASTER STATUS;"
# Note the File and Position values!
# Backup database for initial sync (--single-transaction avoids locking InnoDB tables)
mysqldump -u root -p --all-databases --single-transaction --master-data > master_dump.sql
scp master_dump.sql slave-server:/tmp/
Configuring Slave Server
# On Slave server
# Create the log directory here as well
sudo mkdir -p /var/log/mysql
sudo chown mysql:mysql /var/log/mysql
sudo tee /etc/my.cnf.d/slave.cnf << 'EOF'
[mysqld]
server-id = 2
relay-log = /var/log/mysql/mysql-relay-bin
log_bin = /var/log/mysql/mysql-bin
binlog_format = ROW
read_only = 1
EOF
# Apply the configuration
sudo systemctl restart mariadb
# Restore master dump
mysql -u root -p < /tmp/master_dump.sql
# Configure slave replication
mysql -u root -p << 'EOF'
STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO
MASTER_HOST='192.168.1.10',
MASTER_USER='replicator',
MASTER_PASSWORD='ReplicaPass123',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=154;
START SLAVE;
EOF
# Check slave status
mysql -u root -p -e "SHOW SLAVE STATUS\G"
# Verify replication is working:
# Slave_IO_Running: Yes
# Slave_SQL_Running: Yes
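As a quick end-to-end check, write a row on the master and read it back on the slave. A minimal sketch, assuming production_db (the database named in binlog_do_db) already exists on both servers; repl_test is just a throwaway table name:
# On the master: create and populate a test table inside the replicated database
mysql -u root -p production_db << 'EOF'
CREATE TABLE IF NOT EXISTS repl_test (id INT PRIMARY KEY, note VARCHAR(50));
INSERT INTO repl_test VALUES (1, 'written on master');
EOF
# On the slave, a moment later, the row should be there
mysql -u root -p -e "SELECT * FROM production_db.repl_test;"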
PostgreSQL Streaming Replication
Let's set up PostgreSQL with hot standby for high availability!
Configuring Primary PostgreSQL Server
# Install PostgreSQL on all nodes
sudo dnf install -y postgresql postgresql-server postgresql-contrib
# Initialize database (primary only)
sudo postgresql-setup --initdb
# Configure primary server
sudo tee -a /var/lib/pgsql/data/postgresql.conf << 'EOF'
# Replication settings
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB             # PostgreSQL 13+; on version 12 or older use: wal_keep_segments = 64
archive_mode = on
archive_command = 'cp %p /var/lib/pgsql/archive/%f'
listen_addresses = '*'
EOF
# Create archive directory
sudo -u postgres mkdir -p /var/lib/pgsql/archive
# Configure authentication
sudo tee -a /var/lib/pgsql/data/pg_hba.conf << 'EOF'
# Replication connections
host replication replicator 192.168.1.0/24 md5
host all all 192.168.1.0/24 md5
EOF
# Open the PostgreSQL port and start the service
sudo firewall-cmd --permanent --add-service=postgresql
sudo firewall-cmd --reload
sudo systemctl enable --now postgresql
# Create replication user
sudo -u postgres psql << 'EOF'
CREATE USER replicator WITH REPLICATION LOGIN PASSWORD 'ReplicaPass123';
EOF
# Create base backup for the standby and copy it over
sudo -u postgres pg_basebackup -h localhost -D /tmp/standby -U replicator -W -v -P
sudo scp -r /tmp/standby standby-server:/tmp/
Configuring Standby PostgreSQL Server
# On standby server
# Stop PostgreSQL if running
sudo systemctl stop postgresql
# Copy base backup
sudo rm -rf /var/lib/pgsql/data/*
sudo cp -R /tmp/standby/* /var/lib/pgsql/data/
sudo chown -R postgres:postgres /var/lib/pgsql/data
# Create standby signal file
sudo -u postgres touch /var/lib/pgsql/data/standby.signal
# Configure recovery settings
sudo -u postgres tee /var/lib/pgsql/data/postgresql.auto.conf << 'EOF'
primary_conninfo = 'host=192.168.1.10 port=5432 user=replicator password=ReplicaPass123'
restore_command = 'cp /var/lib/pgsql/archive/%f %p'
EOF
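Optionally, a physical replication slot stops the primary from recycling WAL segments the standby has not replayed yet, which is more robust than relying on wal_keep_size alone. A minimal sketch, using an assumed slot name of standby1:
# On the primary: create a physical replication slot
sudo -u postgres psql -c "SELECT pg_create_physical_replication_slot('standby1');"
# On the standby: point it at the slot
sudo -u postgres tee -a /var/lib/pgsql/data/postgresql.auto.conf << 'EOF'
primary_slot_name = 'standby1'
EOF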
# Start standby server
sudo systemctl start postgresql
# Verify replication status on primary
sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
# Check standby status
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
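If the primary is ever lost, the standby can be promoted to take over writes. A minimal sketch of a manual promotion (PostgreSQL 12 or later):
# On the standby: promote it to a full read-write primary
sudo -u postgres psql -c "SELECT pg_promote();"
# Verify it has left recovery mode (should now return 'f')
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
Remember to repoint applications (or HAProxy) at the promoted node, and rebuild the old primary as a standby before bringing it back into service.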
Load Balancing with HAProxy
Let's distribute database connections across multiple nodes!
Installing and Configuring HAProxy
# Install HAProxy (and socat, used later to query its admin socket)
sudo dnf install -y haproxy socat
# Configure HAProxy for database load balancing
sudo tee /etc/haproxy/haproxy.cfg << 'EOF'
global
    log         127.0.0.1 local0
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # Runtime admin socket (used by the monitoring commands later in this guide)
    stats socket /var/lib/haproxy/admin.sock mode 660 level admin

defaults
    mode                    tcp
    log                     global
    option                  tcplog
    option                  dontlognull
    option                  redispatch
    retries                 3
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout check           10s
    maxconn                 3000

# Statistics page (served over HTTP on its own port)
listen stats
    bind *:8404
    mode http
    stats enable
    stats uri /stats
    stats realm HAProxy\ Statistics
    stats auth admin:admin123
# MySQL Load Balancing
listen mysql-cluster
    bind *:3306
    mode tcp
    # mysql-check logs in without a password, so it needs a dedicated
    # passwordless account on the database nodes (created below)
    option mysql-check user haproxy_check
    balance roundrobin
    server galera1 192.168.1.10:3306 check
    server galera2 192.168.1.11:3306 check
    server galera3 192.168.1.12:3306 check

# PostgreSQL Load Balancing
listen postgresql-cluster
    bind *:5432
    mode tcp
    option pgsql-check user replicator
    balance roundrobin
    server pg-primary  192.168.1.20:5432 check
    server pg-standby1 192.168.1.21:5432 check backup
    server pg-standby2 192.168.1.22:5432 check backup
EOF
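HAProxy's mysql-check logs in without sending a password, so the account it uses must exist with an empty password on each Galera node (the monitor account created earlier has a password and would fail the check). A minimal sketch, assuming the HAProxy hosts sit on 192.168.1.0/24; run it once on any node and Galera replicates it to the rest:
mysql -u root -p << 'EOF'
CREATE USER 'haproxy_check'@'192.168.1.%';
FLUSH PRIVILEGES;
EOF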
# Open the balanced ports and the stats page in the HAProxy host's firewall
sudo firewall-cmd --permanent --add-service=mysql --add-service=postgresql --add-port=8404/tcp
sudo firewall-cmd --reload
# Start HAProxy
sudo systemctl enable --now haproxy
# Check HAProxy statistics
# Browse to: http://your-server-ip:8404/stats
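To confirm connections really are being spread across the cluster, ask a handful of proxied connections which backend answered. A minimal sketch, using the monitor account from earlier (replace your-haproxy-ip with the load balancer's address):
# Each new connection should land on a different Galera node in turn
for i in {1..6}; do
  mysql -h your-haproxy-ip -u monitor -pMonitorPass123 -N -e "SELECT @@hostname;"
done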
Virtual IP with Keepalived
# Install Keepalived on HAProxy servers
sudo dnf install -y keepalived
# Configure Keepalived (Master)
sudo tee /etc/keepalived/keepalived.conf << 'EOF'
vrrp_instance VI_1 {
    state MASTER
    interface eth0              # adjust to your interface name (e.g. ens192)
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass SecretPass
    }
    virtual_ipaddress {
        192.168.1.100/24
    }
}
EOF
# Configure Keepalived (Backup)
sudo tee /etc/keepalived/keepalived.conf << 'EOF'
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass SecretPass
    }
    virtual_ipaddress {
        192.168.1.100/24
    }
}
EOF
# Allow VRRP traffic between the load balancers
sudo firewall-cmd --permanent --add-protocol=vrrp
sudo firewall-cmd --reload
# Start Keepalived
sudo systemctl enable --now keepalived
# Verify the virtual IP (it appears on the current MASTER)
ip addr show | grep 192.168.1.100
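As configured, Keepalived only moves the virtual IP when a whole server drops off the network; it will happily keep the VIP on a host whose HAProxy process has died. A common fix is a vrrp_script health check; a minimal sketch to add on both load balancers (chk_haproxy is an assumed name):
# Add to /etc/keepalived/keepalived.conf, then reference it from vrrp_instance VI_1
# with:  track_script { chk_haproxy }
vrrp_script chk_haproxy {
    script "/usr/sbin/pidof haproxy"   # exits non-zero when HAProxy is not running
    interval 2
    weight -20                         # failing check drops MASTER priority (100) below BACKUP (90)
}
Restart keepalived on both nodes afterwards; when HAProxy dies on the MASTER, its effective priority falls to 80 and the BACKUP takes over the virtual IP.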
Quick Examples
Example 1: Automatic Failover Testing
# Create failover test script
cat > /usr/local/bin/test-failover.sh << 'EOF'
#!/bin/bash
echo "Testing Database Failover..."
# Test Galera failover
echo "Stopping one Galera node..."
ssh galera1 "sudo systemctl stop mariadb"
sleep 5
echo "Testing cluster availability..."
mysql -h 192.168.1.100 -u monitor -pMonitorPass123 -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
echo "Restarting stopped node..."
ssh galera1 "sudo systemctl start mariadb"
sleep 10
echo "Verifying cluster recovery..."
mysql -h 192.168.1.100 -u monitor -pMonitorPass123 -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
echo "Failover test complete!"
EOF
chmod +x /usr/local/bin/test-failover.sh
Example 2: Database Performance Monitoring
# Create monitoring script
cat > /usr/local/bin/monitor-databases.sh << 'EOF'
#!/bin/bash
echo "=== Database Cluster Status ==="
echo "Timestamp: $(date)"
echo ""
echo "=== Galera Cluster Status ==="
mysql -h 192.168.1.10 -u monitor -pMonitorPass123 -e "
SELECT
VARIABLE_NAME,
VARIABLE_VALUE
FROM information_schema.GLOBAL_STATUS
WHERE VARIABLE_NAME IN (
'wsrep_cluster_size',
'wsrep_cluster_status',
'wsrep_connected',
'wsrep_ready',
'wsrep_local_state_comment'
);"
echo -e "\n=== PostgreSQL Replication Status ==="
sudo -u postgres psql -h 192.168.1.20 -c "
SELECT
client_addr,
state,
sync_state,
replay_lag
FROM pg_stat_replication;"
echo -e "\n=== HAProxy Statistics ==="
echo "show stat" | sudo socat stdio /var/lib/haproxy/admin.sock | \
cut -d',' -f1,2,18,19 | head -10
echo -e "\n=== Connection Statistics ==="
mysql -h 192.168.1.100 -u monitor -pMonitorPass123 -e "
SHOW STATUS WHERE Variable_name IN (
'Threads_connected',
'Connections',
'Aborted_connects',
'Max_used_connections'
);"
EOF
chmod +x /usr/local/bin/monitor-databases.sh
# Add a cron job for monitoring (append so existing entries are preserved)
(sudo crontab -l 2>/dev/null; echo "*/5 * * * * /usr/local/bin/monitor-databases.sh >> /var/log/db-monitor.log") | sudo crontab -
Example 3: Backup Strategy for Clustered Databases
# Create backup script for Galera cluster
cat > /usr/local/bin/backup-galera.sh << 'EOF'
#!/bin/bash
BACKUP_DIR="/backup/mysql/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"
# Run this script ON the Galera node you want to back up - mariabackup copies
# the local data directory, so it cannot back up a remote server.
# (Ideally pick a lightly loaded node; compare wsrep_local_recv_queue_avg across nodes first.)
echo "Backing up local Galera node..."
# Perform backup using mariabackup (from the mariadb-backup package)
mariabackup --backup \
    --target-dir="$BACKUP_DIR" \
    --user=root \
    --password=YourRootPassword
# Compress backup
tar czf "$BACKUP_DIR.tar.gz" -C "$BACKUP_DIR" .
rm -rf "$BACKUP_DIR"
echo "Backup completed: $BACKUP_DIR.tar.gz"
# Rotate old backups (keep last 7 days)
find /backup/mysql -name "*.tar.gz" -mtime +7 -delete
EOF
chmod +x /usr/local/bin/backup-galera.sh
# Schedule daily backups (append so existing entries are preserved)
(sudo crontab -l 2>/dev/null; echo "0 2 * * * /usr/local/bin/backup-galera.sh") | sudo crontab -
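A backup is only useful if it can be restored, so keep a tested restore procedure next to it. A minimal sketch with mariabackup, assuming the node being rebuilt can have its data directory replaced (substitute the actual archive name):
# Restore a compressed backup onto a stopped node
sudo systemctl stop mariadb
sudo mkdir -p /tmp/restore
sudo tar xzf /backup/mysql/<backup-timestamp>.tar.gz -C /tmp/restore
sudo mariabackup --prepare --target-dir=/tmp/restore      # apply the redo log
sudo rm -rf /var/lib/mysql/*
sudo mariabackup --copy-back --target-dir=/tmp/restore
sudo chown -R mysql:mysql /var/lib/mysql
sudo systemctl start mariadb                              # the node catches up via IST/SST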
Fix Common Clustering Problems
Let's solve the most frequent database clustering issues!
Problem 1: Split Brain in Galera Cluster
Symptoms: Nodes can't agree on cluster state
Solution:
# Check cluster status on all nodes
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_cluster_status';"
# If split-brain detected, bootstrap from most advanced node
# Find most advanced node
mysql -u root -p -e "SHOW STATUS LIKE 'wsrep_last_committed';"
# Stop MariaDB on all nodes
sudo systemctl stop mariadb
# On the most advanced node, mark it safe to bootstrap
sudo sed -i 's/safe_to_bootstrap: 0/safe_to_bootstrap: 1/' /var/lib/mysql/grastate.dat
# Bootstrap from that node
sudo galera_new_cluster
# Start the other nodes normally
sudo systemctl start mariadb
Problem 2: Replication Lag
Symptoms: Slave servers falling behind the master
Solution:
# Check replication lag
mysql -u root -p -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master
# Optimize slave performance with MariaDB parallel replication
# (the slave threads must be stopped while these are changed)
mysql -u root -p << 'EOF'
STOP SLAVE;
SET GLOBAL slave_parallel_threads = 4;
SET GLOBAL slave_parallel_mode = 'optimistic';
SET GLOBAL slave_domain_parallel_threads = 2;
START SLAVE;
EOF
# Skip problematic transaction if needed
mysql -u root -p << 'EOF'
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
EOF
Problem 3: Connection Refused Through HAProxy
Symptoms: Can't connect to the database through the load balancer
Solution:
# Check HAProxy status
sudo systemctl status haproxy
# Verify backend server status
echo "show servers state" | sudo socat stdio /var/lib/haproxy/admin.sock
# Check firewall rules
sudo firewall-cmd --list-all
# Test direct connection to backends
mysql -h 192.168.1.10 -u monitor -pMonitorPass123 -e "SELECT 1;"
# Restart HAProxy
sudo systemctl restart haproxy
Problem 4: Node Won't Join Cluster
Symptoms: New node can't join the existing cluster
Solution:
# Check network connectivity
ping -c 3 other-cluster-nodes
# Verify cluster address configuration
grep wsrep_cluster_address /etc/my.cnf.d/galera.cnf
# Check for port blocks
nc -zv 192.168.1.10 4567
# Force a full SST (State Snapshot Transfer) by clearing the node's local state
sudo systemctl stop mariadb
sudo rm -rf /var/lib/mysql/*
sudo systemctl start mariadb   # the empty node requests a full SST from a donor
Database Clustering Commands Summary
Essential clustering commands at your fingertips!
Command | Purpose
---|---
galera_new_cluster | Bootstrap Galera cluster
SHOW STATUS LIKE 'wsrep%' | Check Galera status
SHOW SLAVE STATUS\G | Check MySQL replication
pg_stat_replication | PostgreSQL replication status
mariabackup --backup | Back up a Galera node
CHANGE MASTER TO | Configure MySQL slave
pg_basebackup | PostgreSQL base backup
show stat (HAProxy admin socket) | Load balancer statistics
Best Practices for Database HA
Master these tips for bulletproof database clusters!
- Odd Number of Nodes - Use 3, 5, or 7 nodes to avoid split-brain
- Network Quality - Low latency is crucial for synchronous replication
- Monitor Everything - Track replication lag, cluster status, and performance
- Regular Backups - Even with replication, backups are essential
- Test Failover - Regularly test automatic failover procedures
- Document Procedures - Create runbooks for common issues
- Tune Performance - Optimize for your specific workload
- Security First - Encrypt replication traffic
- Load Balance Reads - Distribute read queries across replicas (see the sketch after this list)
- Automate Recovery - Script common recovery procedures
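For the master-slave setup described earlier, read/write splitting can be done with two HAProxy listeners: one port that only ever reaches the master, and one that round-robins across replicas. A minimal sketch with illustrative ports and addresses (it assumes the same passwordless haproxy_check account exists on these servers):
# Writes: applications point their write connections at port 3307
listen mysql-write
    bind *:3307
    mode tcp
    option mysql-check user haproxy_check
    server master1 192.168.1.10:3306 check

# Reads: applications point their read connections at port 3308
listen mysql-read
    bind *:3308
    mode tcp
    option mysql-check user haproxy_check
    balance roundrobin
    server slave1  192.168.1.11:3306 check
    server master1 192.168.1.10:3306 check backup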
What You've Accomplished
Congratulations on mastering database high availability! You've achieved:
- Galera Cluster deployment with multi-master replication
- MySQL master-slave replication configured
- PostgreSQL streaming replication established
- HAProxy load balancing for connection distribution
- Keepalived virtual IP for automatic failover
- Monitoring and alerting systems implemented
- Backup strategies for clustered databases
- Performance optimization techniques applied
- Troubleshooting skills for common issues
- Disaster recovery procedures documented
Why These Skills Matter
Your database HA expertise ensures business continuity! With these skills, you can:
Immediate Benefits:
- Achieve 99.99% uptime for critical databases
- Scale read performance by 10x or more
- Survive hardware failures without data loss
- Save thousands per hour of prevented downtime
Long-term Value:
- Become the database reliability expert
- Design enterprise-grade database architectures
- Build globally distributed database systems
- Enable business growth without limits
You're now equipped to build database systems that never go down, scale without limits, and recover automatically from disasters. Your databases are now as reliable as the sunrise!
Keep clustering, keep scaling!