๐พ Database Replication with MariaDB on AlmaLinux: Never Lose Data
Lost your database once? Never again! ๐ฑ I learned about replication the hard way when our main database crashed and we lost 3 days of customer orders. $30,000 gone! Thatโs when I discovered MariaDB replication - now our data is copied to multiple servers in real-time. One crashes? No problem, we keep running! Today Iโm showing you how to set up bulletproof database replication on AlmaLinux. Your data will be safer than Fort Knox! ๐
๐ค Why Database Replication is Essential
Running a single database is like walking a tightrope without a net! Hereโs why you need replication:
- ๐ Zero Data Loss - Every change copied instantly
- โก Load Balancing - Spread reads across servers
- ๐ High Availability - Automatic failover
- ๐พ Live Backups - Real-time backup servers
- ๐ Geographic Distribution - Servers worldwide
- ๐ Scale Reads - Unlimited read replicas
True story: Our e-commerce site handles 10,000 orders/day. With replication, we survived a complete datacenter outage with ZERO downtime! ๐ช
๐ฏ What You Need
Before we replicate everything, ensure you have:
- โ 2+ AlmaLinux servers (master and slave)
- โ MariaDB installed on all servers
- โ Network connectivity between servers
- โ Root or sudo access
- โ 45 minutes to master replication
- โ Coffee (database work needs focus! โ)
๐ Step 1: Install and Configure MariaDB
Letโs set up MariaDB on all servers!
Install MariaDB
# On all servers
sudo dnf install -y mariadb-server mariadb
# Enable and start MariaDB
sudo systemctl enable --now mariadb
# Secure installation
sudo mysql_secure_installation
# Set root password: yes
# Remove anonymous users: yes
# Disallow root remote login: yes (we'll create replication user)
# Remove test database: yes
# Reload privileges: yes
# Check version
mysql --version
# Test connection
mysql -u root -p
Configure Master Server
# Edit MariaDB config on MASTER
sudo nano /etc/my.cnf.d/mariadb-server.cnf
[mysqld]
# Server ID (unique for each server)
server-id = 1
# Binary logging (required for replication)
log-bin = mysql-bin
binlog-format = mixed
max_binlog_size = 100M
expire_logs_days = 7
# Database to replicate (optional)
binlog-do-db = production
# binlog-ignore-db = test
# Performance tuning
innodb_buffer_pool_size = 1G
innodb_log_file_size = 256M
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
# Network settings
bind-address = 0.0.0.0
max_connections = 500
# Restart MariaDB
sudo systemctl restart mariadb
Configure Slave Server
# Edit MariaDB config on SLAVE
sudo nano /etc/my.cnf.d/mariadb-server.cnf
[mysqld]
# Server ID (must be different from master)
server-id = 2
# Binary logging (for slave to become master)
log-bin = mysql-bin
binlog-format = mixed
# Relay log
relay-log = relay-bin
relay-log-index = relay-bin.index
# Read-only (optional but recommended)
read_only = 1
# Same performance settings
innodb_buffer_pool_size = 1G
innodb_log_file_size = 256M
bind-address = 0.0.0.0
# Restart MariaDB
sudo systemctl restart mariadb
๐ง Step 2: Set Up Master-Slave Replication
Time to connect master and slave!
Create Replication User on Master
# On MASTER server
mysql -u root -p
-- Create replication user
CREATE USER 'repl_user'@'%' IDENTIFIED BY 'SecureReplicationPass123!';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
FLUSH PRIVILEGES;
-- Check master status (IMPORTANT - note these values!)
SHOW MASTER STATUS;
+------------------+----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000001 | 328 | production | |
+------------------+----------+--------------+------------------+
-- Lock tables for consistent snapshot (optional)
FLUSH TABLES WITH READ LOCK;
-- Keep this session open or note the values!
Export Data from Master (if existing data)
# In another terminal on MASTER
# Export database
mysqldump -u root -p --all-databases --master-data=2 --single-transaction > master_dump.sql
# Or specific database
mysqldump -u root -p production --master-data=2 --single-transaction > production_dump.sql
# Copy to slave
scp master_dump.sql slave-server:/tmp/
# Back in MySQL, unlock tables
mysql -u root -p
UNLOCK TABLES;
Configure Slave
# On SLAVE server
# Import data if needed
mysql -u root -p < /tmp/master_dump.sql
# Connect to MySQL
mysql -u root -p
-- Stop slave if running
STOP SLAVE;
-- Configure slave to connect to master
CHANGE MASTER TO
MASTER_HOST='master-server-ip',
MASTER_USER='repl_user',
MASTER_PASSWORD='SecureReplicationPass123!',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=328;
-- Start slave
START SLAVE;
-- Check slave status
SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: master-server-ip
Master_User: repl_user
Master_Port: 3306
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Last_Errno: 0
Seconds_Behind_Master: 0
-- Both Slave_IO_Running and Slave_SQL_Running should be "Yes"
Test Replication
# On MASTER
mysql -u root -p
CREATE DATABASE test_repl;
USE test_repl;
CREATE TABLE test (id INT PRIMARY KEY, data VARCHAR(100));
INSERT INTO test VALUES (1, 'Hello from Master!');
# On SLAVE
mysql -u root -p
SHOW DATABASES; -- Should see test_repl
USE test_repl;
SELECT * FROM test; -- Should see the data!
๐ Step 3: Advanced Replication Setups
Letโs explore more complex configurations!
Master-Master Replication
# Configure Server 1
sudo nano /etc/my.cnf.d/mariadb-server.cnf
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = mixed
relay-log = relay-bin
# Auto-increment for multi-master
auto_increment_increment = 2
auto_increment_offset = 1
# Restart
sudo systemctl restart mariadb
# Configure Server 2
sudo nano /etc/my.cnf.d/mariadb-server.cnf
[mysqld]
server-id = 2
log-bin = mysql-bin
binlog-format = mixed
relay-log = relay-bin
# Auto-increment for multi-master
auto_increment_increment = 2
auto_increment_offset = 2
# Restart
sudo systemctl restart mariadb
# On Server 1
mysql -u root -p
CREATE USER 'repl_user'@'%' IDENTIFIED BY 'Password123!';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
SHOW MASTER STATUS; -- Note values
# On Server 2
mysql -u root -p
CREATE USER 'repl_user'@'%' IDENTIFIED BY 'Password123!';
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'%';
CHANGE MASTER TO
MASTER_HOST='server1-ip',
MASTER_USER='repl_user',
MASTER_PASSWORD='Password123!',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=328;
START SLAVE;
SHOW MASTER STATUS; -- Note values for Server 1
# Back on Server 1
mysql -u root -p
CHANGE MASTER TO
MASTER_HOST='server2-ip',
MASTER_USER='repl_user',
MASTER_PASSWORD='Password123!',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=328;
START SLAVE;
Delayed Replication (for recovery)
# On SLAVE - delay by 1 hour
mysql -u root -p
STOP SLAVE;
CHANGE MASTER TO MASTER_DELAY = 3600;
START SLAVE;
SHOW SLAVE STATUS\G
-- Look for: SQL_Delay: 3600
Multi-Source Replication
# On slave receiving from multiple masters
mysql -u root -p
-- Configure first master
CHANGE MASTER 'master1' TO
MASTER_HOST='master1-ip',
MASTER_USER='repl_user',
MASTER_PASSWORD='Password123!',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=328;
-- Configure second master
CHANGE MASTER 'master2' TO
MASTER_HOST='master2-ip',
MASTER_USER='repl_user',
MASTER_PASSWORD='Password123!',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=328;
-- Start all slaves
START ALL SLAVES;
-- Check status
SHOW ALL SLAVES STATUS\G
โ Step 4: Monitoring and Failover
Keep your replication healthy!
Monitoring Script
#!/bin/bash
# Monitor replication health
cat > /usr/local/bin/check-replication.sh << 'EOF'
#!/bin/bash
MYSQL_USER="root"
MYSQL_PASS="YourPassword"
ALERT_EMAIL="[email protected]"
check_slave_status() {
# Get slave status
STATUS=$(mysql -u$MYSQL_USER -p$MYSQL_PASS -e "SHOW SLAVE STATUS\G" 2>/dev/null)
if [ -z "$STATUS" ]; then
echo "โ Not a slave server or cannot connect"
return 1
fi
# Check IO thread
IO_RUNNING=$(echo "$STATUS" | grep "Slave_IO_Running:" | awk '{print $2}')
SQL_RUNNING=$(echo "$STATUS" | grep "Slave_SQL_Running:" | awk '{print $2}')
SECONDS_BEHIND=$(echo "$STATUS" | grep "Seconds_Behind_Master:" | awk '{print $2}')
LAST_ERROR=$(echo "$STATUS" | grep "Last_Error:" | cut -d: -f2-)
echo "๐ Replication Status Report"
echo "============================"
echo "IO Thread: $IO_RUNNING"
echo "SQL Thread: $SQL_RUNNING"
echo "Seconds Behind: $SECONDS_BEHIND"
# Check for problems
if [ "$IO_RUNNING" != "Yes" ] || [ "$SQL_RUNNING" != "Yes" ]; then
echo "โ ๏ธ ALERT: Replication is broken!"
echo "Last Error: $LAST_ERROR"
# Send alert
echo "Replication broken on $(hostname)" | mail -s "Database Alert" $ALERT_EMAIL
return 1
fi
# Check lag
if [ "$SECONDS_BEHIND" != "NULL" ] && [ "$SECONDS_BEHIND" -gt 60 ]; then
echo "โ ๏ธ WARNING: Replication lag is ${SECONDS_BEHIND} seconds"
else
echo "โ
Replication is healthy"
fi
}
check_master_status() {
# Check if binary logging is enabled
BIN_LOG=$(mysql -u$MYSQL_USER -p$MYSQL_PASS -e "SHOW VARIABLES LIKE 'log_bin'" 2>/dev/null | grep ON)
if [ -n "$BIN_LOG" ]; then
echo "โ
Binary logging is enabled (Master capable)"
# Show connected slaves
SLAVES=$(mysql -u$MYSQL_USER -p$MYSQL_PASS -e "SHOW SLAVE HOSTS" 2>/dev/null)
echo "$SLAVES"
fi
}
# Run checks
check_slave_status
check_master_status
EOF
chmod +x /usr/local/bin/check-replication.sh
# Add to crontab
echo "*/5 * * * * /usr/local/bin/check-replication.sh" | crontab -
Automatic Failover Script
#!/bin/bash
# Automatic failover script
cat > /usr/local/bin/failover.sh << 'EOF'
#!/bin/bash
MASTER_IP="192.168.1.10"
SLAVE_IP="192.168.1.11"
VIP="192.168.1.100" # Virtual IP
promote_slave_to_master() {
echo "๐ Promoting slave to master..."
# Stop slave
mysql -u root -p$MYSQL_PASS -e "STOP SLAVE"
# Reset slave config
mysql -u root -p$MYSQL_PASS -e "RESET SLAVE ALL"
# Make writable
mysql -u root -p$MYSQL_PASS -e "SET GLOBAL read_only = 0"
# Take over VIP
ip addr add $VIP/24 dev eth0
arping -c 3 -A $VIP -I eth0
echo "โ
Slave promoted to master"
}
check_master_health() {
# Try to connect to master
if ! mysql -h $MASTER_IP -u monitor -pMonitorPass -e "SELECT 1" &>/dev/null; then
echo "โ Master is down!"
promote_slave_to_master
else
echo "โ
Master is healthy"
fi
}
# Run check
check_master_health
EOF
chmod +x /usr/local/bin/failover.sh
๐ฎ Quick Examples
Example 1: Load Balancing Reads ๐
#!/bin/bash
# HAProxy configuration for read load balancing
# Install HAProxy
sudo dnf install -y haproxy
# Configure HAProxy
cat > /etc/haproxy/haproxy.cfg << 'EOF'
global
maxconn 4096
log 127.0.0.1 local0
defaults
mode tcp
timeout connect 5000ms
timeout client 50000ms
timeout server 50000ms
# Write queries to master
listen mysql-write
bind *:3306
mode tcp
option mysql-check user haproxy_check
server master 192.168.1.10:3306 check
# Read queries to slaves
listen mysql-read
bind *:3307
mode tcp
balance roundrobin
option mysql-check user haproxy_check
server slave1 192.168.1.11:3306 check
server slave2 192.168.1.12:3306 check
server slave3 192.168.1.13:3306 check
# Stats page
listen stats
bind *:8080
stats enable
stats uri /stats
stats refresh 10s
EOF
# Create check user on all MySQL servers
mysql -u root -p << EOF
CREATE USER 'haproxy_check'@'%';
FLUSH PRIVILEGES;
EOF
# Start HAProxy
sudo systemctl enable --now haproxy
echo "โ
Load balancer configured!"
echo "๐ Writes: port 3306 (master only)"
echo "๐ Reads: port 3307 (load balanced)"
echo "๐ Stats: http://server:8080/stats"
Example 2: Backup Strategy with Replication ๐พ
#!/bin/bash
# Intelligent backup using replication
backup_from_slave() {
BACKUP_DIR="/backup/mysql/$(date +%Y%m%d)"
SLAVE_HOST="slave.example.com"
echo "๐พ Starting backup from slave..."
# Stop replication temporarily
mysql -h $SLAVE_HOST -u root -p$PASS -e "STOP SLAVE SQL_THREAD"
# Get slave position
SLAVE_STATUS=$(mysql -h $SLAVE_HOST -u root -p$PASS -e "SHOW SLAVE STATUS\G")
echo "$SLAVE_STATUS" > $BACKUP_DIR/slave_status.txt
# Perform backup
mkdir -p $BACKUP_DIR
# Full backup with binary log position
mysqldump -h $SLAVE_HOST -u root -p$PASS \
--all-databases \
--single-transaction \
--master-data=2 \
--events \
--routines \
--triggers \
| gzip > $BACKUP_DIR/full_backup.sql.gz
# Resume replication
mysql -h $SLAVE_HOST -u root -p$PASS -e "START SLAVE SQL_THREAD"
# Verify backup
if [ -f "$BACKUP_DIR/full_backup.sql.gz" ]; then
SIZE=$(ls -lh $BACKUP_DIR/full_backup.sql.gz | awk '{print $5}')
echo "โ
Backup complete: $SIZE"
# Test restore to another instance
echo "๐งช Testing backup..."
gunzip < $BACKUP_DIR/full_backup.sql.gz | \
mysql -h test-server -u root -p$PASS
if [ $? -eq 0 ]; then
echo "โ
Backup verified successfully"
else
echo "โ Backup verification failed!"
fi
fi
# Cleanup old backups (keep 30 days)
find /backup/mysql -type d -mtime +30 -exec rm -rf {} \;
}
# Incremental backup using binary logs
incremental_backup() {
BINLOG_DIR="/var/lib/mysql"
BACKUP_DIR="/backup/mysql/incremental/$(date +%Y%m%d-%H%M%S)"
mkdir -p $BACKUP_DIR
# Copy binary logs
cp $BINLOG_DIR/mysql-bin.[0-9]* $BACKUP_DIR/
# Flush logs to start new binlog
mysql -u root -p$PASS -e "FLUSH LOGS"
echo "โ
Incremental backup saved to $BACKUP_DIR"
}
# Run backups
backup_from_slave
incremental_backup
Example 3: Replication Lag Monitor ๐
#!/bin/bash
# Real-time replication lag monitoring
cat > /usr/local/bin/lag-monitor.py << 'EOF'
#!/usr/bin/env python3
import pymysql
import time
import matplotlib.pyplot as plt
from datetime import datetime
import numpy as np
class ReplicationMonitor:
def __init__(self, slaves):
self.slaves = slaves
self.lag_history = {slave: [] for slave in slaves}
self.timestamps = []
def check_lag(self, host, user, password):
"""Check replication lag for a slave"""
try:
conn = pymysql.connect(host=host, user=user, password=password)
cursor = conn.cursor()
cursor.execute("SHOW SLAVE STATUS")
result = cursor.fetchone()
if result:
# Column 32 is Seconds_Behind_Master
lag = result[32]
return lag if lag is not None else 0
return None
except Exception as e:
print(f"Error checking {host}: {e}")
return None
finally:
if conn:
conn.close()
def monitor_loop(self):
"""Main monitoring loop"""
plt.ion()
fig, ax = plt.subplots()
while True:
current_time = datetime.now()
self.timestamps.append(current_time)
# Check each slave
for slave in self.slaves:
lag = self.check_lag(slave['host'], slave['user'], slave['pass'])
self.lag_history[slave['host']].append(lag if lag else 0)
# Alert if lag is high
if lag and lag > 60:
print(f"โ ๏ธ High lag on {slave['host']}: {lag} seconds")
self.send_alert(slave['host'], lag)
# Update plot
ax.clear()
for slave in self.slaves:
ax.plot(self.lag_history[slave['host']][-100:],
label=slave['host'])
ax.set_ylabel('Lag (seconds)')
ax.set_xlabel('Time')
ax.set_title('Replication Lag Monitor')
ax.legend()
ax.grid(True)
plt.pause(1)
time.sleep(5)
def send_alert(self, host, lag):
"""Send alert for high lag"""
# Implement your alerting here
pass
# Configure slaves to monitor
slaves = [
{'host': '192.168.1.11', 'user': 'monitor', 'pass': 'password'},
{'host': '192.168.1.12', 'user': 'monitor', 'pass': 'password'},
{'host': '192.168.1.13', 'user': 'monitor', 'pass': 'password'},
]
# Start monitoring
monitor = ReplicationMonitor(slaves)
monitor.monitor_loop()
EOF
# Install dependencies
pip3 install pymysql matplotlib
# Run monitor
python3 /usr/local/bin/lag-monitor.py
๐จ Fix Common Problems
Problem 1: Replication Broken โ
Slave not replicating?
# Check slave status
SHOW SLAVE STATUS\G
# Common fixes:
# Skip one error
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
START SLAVE;
# Reset slave completely
STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO ... ; # Re-configure
START SLAVE;
# Fix duplicate key errors
# On slave:
SET GLOBAL slave_skip_errors = 1062;
Problem 2: Replication Lag โ
Slave falling behind?
# Check what's causing lag
SHOW PROCESSLIST;
# Optimize slow queries
# On master:
SET GLOBAL binlog_format = 'ROW';
# Parallel replication (MariaDB 10.0+)
# On slave:
SET GLOBAL slave_parallel_threads = 4;
SET GLOBAL slave_parallel_mode = 'optimistic';
# Skip large transactions
SET GLOBAL slave_skip_errors = 'all'; # Dangerous!
Problem 3: Binary Logs Filling Disk โ
Out of disk space?
# Check binary log usage
SHOW BINARY LOGS;
# Purge old logs
PURGE BINARY LOGS BEFORE '2024-01-01 00:00:00';
PURGE BINARY LOGS TO 'mysql-bin.000100';
# Set automatic expiry
SET GLOBAL expire_logs_days = 7;
# In config file
expire_logs_days = 7
max_binlog_size = 100M
Problem 4: Canโt Connect to Master โ
Network or permission issues?
# Test connection from slave
mysql -h master-ip -u repl_user -p
# Check firewall on master
sudo firewall-cmd --add-service=mysql --permanent
sudo firewall-cmd --reload
# Check user permissions on master
SELECT User, Host FROM mysql.user WHERE User='repl_user';
SHOW GRANTS FOR 'repl_user'@'%';
# Fix permissions
GRANT REPLICATION SLAVE ON *.* TO 'repl_user'@'slave-ip';
๐ Simple Commands Summary
Task | Command |
---|---|
๐ Check slave status | SHOW SLAVE STATUS\G |
โถ๏ธ Start slave | START SLAVE |
โน๏ธ Stop slave | STOP SLAVE |
๐ Show master position | SHOW MASTER STATUS |
๐ Reset slave | RESET SLAVE |
โญ๏ธ Skip error | SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1 |
๐ Show binary logs | SHOW BINARY LOGS |
๐๏ธ Purge old logs | PURGE BINARY LOGS BEFORE 'date' |
๐ก Tips for Success
- Monitor Constantly ๐ - Lag kills applications
- Test Failover ๐ - Practice before you need it
- Backup Slaves ๐พ - Never backup from master
- Document Everything ๐ - Replication config is complex
- Use Monitoring Tools ๐ - Percona Toolkit helps
- Plan Maintenance ๐ง - Slaves for upgrades first
Pro tip: Always have one more slave than you think you need. When one fails (and it will), youโll still have redundancy! ๐ฏ
๐ What You Learned
Youโre now a replication master! You can:
- โ Set up master-slave replication
- โ Configure master-master replication
- โ Monitor replication health
- โ Handle automatic failover
- โ Implement backup strategies
- โ Troubleshoot replication issues
- โ Scale database reads
๐ฏ Why This Matters
Database replication provides:
- ๐ก๏ธ Data protection
- โก Read scalability
- ๐ Zero-downtime maintenance
- ๐ Geographic distribution
- ๐พ Real-time backups
- ๐ช Business continuity
Last week, our primary database server literally caught fire (faulty PSU). The slave took over automatically. We replaced the hardware, resynced, and nobody even noticed. Thatโs the power of replication! ๐ฅโก๏ธโ
Remember: Your data is your business. One database is none, two is one, three is safe! ๐พ
Happy replicating! May your data be safe and your slaves stay in sync! ๐โจ