🌐 Network Bonding and Teaming in AlmaLinux: High-Speed Redundancy
Ever had your network go down and lost everything? 😱 Or wished you could combine two network cards for double the speed? That's where network bonding comes in! I learned this the hard way when a single network cable failure took down our entire web service for 3 hours. Never again! Today I'm showing you how to set up network bonding and teaming in AlmaLinux - combine multiple network interfaces for redundancy AND performance. Let's make your network bulletproof! 🛡️
🤔 Why Network Bonding Matters
Bonding isn't just for backup - it's about performance too! Here's why you need it:
- 🔄 Automatic Failover - One link dies? Traffic continues!
- ⚡ Increased Bandwidth - Combine multiple NICs for speed
- 🎯 Load Balancing - Distribute traffic across links
- 💰 Cost Effective - Cheaper than 10Gb Ethernet
- 🏢 Enterprise Ready - Standard in data centers
- 🔁 Zero Downtime - Maintenance without interruption
True story: We bonded four 1Gb NICs and got nearly 4Gb throughput for a fraction of the cost of 10Gb equipment! 💪
🎯 What You Need
Before we start bonding networks, ensure you have:
- ✅ AlmaLinux system with 2+ network interfaces
- ✅ Root or sudo access
- ✅ Switch that supports LACP (for some modes)
- ✅ 20 minutes to bulletproof your network
- ✅ Cables to test failover (unplug them! 😄)
📋 Step 1: Understanding Bonding vs Teaming
Let's clarify the options first!
Network Bonding (Traditional)
# Check if bonding module is loaded
lsmod | grep bonding
# Load bonding module
sudo modprobe bonding
# Make permanent
echo "bonding" | sudo tee /etc/modules-load.d/bonding.conf
# View bonding info
cat /sys/class/net/bond0/bonding/mode
cat /proc/net/bonding/bond0
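The /proc/net/bonding/bond0 file is plain text, so its fields are easy to pull out programmatically. Here's a small Python sketch that extracts the mode, active slave, and slave list; the SAMPLE text below is an abbreviated, illustrative dump, not real output from your system:

```python
# Abbreviated example of what /proc/net/bonding/bond0 looks like.
SAMPLE = """Ethernet Channel Bonding Driver: v5.x
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: enp0s3
MII Status: up
Slave Interface: enp0s3
MII Status: up
Slave Interface: enp0s8
MII Status: up
"""

def bond_summary(text):
    """Extract mode, active slave, and slave list from bond status text."""
    info = {"slaves": []}
    for line in text.splitlines():
        if line.startswith("Bonding Mode:"):
            info["mode"] = line.split(":", 1)[1].strip()
        elif line.startswith("Currently Active Slave:"):
            info["active"] = line.split(":", 1)[1].strip()
        elif line.startswith("Slave Interface:"):
            info["slaves"].append(line.split(":", 1)[1].strip())
    return info

print(bond_summary(SAMPLE))
# {'slaves': ['enp0s3', 'enp0s8'], 'mode': 'fault-tolerance (active-backup)', 'active': 'enp0s3'}
```

In a script on a real host you would pass `open("/proc/net/bonding/bond0").read()` instead of SAMPLE.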
Network Teaming (Modern Alternative)
# Install teamd
sudo dnf install -y teamd NetworkManager-team
# Check team daemon
rpm -qa | grep team
# Teaming advantages:
# - More features (modular runners, link-watchers)
# - NetworkManager integration
# - D-Bus control interface
# Caveat: teamd is deprecated in RHEL/AlmaLinux 9+,
# so prefer bonding for new deployments
Bonding Modes Explained
# Mode 0 (balance-rr) - Round Robin
# Packets sent in sequence across all slaves
# Provides: Load balancing + fault tolerance
# Requires: Switch configuration
# Mode 1 (active-backup) - Active/Standby
# Only one slave active at a time
# Provides: Fault tolerance only
# Requires: No special switch config
# Mode 2 (balance-xor) - XOR Hash
# Source/Dest MAC address determines slave
# Provides: Load balancing + fault tolerance
# Requires: Switch configuration
# Mode 3 (broadcast) - Broadcast
# Everything sent on all slaves
# Provides: Fault tolerance
# Requires: Very specific use cases
# Mode 4 (802.3ad) - Dynamic Link Aggregation (LACP)
# IEEE 802.3ad standard
# Provides: Best performance + fault tolerance
# Requires: LACP-capable switch
# Mode 5 (balance-tlb) - Adaptive Transmit Load Balancing
# Outgoing traffic distributed by load
# Provides: TX load balancing + fault tolerance
# Requires: No special switch config
# Mode 6 (balance-alb) - Adaptive Load Balancing
# Both TX and RX load balancing
# Provides: Full load balancing + fault tolerance
# Requires: No special switch config
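The mode sysfs file prints both the name and the number (e.g. `802.3ad 4`). A small Python sketch of the mapping, handy when a script needs to handle either form:

```python
# Numeric bonding modes and their names, matching the list above.
BOND_MODES = {
    0: "balance-rr",
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",
    5: "balance-tlb",
    6: "balance-alb",
}

def parse_mode(sysfs_output):
    """Split 'name number' (as read from .../bonding/mode) into a pair."""
    name, number = sysfs_output.split()
    return name, int(number)

print(parse_mode("802.3ad 4"))  # ('802.3ad', 4)
```

On a live system the input would come from `open("/sys/class/net/bond0/bonding/mode").read()`.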
🔧 Step 2: Configure Network Bonding
Let's create a bonded interface!
Method 1: Using nmcli (NetworkManager)
# List network interfaces
nmcli device status
# Create bond interface
sudo nmcli connection add type bond \
con-name bond0 \
ifname bond0 \
mode active-backup
# Or with LACP (mode 4)
sudo nmcli connection add type bond \
con-name bond0 \
ifname bond0 \
mode 802.3ad
# Configure bond options
sudo nmcli connection modify bond0 \
bond.options "mode=active-backup,miimon=100,fail_over_mac=1"
# Add slave interfaces
sudo nmcli connection add type ethernet \
slave-type bond \
con-name bond0-slave1 \
ifname enp0s3 \
master bond0
sudo nmcli connection add type ethernet \
slave-type bond \
con-name bond0-slave2 \
ifname enp0s8 \
master bond0
# Set IP address
sudo nmcli connection modify bond0 \
ipv4.addresses 192.168.1.100/24 \
ipv4.gateway 192.168.1.1 \
ipv4.dns 8.8.8.8 \
ipv4.method manual
# Activate bond
sudo nmcli connection up bond0
# Verify
nmcli connection show bond0
cat /proc/net/bonding/bond0
Method 2: Configuration Files (legacy ifcfg format)
# Create bond interface config
sudo nano /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.1.100
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
BONDING_OPTS="mode=1 miimon=100 fail_over_mac=1"
# Configure first slave
sudo nano /etc/sysconfig/network-scripts/ifcfg-enp0s3
DEVICE=enp0s3
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
# Configure second slave
sudo nano /etc/sysconfig/network-scripts/ifcfg-enp0s8
DEVICE=enp0s8
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
# Reload config files and restart networking
sudo nmcli connection reload
sudo systemctl restart NetworkManager
Advanced Bonding Options
# Configure bonding parameters
sudo nano /etc/sysconfig/network-scripts/ifcfg-bond0
# Add these options:
BONDING_OPTS="mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4"
# Options explained:
# miimon=100 - Link monitoring every 100ms
# lacp_rate=fast - Fast LACP negotiation (1s vs 30s)
# xmit_hash_policy - How to distribute packets:
# layer2 - MAC addresses only
# layer2+3 - MAC + IP addresses
# layer3+4 - IP + Port (best for servers)
# primary=enp0s3 - Preferred interface
# primary_reselect - When to switch back to primary
# fail_over_mac=1 - Use same MAC on failover
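To see why layer3+4 spreads traffic well, here's a hedged Python model of a transmit hash. It is not the kernel's exact algorithm (the real policy XORs ports and IP addresses), just an illustration of the key property: packets of one flow always map to the same slave (preserving per-flow ordering), while many flows spread across all slaves.

```python
import hashlib

def pick_slave(src_ip, dst_ip, src_port, dst_port, n_slaves):
    """Simplified layer3+4-style hash: same flow -> same slave."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return digest[0] % n_slaves

# The same flow always lands on the same slave...
a = pick_slave("10.0.0.1", "10.0.0.2", 40000, 443, 4)
b = pick_slave("10.0.0.1", "10.0.0.2", 40000, 443, 4)
assert a == b

# ...while many flows (different source ports) spread across the slaves.
slaves = {pick_slave("10.0.0.1", "10.0.0.2", p, 443, 4)
          for p in range(40000, 40100)}
print(sorted(slaves))
```

This also explains why a single TCP stream never exceeds one link's speed: it hashes to exactly one slave. That's why the iperf3 tests later use `-P 4` for parallel streams.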
🌟 Step 3: Configure Network Teaming
The modern approach with more features!
Create Team Interface
# Method 1: Using nmcli
sudo nmcli connection add type team \
con-name team0 \
ifname team0 \
config '{"runner": {"name": "activebackup"}}'
# With load balancing
sudo nmcli connection add type team \
con-name team0 \
ifname team0 \
config '{"runner": {"name": "loadbalance"}}'
# With LACP
sudo nmcli connection add type team \
con-name team0 \
ifname team0 \
config '{"runner": {"name": "lacp", "active": true, "fast_rate": true}}'
# Add team slaves
sudo nmcli connection add type ethernet \
slave-type team \
con-name team0-slave1 \
ifname enp0s3 \
master team0
sudo nmcli connection add type ethernet \
slave-type team \
con-name team0-slave2 \
ifname enp0s8 \
master team0
# Configure IP
sudo nmcli connection modify team0 \
ipv4.addresses 192.168.1.100/24 \
ipv4.gateway 192.168.1.1 \
ipv4.dns 8.8.8.8 \
ipv4.method manual
# Activate team
sudo nmcli connection up team0
Advanced Team Configuration
# Create detailed team config
cat > /tmp/team-config.json << 'EOF'
{
"device": "team0",
"runner": {
"name": "lacp",
"active": true,
"fast_rate": true,
"tx_hash": ["eth", "vlan", "ipv4", "ipv6", "tcp", "udp"]
},
"link_watch": {
"name": "ethtool"
},
"ports": {
"enp0s3": {
"prio": 100,
"sticky": true
},
"enp0s8": {
"prio": 50
}
}
}
EOF
# Apply config
sudo nmcli connection modify team0 \
team.config "$(cat /tmp/team-config.json)"
# Monitor team status
sudo teamdctl team0 state
sudo teamdctl team0 config dump
sudo teamnl team0 ports
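teamd configs are JSON, and a malformed config is an easy way to lose time. A small Python sketch that sanity-checks a config before you hand it to nmcli (the runner names are teamd's documented runners):

```python
import json

VALID_RUNNERS = {"broadcast", "roundrobin", "random",
                 "activebackup", "loadbalance", "lacp"}

def validate_team_config(text):
    """Parse a teamd JSON config and check the runner name."""
    cfg = json.loads(text)  # raises json.JSONDecodeError if malformed
    runner = cfg.get("runner", {}).get("name")
    if runner not in VALID_RUNNERS:
        raise ValueError(f"unknown runner: {runner!r}")
    return cfg

team_config = """
{
  "runner": {"name": "lacp", "active": true, "fast_rate": true},
  "link_watch": {"name": "ethtool"}
}
"""
cfg = validate_team_config(team_config)
print(cfg["runner"]["name"])  # lacp
```

Once it parses cleanly, apply it exactly as above: `nmcli connection modify team0 team.config "$(cat config.json)"`.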
✅ Step 4: Testing and Monitoring
Let's make sure it really works!
Test Failover
#!/bin/bash
# Test network failover
echo "🧪 Network Failover Test"
echo "========================"
# Continuous ping in background
ping -i 0.2 google.com > /tmp/ping.log &
PING_PID=$!
# Show current active interface
echo "Current active interface:"
if [ -f /proc/net/bonding/bond0 ]; then
grep "Currently Active Slave" /proc/net/bonding/bond0
else
sudo teamdctl team0 state | grep -A2 "active port"
fi
# Disconnect primary interface
echo ""
echo "⚠️ Disconnecting primary interface in 3 seconds..."
sleep 3
sudo ip link set enp0s3 down
# Check failover
sleep 2
echo ""
echo "After failover:"
if [ -f /proc/net/bonding/bond0 ]; then
grep "Currently Active Slave" /proc/net/bonding/bond0
else
sudo teamdctl team0 state | grep -A2 "active port"
fi
# Check packet loss
sleep 3
kill $PING_PID
echo ""
echo "Replies received during test:"
grep -c "time=" /tmp/ping.log
echo "Unreachable packets (lost during failover):"
grep -c "Destination Host Unreachable" /tmp/ping.log
# Restore interface
sudo ip link set enp0s3 up
echo "✅ Interface restored"
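The grep counts at the end of the script can be rolled into one summary. Here's a small Python sketch that parses a ping log the same way; the sample log lines are made up for illustration:

```python
def failover_summary(log_text):
    """Count ping replies and unreachable errors in a ping log,
    mirroring the grep -c calls in the shell test above."""
    replies = log_text.count("time=")
    unreachable = log_text.count("Destination Host Unreachable")
    return {"replies": replies, "unreachable": unreachable}

# Made-up sample log, for illustration only:
sample = (
    "64 bytes from 142.250.80.46: icmp_seq=1 ttl=115 time=12.1 ms\n"
    "64 bytes from 142.250.80.46: icmp_seq=2 ttl=115 time=12.3 ms\n"
    "From 192.168.1.100 icmp_seq=3 Destination Host Unreachable\n"
    "64 bytes from 142.250.80.46: icmp_seq=4 ttl=115 time=12.0 ms\n"
)
print(failover_summary(sample))  # {'replies': 3, 'unreachable': 1}
```

With `ping -i 0.2`, each unreachable line represents roughly 200ms of outage, so a handful of drops during failover is normal for miimon=100.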
Performance Testing
#!/bin/bash
# Test bonded network performance
test_bandwidth() {
echo "🚀 Bandwidth Test"
# Install iperf3 if needed
which iperf3 || sudo dnf install -y iperf3
# Single interface baseline
echo "Single interface test:"
sudo ip link set enp0s8 down
iperf3 -c 192.168.1.200 -t 10
# Both interfaces bonded
echo ""
echo "Bonded interfaces test:"
sudo ip link set enp0s8 up
sleep 2
iperf3 -c 192.168.1.200 -t 10 -P 4
# Show interface statistics
echo ""
echo "Interface statistics:"
grep -E "Slave Interface|MII Status|Speed" /proc/net/bonding/bond0
}
test_bandwidth
🎮 Quick Examples
Example 1: HA Web Server Setup 🌐
#!/bin/bash
# High availability network for web server
setup_ha_network() {
echo "🌐 Setting up HA Network for Web Server"
# Create active-backup bond
sudo nmcli connection add type bond \
con-name ha-bond \
ifname bond0 \
mode active-backup \
bond.options "miimon=100,fail_over_mac=active,primary=enp0s3"
# Add interfaces
for iface in enp0s3 enp0s8; do
sudo nmcli connection add type ethernet \
slave-type bond \
con-name ha-bond-$iface \
ifname $iface \
master bond0
done
# Configure static IP
sudo nmcli connection modify ha-bond \
ipv4.addresses 192.168.1.10/24 \
ipv4.gateway 192.168.1.1 \
ipv4.dns "8.8.8.8,8.8.4.4" \
ipv4.method manual \
connection.autoconnect yes
# Activate
sudo nmcli connection up ha-bond
# Configure monitoring
sudo tee /usr/local/bin/monitor-bond.sh > /dev/null << 'SCRIPT'
#!/bin/bash
while true; do
  if ! grep -q "up" /sys/class/net/bond0/operstate; then
    echo "ALERT: Bond interface down!" | \
      mail -s "Network Alert on $(hostname)" [email protected]
  fi
  sleep 10
done
SCRIPT
sudo chmod +x /usr/local/bin/monitor-bond.sh
# Create systemd service
sudo tee /etc/systemd/system/bond-monitor.service > /dev/null << 'SERVICE'
[Unit]
Description=Bond Network Monitor
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/monitor-bond.sh
Restart=always

[Install]
WantedBy=multi-user.target
SERVICE
sudo systemctl daemon-reload
sudo systemctl enable --now bond-monitor
echo "✅ HA Network configured!"
echo "📊 Status: $(grep 'Currently Active' /proc/net/bonding/bond0)"
}
setup_ha_network
Example 2: Load-Balanced Database Server 💾
#!/bin/bash
# LACP bond for database server
setup_db_network() {
echo "💾 Configuring LACP Bond for Database"
# Create LACP team (better than bonding for LACP)
cat > /tmp/lacp-team.json << 'EOF'
{
"runner": {
"name": "lacp",
"active": true,
"fast_rate": true,
"tx_hash": ["ipv4", "tcp", "udp"],
"agg_select_policy": "bandwidth"
},
"link_watch": {
"name": "ethtool",
"delay_up": 5,
"delay_down": 1
}
}
EOF
# Create team interface
sudo nmcli connection add type team \
con-name db-team \
ifname team0 \
team.config "$(cat /tmp/lacp-team.json)"
# Add 4 interfaces for 4Gb aggregate
for i in {3..6}; do
sudo nmcli connection add type ethernet \
slave-type team \
con-name db-team-enp0s$i \
ifname enp0s$i \
master team0
done
# Configure IP and Jumbo frames
sudo nmcli connection modify db-team \
ipv4.addresses 10.0.1.50/24 \
ipv4.method manual \
802-3-ethernet.mtu 9000
# Apply MTU to slaves
for i in {3..6}; do
sudo nmcli connection modify db-team-enp0s$i \
802-3-ethernet.mtu 9000
done
# Activate
sudo nmcli connection up db-team
# Performance tuning (root needed for /etc/sysctl.d)
sudo tee /etc/sysctl.d/99-db-network.conf > /dev/null << 'EOF'
# Database network optimization
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_congestion_control = bbr
net.ipv4.tcp_mtu_probing = 1
EOF
sudo sysctl -p /etc/sysctl.d/99-db-network.conf
echo "✅ Database network optimized!"
sudo teamdctl team0 state
}
setup_db_network
Example 3: Dynamic Failover Manager 🔄
#!/bin/bash
# Intelligent network failover manager (root needed to write to /usr/local/bin)
sudo tee /usr/local/bin/smart-failover.py > /dev/null << 'EOF'
#!/usr/bin/env python3
import subprocess
import time

class NetworkManager:
    def __init__(self, interface="bond0"):
        self.interface = interface
        self.history = []

    def get_bond_status(self):
        """Get current bond status"""
        try:
            with open(f"/proc/net/bonding/{self.interface}") as f:
                return f.read()
        except OSError:
            return None

    def get_active_slave(self):
        """Get currently active slave"""
        status = self.get_bond_status()
        if status:
            for line in status.split('\n'):
                if 'Currently Active Slave' in line:
                    return line.split(':')[1].strip()
        return None

    def get_slave_stats(self, slave):
        """Get statistics for a slave interface"""
        try:
            stats = {}
            with open(f"/sys/class/net/{slave}/statistics/rx_errors") as f:
                stats['rx_errors'] = int(f.read().strip())
            with open(f"/sys/class/net/{slave}/statistics/tx_errors") as f:
                stats['tx_errors'] = int(f.read().strip())
            # Get speed and duplex
            result = subprocess.run(['ethtool', slave],
                                    capture_output=True, text=True)
            for line in result.stdout.split('\n'):
                if 'Speed:' in line:
                    stats['speed'] = line.split(':')[1].strip()
                if 'Duplex:' in line:
                    stats['duplex'] = line.split(':')[1].strip()
            return stats
        except (OSError, ValueError):
            return None

    def check_health(self):
        """Check network health and make decisions"""
        slaves = self.get_all_slaves()
        active = self.get_active_slave()
        issues = []
        for slave in slaves:
            stats = self.get_slave_stats(slave)
            if stats:
                # Check for errors
                if stats.get('rx_errors', 0) > 100:
                    issues.append(f"{slave}: High RX errors ({stats['rx_errors']})")
                if stats.get('tx_errors', 0) > 100:
                    issues.append(f"{slave}: High TX errors ({stats['tx_errors']})")
                # Check link speed
                if 'speed' in stats and '1000Mb' not in stats['speed']:
                    issues.append(f"{slave}: Degraded speed ({stats['speed']})")
        if issues:
            self.alert("Network issues detected:\n" + "\n".join(issues))
            # Force failover if active slave has issues
            if active and any(active in issue for issue in issues):
                self.force_failover(active)

    def force_failover(self, current_slave):
        """Force failover to backup interface"""
        print(f"⚠️ Forcing failover from {current_slave}")
        subprocess.run(['sudo', 'ip', 'link', 'set', current_slave, 'down'])
        time.sleep(2)
        subprocess.run(['sudo', 'ip', 'link', 'set', current_slave, 'up'])
        new_active = self.get_active_slave()
        print(f"✅ Failover complete. New active: {new_active}")
        self.alert(f"Forced failover from {current_slave} to {new_active}")

    def get_all_slaves(self):
        """Get all slave interfaces"""
        slaves = []
        status = self.get_bond_status()
        if status:
            for line in status.split('\n'):
                if 'Slave Interface:' in line:
                    slaves.append(line.split(':')[1].strip())
        return slaves

    def alert(self, message):
        """Send alert notification"""
        print(f"🚨 ALERT: {message}")
        # Add email/Slack/monitoring integration here

    def monitor_loop(self):
        """Main monitoring loop"""
        print(f"🔍 Monitoring {self.interface}...")
        while True:
            try:
                self.check_health()
                # Log current status
                active = self.get_active_slave()
                if active:
                    stats = self.get_slave_stats(active) or {}
                    print(f"✅ Active: {active} | Speed: {stats.get('speed', 'Unknown')}")
                time.sleep(10)
            except KeyboardInterrupt:
                print("\n👋 Monitoring stopped")
                break
            except Exception as e:
                print(f"❌ Error: {e}")
                time.sleep(30)

if __name__ == "__main__":
    manager = NetworkManager("bond0")
    manager.monitor_loop()
EOF
sudo chmod +x /usr/local/bin/smart-failover.py
echo "✅ Smart failover manager installed"
🚨 Fix Common Problems
Problem 1: Bond Not Working ❌
Interfaces not bonding?
# Check module loaded
lsmod | grep bonding
# Load if missing
sudo modprobe bonding
# Check slave status
cat /proc/net/bonding/bond0
# Fix slave configuration
sudo ip link set enp0s3 down
sudo ip link set enp0s3 master bond0
sudo ip link set enp0s3 up
# Restart NetworkManager
sudo systemctl restart NetworkManager
Problem 2: No Increased Speed ❌
Not getting aggregate bandwidth?
# Check bonding mode
cat /sys/class/net/bond0/bonding/mode
# Mode must support aggregation (0,2,4,5,6)
# Active-backup (mode 1) does NOT increase bandwidth
# For LACP, check switch configuration
# Switch must have LACP/802.3ad enabled
# Test with multiple streams
iperf3 -c server -P 4 # 4 parallel streams
Problem 3: Failover Not Working ❌
Traffic stops on link failure?
# Check monitoring interval
cat /sys/class/net/bond0/bonding/miimon
# Should be 100 (100ms)
echo 100 | sudo tee /sys/class/net/bond0/bonding/miimon
# Check fail_over_mac setting
cat /sys/class/net/bond0/bonding/fail_over_mac
# Test failover
sudo ip link set enp0s3 down
ping -c 10 google.com # Should continue working
Problem 4: Switch Compatibility ❌
LACP not negotiating?
# Check LACP status
grep -A10 "802.3ad" /proc/net/bonding/bond0
# Switch-side config needed (Cisco example):
# interface Port-channel1
# switchport mode access
# switchport access vlan 100
# interface GigabitEthernet0/1
# channel-group 1 mode active
# interface GigabitEthernet0/2
# channel-group 1 mode active
# Try simpler mode
sudo nmcli connection modify bond0 \
bond.options "mode=balance-alb"
📋 Simple Commands Summary
Task | Command |
---|---|
📊 Show bond status | cat /proc/net/bonding/bond0 |
➕ Create bond | nmcli con add type bond ifname bond0 |
🔗 Add slave | nmcli con add type ethernet master bond0 |
📋 Show team status | teamdctl team0 state |
🚀 Test bandwidth | iperf3 -c server -P 4 |
🔍 Check active slave | grep "Active Slave" /proc/net/bonding/bond0 |
⚡ Force failover | ip link set slave down |
🔧 Change mode | nmcli con modify bond0 bond.options mode=4 |
💡 Tips for Success
- Test Failover 🧪 - Actually unplug cables
- Match Switch Config 🔌 - LACP needs both sides
- Use Right Mode 🎯 - active-backup for simple HA
- Monitor Continuously 📊 - Set up alerts
- Document Setup 📝 - For future troubleshooting
- Benchmark First 📈 - Know your baseline
Funny story: We once spent hours troubleshooting "broken" LACP… turns out the network team configured the wrong switch ports. Always verify both ends! 😅
🎓 What You Learned
You're now a network bonding expert! You can:
- ✅ Configure network bonding and teaming
- ✅ Set up different bonding modes
- ✅ Implement automatic failover
- ✅ Achieve load balancing
- ✅ Test and verify redundancy
- ✅ Troubleshoot bonding issues
- ✅ Monitor bond health
🎯 Why This Matters
Network bonding provides:
- 🛡️ Bulletproof connectivity
- ⚡ Increased bandwidth
- 💰 Cost-effective redundancy
- 🔄 Zero-downtime maintenance
- 🏢 Enterprise reliability
- 😌 Peace of mind
Last month, a client's primary network port died during Black Friday sales. The bonded backup took over instantly - zero downtime, zero lost sales. They made $50K that day and didn't even know there was a problem until Monday! 💪
Remember: In networking, redundancy isn't optional - it's essential. One cable failure shouldn't take down your service! 🚀
Happy bonding! May your networks be redundant and your failovers seamless! 🎉✨