AlmaLinux System Performance Monitoring Guide
Want to keep your AlmaLinux servers running like Formula 1 race cars? System performance monitoring is your secret weapon! In this comprehensive guide, we'll teach you how to watch, measure, and optimize every aspect of your server's performance. From CPU usage to network traffic, you'll become a monitoring master!
Why is Performance Monitoring Important?
System monitoring is like having a health checkup for your servers! Here's why smart administrators monitor everything:
- Prevent Crashes: Catch problems before they bring down your system
- Find Bottlenecks: Identify what's slowing down your applications
- Save Money: Optimize resources to reduce server costs
- Plan Growth: Know when to upgrade hardware or scale up
- Security Alerts: Detect unusual activity that might indicate attacks
- Happy Users: Keep applications fast and responsive
- Proactive Management: Fix issues before users complain
- Data-Driven Decisions: Make choices based on real metrics
Think of monitoring as your system's dashboard - you wouldn't drive without one!
What You Need
Let's make sure you're ready to become a monitoring expert!
- AlmaLinux 8 or 9 server (physical or virtual)
- Root or sudo access to install monitoring tools
- Basic understanding of the Linux command line
- At least 1GB free disk space for monitoring data
- Network access to install packages
- Text editor skills (we'll use nano)
- 20 minutes of focus time
- Coffee or tea to stay alert!
Don't worry if you're new to monitoring - we'll start from the basics!
Step 1: Install Essential Monitoring Tools
First, let's install the core monitoring tools that every administrator needs!
# Update your system first
sudo dnf update -y
# Enable EPEL, which packages several of the tools below (htop, nethogs, ncdu, glances, and others)
sudo dnf install -y epel-release
# Install essential monitoring and system tools
sudo dnf install -y htop iotop nethogs ncdu sysstat
# Install additional performance tools
sudo dnf install -y glances nmon atop dstat
# Install network monitoring tools
sudo dnf install -y iftop bmon tcpdump nmap
# Verify installations
htop --version && iotop --version
Excellent! You now have a complete monitoring toolkit!
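To double-check the toolkit, here is a minimal sketch that reports which commands are still missing from your PATH. The function name `missing_tools` is hypothetical, and the list assumes each command name matches its package name from Step 1 (sysstat is checked through its `sar` command).

```shell
#!/bin/bash
# Print the name of every tool from the argument list that is not on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools installed in Step 1; no output means everything was found
missing_tools htop iotop nethogs ncdu sar glances nmon atop iftop bmon tcpdump nmap
```

Any name it prints can be passed straight back to `sudo dnf install -y`.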
Step 2: Monitor CPU Performance
Let's start with CPU monitoring - the brain of your server!
Real-Time CPU Monitoring
# Monitor CPU usage with htop (most popular tool)
htop
# Use top for basic CPU monitoring
top
# Monitor CPU with glances (comprehensive overview)
glances
# Check average CPU usage since boot (this is not an instantaneous reading)
grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$3+$4+$5)} END {print usage "%"}'
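The /proc/stat one-liner above reports an average since boot, because the counters in that file are cumulative. For a current reading, one option is to take two samples and compare them. The sketch below assumes the standard field order of the aggregate cpu line (user, nice, system, idle, iowait, irq, softirq, steal); `cpu_busy_pct` is a hypothetical helper name.

```shell
#!/bin/bash
# Compute the busy CPU percentage between two aggregate "cpu" lines
# taken from /proc/stat at different times.
cpu_busy_pct() {
    # $1 = earlier sample, $2 = later sample
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9; i++) total += $i   # user through steal
            idle = $5 + $6                          # idle + iowait
            if (NR == 1) { t0 = total; i0 = idle }
            else         { t1 = total; i1 = idle }
        }
        END {
            dt = t1 - t0
            if (dt <= 0) { print 0; exit }          # no time elapsed
            printf "%d\n", (dt - (i1 - i0)) * 100 / dt
        }'
}

# Sample one second apart (only where /proc/stat is available)
if [ -r /proc/stat ]; then
    first=$(grep '^cpu ' /proc/stat)
    sleep 1
    second=$(grep '^cpu ' /proc/stat)
    echo "CPU busy over the last second: $(cpu_busy_pct "$first" "$second")%"
fi
```

A longer sleep smooths out short spikes, at the cost of a slower answer.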
CPU Performance Analysis
# Check CPU information and cores
lscpu
# Monitor CPU usage per process
ps aux --sort=-%cpu | head -10
# Check CPU load averages
uptime
# Advanced CPU monitoring with sar
sar -u 1 5 # CPU usage every second, 5 samples
Understanding CPU Metrics:
- Load Average: Should stay below the number of CPU cores
- CPU %: Sustained high values (>80%) indicate a bottleneck
- Wait Time: High I/O wait (%iowait) suggests disk issues
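The load average rule of thumb can be sketched as a small check. This is only an illustration: `load_status` is a hypothetical helper, and it treats a load equal to the core count as "high" because the guideline says the load should stay below it.

```shell
#!/bin/bash
# Classify a load average against the number of CPU cores.
load_status() {
    # $1 = load average, $2 = number of cores
    awk -v load="$1" -v cores="$2" 'BEGIN {
        # Adding 0 forces a numeric comparison
        if (load + 0 < cores + 0) print "ok"; else print "high"
    }'
}

# Compare the 1-minute load average with the core count on this machine
if [ -r /proc/loadavg ]; then
    read -r load1 rest < /proc/loadavg
    cores=$(nproc)
    echo "Load $load1 on $cores cores: $(load_status "$load1" "$cores")"
fi
```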
Step 3: Monitor Memory Usage
Memory monitoring prevents the dreaded "out of memory" crashes!
Real-Time Memory Monitoring
# Check memory usage with free command
free -h
# Monitor memory with htop (visual interface)
htop
# Detailed memory information
cat /proc/meminfo
# Check memory usage by process
ps aux --sort=-%mem | head -10
Advanced Memory Analysis
# Monitor memory continuously
watch -n 2 'free -h'
# Check swap usage
swapon --show
# Analyze memory usage with sar
sar -r 1 5 # Memory statistics every 1 second
# List the top memory consumers (a process that keeps growing between runs may be leaking)
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head -20
Memory Health Indicators:
- Available Memory: Should be >10% of total
- Swap Usage: High swap usage indicates memory pressure
- Buffer/Cache: High usage here is normal; the kernel releases it when applications need memory
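The 10% guideline for available memory can be checked directly from /proc/meminfo. Here is a minimal sketch, assuming the kernel exposes the MemTotal and MemAvailable fields (true on AlmaLinux 8 and 9); `mem_available_pct` is a hypothetical helper name.

```shell
#!/bin/bash
# Read /proc/meminfo-style text on stdin and print MemAvailable
# as a whole percentage of MemTotal.
mem_available_pct() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%d\n", avail * 100 / total
            else print "unknown"
        }'
}

# Report the figure for this machine
if [ -r /proc/meminfo ]; then
    echo "Available memory: $(mem_available_pct < /proc/meminfo)%"
fi
```

Reading MemAvailable rather than MemFree matters here, because MemFree ignores cache that the kernel can reclaim.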
Step 4: Monitor Disk Performance
Disk bottlenecks are performance killers! Let's monitor storage effectively.
Disk Usage Monitoring
# Check disk space usage
df -h
# Find large files and directories
du -sh /* | sort -hr
# Use ncdu for interactive disk usage
ncdu /
# Monitor disk I/O in real-time
iotop
# Check disk I/O statistics
iostat -x 1 5
Advanced Disk Monitoring
# Monitor specific directory sizes
watch -n 5 'du -sh /var/log /tmp /home'
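# Check disk read/write speeds (writes a 1 GB test file; direct I/O may fail if /tmp is a tmpfs mount)
dd if=/dev/zero of=/tmp/testfile bs=1M count=1024 oflag=direct
dd if=/tmp/testfile of=/dev/null bs=1M count=1024 iflag=direct
rm -f /tmp/testfile # Remove the test file afterwards
# Count open files that are deleted or under tmp paths (deleted-but-open files still hold disk space)
lsof | grep -E "(deleted|tmp)" | wc -l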
# Check disk health
sudo smartctl -a /dev/sda # Requires smartmontools
Disk Performance Tips:
- >90% Full: Critical - clean up immediately
- High I/O Wait: Consider SSD upgrade
- Many Small Files: Can slow down performance
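The 90% rule can be turned into a quick filter over df output. The sketch below assumes the POSIX column layout produced by `df -P` (capacity in column 5, mount point in column 6) and mount points without spaces; `disks_over` is a hypothetical helper name.

```shell
#!/bin/bash
# Read `df -P` output on stdin and print every mount point whose
# usage is above the given threshold.
disks_over() {
    # $1 = threshold percentage
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)                 # "91%" becomes "91"
        if (use + 0 > limit + 0) print $6 " " use "%"
    }'
}

# List filesystems that are more than 90% full; no output means none are
df -P | disks_over 90
```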
Step 5: Monitor Network Performance
Network monitoring keeps your connections fast and reliable!
Network Traffic Monitoring
# Monitor network traffic by process
nethogs
# Monitor network interfaces
iftop
# Beautiful network monitoring
bmon
# Check network connections
netstat -tuln
# Monitor network statistics
sar -n DEV 1 5
Network Troubleshooting
# Test download speed from a specific server (reports bytes per second; substitute a test file URL you trust)
curl -o /dev/null -s -w '%{speed_download}\n' http://speedtest.com/test.zip
# Check open ports and connections
ss -tuln
# Monitor DNS resolution times
dig google.com
# Check network latency
ping -c 5 8.8.8.8
# Advanced network analysis: capture 20 packets (replace eth0 with an interface name from `ip link show`)
sudo tcpdump -i eth0 -n -c 20
Network Health Metrics:
- Bandwidth Usage: Monitor against your limits
- Latency: <50ms is excellent, >200ms needs investigation
- Packet Loss: Should be 0% under normal conditions
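The latency thresholds can be applied to ping output with a couple of small helpers. This is a sketch under assumptions: it expects the summary line printed by the Linux iputils ping (`rtt min/avg/max/mdev = ...`), and the label "acceptable" for the 50-200ms band is my own choice, since the list only names the two extremes. Both function names are hypothetical.

```shell
#!/bin/bash
# Extract the average round-trip time (ms) from ping output on stdin.
ping_avg_ms() {
    # Splitting the summary line on "/" puts the average in field 5
    awk -F'/' '/^rtt|^round-trip/ { print $5 }'
}

# Classify a latency value in milliseconds.
latency_grade() {
    awk -v ms="$1" 'BEGIN {
        if (ms + 0 < 50) print "excellent"
        else if (ms + 0 > 200) print "investigate"
        else print "acceptable"
    }'
}

# Example with a fixed value
echo "120 ms is $(latency_grade 120)"
```

On a live system the two combine as `latency_grade "$(ping -c 5 8.8.8.8 | ping_avg_ms)"`.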
Step 6: Set Up System Monitoring with SAR
SAR (System Activity Reporter) provides historical performance data!
Configure SAR Data Collection
# Enable SAR data collection
sudo systemctl enable sysstat
sudo systemctl start sysstat
# Configure collection interval (every 2 minutes)
# AlmaLinux 8 and 9 schedule collection with a systemd timer (default: every 10 minutes), not /etc/cron.d/sysstat
sudo systemctl edit sysstat-collect.timer
# Add these lines to the override file:
[Timer]
OnCalendar=
OnCalendar=*:00/2
# Restart the timer to apply changes
sudo systemctl restart sysstat-collect.timer
Using SAR for Performance Analysis
# View today's CPU usage
sar -u
# Check memory usage for specific time
sar -r -s 14:00:00 -e 16:00:00
# Network statistics
sar -n DEV
# Disk I/O performance
sar -d
# Generate daily performance report
sar -A > /tmp/performance-report-$(date +%Y%m%d).txt
Step 7: Create Custom Monitoring Scripts
Let's create automated monitoring scripts!
System Health Check Script
# Create monitoring script
sudo nano /usr/local/bin/system-health.sh
# Add this content:
#!/bin/bash
echo "System Health Report - $(date)"
echo "=================================="
# CPU Load (1, 5, and 15 minute averages)
echo "CPU Load Average:"
cut -d' ' -f1-3 /proc/loadavg
# Memory Usage (free -m keeps both values in MiB so the percentage is correct)
echo "Memory Usage:"
free -m | awk '/^Mem:/ {printf "Used: %d MiB / %d MiB (%.1f%%)\n", $3, $2, $3/$2*100}'
# Disk Usage
echo "Disk Usage:"
df -h | grep -E "(/$|/home|/var)" | awk '{print $6 ": " $5 " used"}'
# Listening Sockets (-H drops the header line so it is not counted)
echo "Network Listeners:"
ss -Htuln | wc -l | awk '{print $1 " listening sockets"}'
echo "Health check completed!"
# Make script executable
sudo chmod +x /usr/local/bin/system-health.sh
# Run the health check
/usr/local/bin/system-health.sh
Performance Alert Script
# Create alert script
sudo nano /usr/local/bin/performance-alerts.sh
# Add this content:
#!/bin/bash
# Set thresholds
CPU_THRESHOLD=80
MEMORY_THRESHOLD=85
DISK_THRESHOLD=90
# Check CPU usage (average since boot, from the cumulative /proc/stat counters)
CPU_USAGE=$(grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$3+$4+$5)} END {print int(usage)}')
# Check memory usage
MEMORY_USAGE=$(free | grep Mem | awk '{print int($3/$2 * 100)}')
# Check disk usage (int() drops the trailing % sign)
DISK_USAGE=$(df -P / | tail -1 | awk '{print int($5)}')
# Send alerts
if [ "$CPU_USAGE" -gt "$CPU_THRESHOLD" ]; then
echo "ALERT: CPU usage is ${CPU_USAGE}% (threshold: ${CPU_THRESHOLD}%)"
fi
if [ "$MEMORY_USAGE" -gt "$MEMORY_THRESHOLD" ]; then
echo "ALERT: Memory usage is ${MEMORY_USAGE}% (threshold: ${MEMORY_THRESHOLD}%)"
fi
if [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ]; then
echo "ALERT: Disk usage is ${DISK_USAGE}% (threshold: ${DISK_THRESHOLD}%)"
fi
# Make executable and test
sudo chmod +x /usr/local/bin/performance-alerts.sh
/usr/local/bin/performance-alerts.sh
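The three if blocks in the alert script repeat the same comparison. One possible refactoring, shown here only as a sketch, moves the comparison into a helper that also returns a non-zero status so a wrapper (cron, a mail command, another script) can react. `check_threshold` is a hypothetical name, and the values in the example are made up for illustration.

```shell
#!/bin/bash
# Print an alert and return 1 when the value exceeds the threshold;
# return 0 otherwise.
check_threshold() {
    # $1 = metric name, $2 = current value (integer), $3 = threshold (integer)
    if [ "$2" -gt "$3" ]; then
        echo "ALERT: $1 usage is $2% (threshold: $3%)"
        return 1
    fi
    return 0
}

# Example with made-up readings: remember whether any check failed
status=0
check_threshold "CPU" 42 80 || status=1
check_threshold "Disk" 95 90 || status=1
echo "Overall status: $status"
```

In the real script you would pass `$CPU_USAGE`, `$MEMORY_USAGE`, and `$DISK_USAGE` and finish with `exit "$status"`.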
Quick Examples
Let's practice with real monitoring scenarios!
Example 1: Find Resource-Hungry Processes
# Find top CPU consumers
echo "Top CPU Users:"
ps aux --sort=-%cpu | head -5
# Find top memory consumers
echo "Top Memory Users:"
ps aux --sort=-%mem | head -5
# Find processes using most disk I/O (batch mode, 3 samples; iotop needs root)
echo "Top Disk I/O:"
sudo iotop -b -a -o -d 1 -n 3
Example 2: Network Troubleshooting
# Check which processes use network (nethogs needs root)
echo "Network Activity:"
sudo nethogs -d 5
# List listening TCP/UDP sockets
ss -tuln | head -10
# Test connection speeds
echo "Connection Test:"
curl -o /dev/null -s -w 'Download: %{speed_download} bytes/sec\n' http://mirror.centos.org/centos/8/BaseOS/x86_64/os/Packages/setup-2.12.2-6.el8.noarch.rpm
Example 3: Complete System Overview
# One-command system overview
echo "Complete System Status"
echo "========================"
echo "CPU: $(grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$3+$4+$5)} END {print int(usage)}')% | Memory: $(free | grep Mem | awk '{print int($3/$2 * 100)}')% | Disk: $(df / | tail -1 | awk '{print $5}') | Load: $(cut -d' ' -f1 /proc/loadavg)"
# Visual system monitoring
glances --time 2
Fix Common Problems
Monitoring issues? Let's troubleshoot together!
Problem 1: High CPU Usage
Symptoms: System feels slow, high load averages
Solution:
# Find CPU-hungry processes
top -c
# Kill problematic processes
sudo kill -15 PID_NUMBER # Replace with actual PID
# Check for runaway scripts
ps aux | grep -E "(bash|python|perl)" | head -10
# Restart services if needed
sudo systemctl restart httpd # Example service
Problem 2: Memory Exhaustion
Symptoms: "Out of memory" errors, system freezing
Solution:
# Find the largest memory consumers
ps aux --sort=-%mem | head -10
# Clear system caches (non-destructive, but I/O slows until the caches refill)
sudo sync && sudo sysctl vm.drop_caches=3
# Add swap space if needed (dd avoids the holes that fallocate can leave on XFS)
sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
Problem 3: Disk Space Issues
Symptoms: "No space left on device" errors
Solution:
# Find large files quickly
find / -type f -size +100M 2>/dev/null | head -10
# Clean up log files
sudo journalctl --vacuum-time=7d
sudo find /var/log -name "*.log" -type f -mtime +30 -delete
# Remove old package caches
sudo dnf clean all
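Because `find ... -delete` is irreversible, it can help to preview what would be removed first. Here is a minimal dry-run sketch using the same find filters as the cleanup command above; `old_logs` is a hypothetical helper name.

```shell
#!/bin/bash
# List (without deleting) the *.log files under a directory that were
# last modified more than N days ago.
old_logs() {
    # $1 = directory to search, $2 = age in days
    find "$1" -name '*.log' -type f -mtime +"$2" -print
}
```

Run it as `sudo bash -c "$(declare -f old_logs); old_logs /var/log 30"`, or paste the find command with `-print` in place of `-delete`, and review the list before deleting anything.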
Problem 4: Network Performance Issues
Symptoms: Slow connections, timeouts
Solution:
# Check network interface status
ip link show
# Test DNS resolution
nslookup google.com
# Check the routing table for missing or conflicting routes
ip route show
# Restart networking if needed
sudo systemctl restart NetworkManager
Simple Commands Summary
Your monitoring command cheat sheet!
| Task | Command | Purpose |
|---|---|---|
| CPU Monitoring | htop | Interactive CPU/memory monitor |
| Memory Check | free -h | Display memory usage |
| Disk Usage | df -h | Show disk space usage |
| Disk I/O | iotop | Monitor disk input/output |
| Network Traffic | nethogs | Monitor network by process |
| System Overview | glances | Comprehensive system monitor |
| Process List | ps aux | List all running processes |
| Load Average | uptime | Show system load |
| Listening Sockets | ss -tuln | Show listening TCP/UDP sockets |
| Find Large Files | ncdu / | Interactive disk usage analyzer |
Tips for Success
Master monitoring with these expert tips!
- Baseline First: Establish normal performance metrics
- Monitor Regularly: Check systems daily, not just when problems occur
- Set Thresholds: Define what "normal" vs "problematic" looks like
- Track Trends: Look for gradual changes over time
- Automate Alerts: Set up notifications for critical issues
- Document Patterns: Keep notes about recurring issues
- Practice Scenarios: Simulate problems to test your response
- Team Communication: Share monitoring responsibilities
- Regular Reviews: Hold weekly performance review meetings
- Proactive Approach: Fix small issues before they become big ones
What You Learned
Amazing progress! Look at your new monitoring superpowers!
- Installed monitoring tools for comprehensive system oversight
- Mastered CPU monitoring to prevent performance bottlenecks
- Learned memory management to avoid crashes and slowdowns
- Configured disk monitoring to prevent storage disasters
- Set up network monitoring for connectivity troubleshooting
- Created automated scripts for proactive system health checks
- Implemented alert systems to catch problems early
- Learned troubleshooting techniques for common issues
- Built monitoring workflows that save time and prevent disasters
- Gained DevOps skills that are highly valued in the industry
You're now a system monitoring expert!
Why This Matters
Your monitoring expertise creates real value!
For Your Career:
- Monitoring skills can strengthen your case for better-paid system administration roles
- DevOps roles require strong monitoring knowledge
- Monitoring skills lead to senior infrastructure positions
- You become the go-to person for performance issues
For Your Organization:
- Catch many system outages before they happen
- Save thousands in downtime costs
- Make data-driven infrastructure decisions
- Keep users happy with fast, reliable systems
For Your Projects:
- Build confidence in your infrastructure
- Quickly identify and resolve performance issues
- Scale systems based on real usage data
- Optimize costs by rightsizing resources
Real-World Impact:
- Healthcare systems stay online during critical moments
- E-commerce sites handle traffic spikes smoothly
- Gaming platforms provide lag-free experiences
- Business applications run reliably 24/7
You've just learned skills that keep the digital world running!
Remember, monitoring isn't just about watching numbers - it's about understanding your systems deeply and caring for them like a gardener tends their plants. With great monitoring comes great peace of mind! Keep practicing, keep learning, and soon you'll predict problems before they even think about happening!
Happy monitoring!