Monitoring with Zabbix on AlmaLinux: See Everything, Miss Nothing
Server down at 3 AM? Not anymore! I used to get angry calls about outages I didn't even know about. Then Zabbix changed everything - now I know about problems before users do! Last month, Zabbix predicted a disk failure 3 days early. Saved us from disaster! Today I'm showing you how to build an all-seeing monitoring system with Zabbix on AlmaLinux. Never be surprised again!
Why Zabbix is the Monitoring King
Zabbix isn't just monitoring - it's omniscience! Here's why it rules:
- Monitor everything - Servers, network, cloud, IoT, anything!
- Beautiful dashboards - Real-time visualizations
- Smart alerting - Problems, not noise
- Predictive analysis - See problems before they happen
- Massive scale - Monitor 100,000+ devices
- 100% free - No license limits ever
True story: We replaced $30,000/year monitoring tools with Zabbix. Better features, zero cost, happier team!
What You Need
Before we monitor everything, ensure you have:
- AlmaLinux server (4GB+ RAM for the Zabbix server)
- MySQL/PostgreSQL database
- Systems to monitor
- Root or sudo access
- 60 minutes to see everything
- Coffee (monitoring needs alertness!)
Step 1: Install Zabbix Server
Let's build your monitoring command center!
Install Database (MariaDB)
# Install MariaDB
sudo dnf install -y mariadb-server mariadb
# Start and enable MariaDB
sudo systemctl enable --now mariadb
# Secure MariaDB installation
sudo mysql_secure_installation
# Set root password: ZabbixDB123!
# Remove anonymous users: Y
# Disallow root login remotely: Y
# Remove test database: Y
# Reload privileges: Y
# Create Zabbix database
mysql -u root -p << EOF
CREATE DATABASE zabbix CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
CREATE USER 'zabbix'@'localhost' IDENTIFIED BY 'ZabbixPass123!';
GRANT ALL PRIVILEGES ON zabbix.* TO 'zabbix'@'localhost';
SET GLOBAL log_bin_trust_function_creators = 1;
FLUSH PRIVILEGES;
EOF
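Before moving on, it's worth confirming the database and user were actually created (a quick sanity check; adjust the password if you chose a different one than above):
# Verify the zabbix database is visible to the zabbix user
mysql -u zabbix -p'ZabbixPass123!' -e "SHOW DATABASES LIKE 'zabbix';"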
Install Zabbix Server
# Install Zabbix repository
sudo rpm -Uvh https://repo.zabbix.com/zabbix/6.4/rhel/8/x86_64/zabbix-release-6.4-1.el8.noarch.rpm
sudo dnf clean all
# Install Zabbix server, frontend, agent
sudo dnf install -y zabbix-server-mysql zabbix-web-mysql zabbix-nginx-conf zabbix-sql-scripts zabbix-selinux-policy zabbix-agent
# Import initial schema
sudo zcat /usr/share/zabbix-sql-scripts/mysql/server.sql.gz | mysql -u zabbix -p'ZabbixPass123!' zabbix
# Disable log_bin_trust_function_creators
mysql -u root -p -e "SET GLOBAL log_bin_trust_function_creators = 0;"
# Configure Zabbix server
sudo nano /etc/zabbix/zabbix_server.conf
# Essential settings:
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=ZabbixPass123!
StartPollers=10
StartPollersUnreachable=5
StartPingers=5
StartDiscoverers=5
StartHTTPPollers=5
StartTimers=5
StartEscalators=2
CacheSize=256M
HistoryCacheSize=128M
TrendCacheSize=128M
ValueCacheSize=128M
Timeout=30
LogSlowQueries=3000
Configure Nginx and PHP
# Configure Nginx for Zabbix
sudo nano /etc/nginx/conf.d/zabbix.conf
server {
    listen 80;
    server_name zabbix.example.com;
    root /usr/share/zabbix;
    index index.php;

    location = /favicon.ico {
        log_not_found off;
    }

    location / {
        try_files $uri $uri/ =404;
    }

    location /assets {
        access_log off;
        expires 10d;
    }

    location ~ /\.ht {
        deny all;
    }

    location ~ /(api\/|conf[^\.]|include|locale) {
        deny all;
        return 404;
    }

    location ~ [^/]\.php(/|$) {
        fastcgi_pass unix:/run/php-fpm/zabbix.sock;
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_index index.php;
        fastcgi_param DOCUMENT_ROOT /usr/share/zabbix;
        fastcgi_param SCRIPT_FILENAME /usr/share/zabbix$fastcgi_script_name;
        fastcgi_param PATH_TRANSLATED /usr/share/zabbix$fastcgi_script_name;
        include fastcgi_params;
        fastcgi_param QUERY_STRING $query_string;
        fastcgi_param REQUEST_METHOD $request_method;
        fastcgi_param CONTENT_TYPE $content_type;
        fastcgi_param CONTENT_LENGTH $content_length;
        fastcgi_intercept_errors on;
        fastcgi_ignore_client_abort off;
        fastcgi_connect_timeout 60;
        fastcgi_send_timeout 180;
        fastcgi_read_timeout 180;
        fastcgi_buffer_size 128k;
        fastcgi_buffers 4 256k;
        fastcgi_busy_buffers_size 256k;
        fastcgi_temp_file_write_size 256k;
    }
}
# Configure PHP
sudo nano /etc/php-fpm.d/zabbix.conf
user = apache
group = apache
listen = /run/php-fpm/zabbix.sock
listen.acl_users = apache,nginx
listen.allowed_clients = 127.0.0.1
pm = dynamic
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 35
pm.max_requests = 200
php_value[memory_limit] = 128M
php_value[post_max_size] = 16M
php_value[upload_max_filesize] = 2M
php_value[max_execution_time] = 300
php_value[max_input_time] = 300
php_value[max_input_vars] = 10000
php_value[date.timezone] = America/New_York
# Start services
sudo systemctl restart zabbix-server zabbix-agent nginx php-fpm
sudo systemctl enable zabbix-server zabbix-agent nginx php-fpm
# Configure firewall
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --permanent --add-port=10051/tcp
sudo firewall-cmd --permanent --add-port=10050/tcp
sudo firewall-cmd --reload
# Access web interface
# http://your-server-ip
# Default login: Admin / zabbix
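If the frontend doesn't come up, a quick health check usually narrows things down (this assumes the default package layout and log location):
# Confirm the services are running and the server is listening on its ports
sudo systemctl status zabbix-server nginx php-fpm --no-pager
sudo ss -tlnp | grep -E ':(80|10051)\b'
sudo tail -n 20 /var/log/zabbix/zabbix_server.log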
Step 2: Configure Zabbix Agents
Monitor all your systems!
Install Zabbix Agent on Linux
# On each Linux system to monitor
sudo rpm -Uvh https://repo.zabbix.com/zabbix/6.4/rhel/8/x86_64/zabbix-release-6.4-1.el8.noarch.rpm
sudo dnf install -y zabbix-agent
# Configure agent
sudo nano /etc/zabbix/zabbix_agentd.conf
# Essential settings:
Server=zabbix-server-ip
ServerActive=zabbix-server-ip
Hostname=client-hostname
EnableRemoteCommands=1
LogRemoteCommands=1
# For active checks:
RefreshActiveChecks=60
BufferSend=5
BufferSize=100
MaxLinesPerSecond=20
# Custom monitoring scripts directory
Include=/etc/zabbix/zabbix_agentd.d/*.conf
# Start agent
sudo systemctl enable --now zabbix-agent
# Open firewall
sudo firewall-cmd --permanent --add-port=10050/tcp
sudo firewall-cmd --reload
# Test connection
zabbix_agentd -t 'system.cpu.load[all,avg1]'
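You can also poll the agent from the Zabbix server itself, which exercises the network path and firewall in one go (assumes the zabbix-get package from the same repository is installed on the server):
# On the Zabbix server: query the agent remotely
sudo dnf install -y zabbix-get
zabbix_get -s agent-ip -k 'system.cpu.load[all,avg1]'
zabbix_get -s agent-ip -k agent.version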
Windows Agent Installation
# Download Zabbix agent for Windows
# https://www.zabbix.com/download_agents
# Install as Administrator
msiexec /i zabbix_agent-6.4.0-windows-amd64-openssl.msi ^
SERVER=zabbix-server-ip ^
SERVERACTIVE=zabbix-server-ip ^
HOSTNAME=windows-host
# Or configure manually
# Edit: C:\Program Files\Zabbix Agent\zabbix_agentd.conf
# Start service
net start "Zabbix Agent"
# Windows Firewall rule
netsh advfirewall firewall add rule name="Zabbix Agent" ^
dir=in action=allow protocol=TCP localport=10050
Docker Container Monitoring
# Deploy Zabbix agent as container
docker run --name zabbix-agent \
--network host \
--privileged \
-e ZBX_HOSTNAME="docker-host" \
-e ZBX_SERVER_HOST="zabbix-server-ip" \
-e ZBX_SERVER_PORT="10051" \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-d zabbix/zabbix-agent:alpine-6.4-latest
# For Docker monitoring
cat > /etc/zabbix/zabbix_agentd.d/docker.conf << 'EOF'
UserParameter=docker.discovery,/usr/local/bin/docker-discovery.sh
UserParameter=docker.stats[*],docker stats --no-stream --format "{{json .}}" $1
UserParameter=docker.inspect[*],docker inspect $1
EOF
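The docker.discovery key above points at a helper script that isn't shipped with Zabbix. Here's a minimal sketch of what /usr/local/bin/docker-discovery.sh could look like, emitting low-level discovery JSON with a {#CONTAINER} macro (hypothetical macro name - align it with whatever your template's item prototypes expect):
cat > /usr/local/bin/docker-discovery.sh << 'EOF'
#!/bin/bash
# Emit Zabbix low-level discovery JSON listing running containers
first=1
printf '{"data":['
for name in $(docker ps --format '{{.Names}}'); do
    [ $first -eq 0 ] && printf ','
    printf '{"{#CONTAINER}":"%s"}' "$name"
    first=0
done
printf ']}\n'
EOF
chmod +x /usr/local/bin/docker-discovery.sh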
Step 3: Create Monitoring Templates
Build comprehensive monitoring!
Custom Application Template
# Create custom monitoring items
cat > /etc/zabbix/zabbix_agentd.d/custom.conf << 'EOF'
# Web application monitoring
UserParameter=webapp.users,curl -s http://localhost/api/users/count
UserParameter=webapp.response_time,curl -o /dev/null -s -w '%{time_total}' http://localhost
UserParameter=webapp.status,curl -s -o /dev/null -w "%{http_code}" http://localhost
# Database monitoring
UserParameter=mysql.connections,mysql -u monitor -p'MonitorPass' -e "SHOW STATUS LIKE 'Threads_connected';" | tail -1 | awk '{print $2}'
UserParameter=mysql.queries,mysql -u monitor -p'MonitorPass' -e "SHOW STATUS LIKE 'Questions';" | tail -1 | awk '{print $2}'
UserParameter=mysql.slow_queries,mysql -u monitor -p'MonitorPass' -e "SHOW STATUS LIKE 'Slow_queries';" | tail -1 | awk '{print $2}'
# Service monitoring
UserParameter=service.status[*],systemctl is-active $1 | grep -c '^active$'
UserParameter=service.memory[*],systemctl show $1 --property=MemoryCurrent | cut -d= -f2
# Log monitoring
UserParameter=log.errors[*],grep -c ERROR /var/log/$1 2>/dev/null || echo 0
UserParameter=log.warnings[*],grep -c WARNING /var/log/$1 2>/dev/null || echo 0
# Security monitoring
UserParameter=security.failed_logins,grep "Failed password" /var/log/secure | wc -l
UserParameter=security.ssh_sessions,who | wc -l
EOF
# Restart agent
sudo systemctl restart zabbix-agent
Advanced Monitoring Scripts
# Advanced monitoring collector
cat > /usr/local/bin/zabbix-collector.py << 'EOF'
#!/usr/bin/env python3
import json
import psutil
import subprocess
import sys

def get_disk_io():
    """Get disk I/O statistics"""
    io = psutil.disk_io_counters()
    return {
        "read_bytes": io.read_bytes,
        "write_bytes": io.write_bytes,
        "read_time": io.read_time,
        "write_time": io.write_time
    }

def get_network_connections():
    """Get network connection stats"""
    connections = psutil.net_connections()
    stats = {
        "ESTABLISHED": 0,
        "TIME_WAIT": 0,
        "CLOSE_WAIT": 0,
        "LISTEN": 0
    }
    for conn in connections:
        if conn.status in stats:
            stats[conn.status] += 1
    return stats

def get_process_info():
    """Get top processes by CPU and memory"""
    processes = []
    for proc in psutil.process_iter(['pid', 'name', 'cpu_percent', 'memory_percent']):
        try:
            processes.append(proc.info)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    # Sort by CPU usage
    top_cpu = sorted(processes, key=lambda x: x['cpu_percent'] or 0, reverse=True)[:5]
    # Sort by memory usage
    top_mem = sorted(processes, key=lambda x: x['memory_percent'] or 0, reverse=True)[:5]
    return {
        "top_cpu": top_cpu,
        "top_memory": top_mem
    }

def discover_services():
    """Discover running systemd services for low-level discovery"""
    services = []
    result = subprocess.run(
        ['systemctl', 'list-units', '--type=service', '--state=running', '--no-pager', '--no-legend'],
        capture_output=True, text=True)
    for line in result.stdout.strip().split('\n'):
        if line:
            service = line.split()[0]
            services.append({"{#SERVICE}": service})
    return {"data": services}

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: zabbix-collector.py [disk_io|network|processes|discover_services]")
        sys.exit(1)
    command = sys.argv[1]
    if command == "disk_io":
        print(json.dumps(get_disk_io()))
    elif command == "network":
        print(json.dumps(get_network_connections()))
    elif command == "processes":
        print(json.dumps(get_process_info()))
    elif command == "discover_services":
        print(json.dumps(discover_services()))
    else:
        print(f"Unknown command: {command}")
        sys.exit(1)
EOF
chmod +x /usr/local/bin/zabbix-collector.py
# Add to Zabbix agent config
echo "UserParameter=custom.disk_io,/usr/local/bin/zabbix-collector.py disk_io" >> /etc/zabbix/zabbix_agentd.d/custom.conf
echo "UserParameter=custom.network_conn,/usr/local/bin/zabbix-collector.py network" >> /etc/zabbix/zabbix_agentd.d/custom.conf
echo "UserParameter=custom.top_processes,/usr/local/bin/zabbix-collector.py processes" >> /etc/zabbix/zabbix_agentd.d/custom.conf
echo "UserParameter=service.discovery,/usr/local/bin/zabbix-collector.py discover_services" >> /etc/zabbix/zabbix_agentd.d/custom.conf
Step 4: Alerting and Automation
Never miss critical issues!
Configure Email Alerts
# Install mail utilities
sudo dnf install -y mailx postfix
# Configure Postfix
sudo systemctl enable --now postfix
# In Zabbix Web UI:
# Administration -> Media types -> Email
# SMTP server: localhost
# SMTP port: 25
# From: [email protected]
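It helps to confirm Postfix can actually deliver mail before blaming Zabbix - a one-line test (swap in a mailbox you can actually check):
# Quick local mail delivery test
echo "Zabbix mail test $(date)" | mail -s "Zabbix mail test" [email protected]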
# Create alert script
cat > /usr/lib/zabbix/alertscripts/custom-alert.sh << 'EOF'
#!/bin/bash
TO=$1
SUBJECT=$2
MESSAGE=$3
# Send email
echo "$MESSAGE" | mail -s "$SUBJECT" "$TO"
# Send to Slack
curl -X POST -H 'Content-type: application/json' \
--data "{\"text\":\"$SUBJECT\n$MESSAGE\"}" \
YOUR_SLACK_WEBHOOK_URL
# Send to Telegram
curl -X POST "https://api.telegram.org/botYOUR_BOT_TOKEN/sendMessage" \
-d "chat_id=YOUR_CHAT_ID" \
-d "text=$SUBJECT%0A$MESSAGE"
# Log alert
echo "$(date): Alert sent to $TO - $SUBJECT" >> /var/log/zabbix/alerts.log
EOF
chmod +x /usr/lib/zabbix/alertscripts/custom-alert.sh
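Run the script by hand once before pointing Zabbix at it - the Slack/Telegram placeholders will fail until you fill them in, but the email and log paths should work. Running it as the zabbix user also catches permission problems early:
# Manual test of the alert script
sudo -u zabbix /usr/lib/zabbix/alertscripts/custom-alert.sh [email protected] "Test alert" "This is only a test"
tail -1 /var/log/zabbix/alerts.log
Then register it in the web UI under Administration -> Media types as a Script media type, passing {ALERT.SENDTO}, {ALERT.SUBJECT} and {ALERT.MESSAGE} as the three script parameters.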
Auto-remediation Scripts
# Automatic problem resolution
cat > /usr/lib/zabbix/alertscripts/auto-fix.sh << 'EOF'
#!/bin/bash
PROBLEM=$1
HOST=$2
ITEM=$3
VALUE=$4
case "$PROBLEM" in
"High disk usage")
echo "Cleaning up disk on $HOST..."
ssh $HOST "find /tmp -type f -mtime +7 -delete"
ssh $HOST "journalctl --vacuum-time=7d"
ssh $HOST "apt-get clean || yum clean all"
;;
"Service down")
SERVICE=$(echo $ITEM | cut -d'[' -f2 | cut -d']' -f1)
echo "Restarting $SERVICE on $HOST..."
ssh $HOST "systemctl restart $SERVICE"
sleep 10
ssh $HOST "systemctl status $SERVICE"
;;
"High memory usage")
echo "Clearing memory cache on $HOST..."
ssh $HOST "sync && echo 3 > /proc/sys/vm/drop_caches"
ssh $HOST "systemctl restart php-fpm nginx"
;;
"Too many connections")
echo "Optimizing connections on $HOST..."
ssh $HOST "netstat -ant | grep TIME_WAIT | wc -l"
ssh $HOST "sysctl -w net.ipv4.tcp_fin_timeout=30"
;;
"Backup failed")
echo "Retrying backup on $HOST..."
ssh $HOST "/usr/local/bin/backup-script.sh"
;;
*)
echo "No auto-fix available for: $PROBLEM"
exit 1
;;
esac
echo "Auto-fix completed for $PROBLEM on $HOST"
EOF
chmod +x /usr/lib/zabbix/alertscripts/auto-fix.sh
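One way to wire this in (a sketch - the exact problem names depend on how your triggers are titled) is an action operation that runs the script with Zabbix macros as its positional arguments:
# Action operation command (Configuration -> Actions -> Operations)
/usr/lib/zabbix/alertscripts/auto-fix.sh "{EVENT.NAME}" "{HOST.HOST}" "{ITEM.KEY}" "{ITEM.VALUE}"
Note the script ssh'es into the target host, so the zabbix user on the server needs key-based SSH access to the monitored machines for the fixes to actually run.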
Quick Examples
Example 1: Complete Infrastructure Dashboard
// Zabbix dashboard layout (illustrative; the API script below creates the dashboard)
const dashboardConfig = {
    name: "Infrastructure Overview",
    widgets: [
        {
            type: "graph",
            name: "CPU Usage",
            x: 0, y: 0,
            width: 6, height: 4,
            fields: [{
                type: "graph",
                value: "CPU utilization"
            }]
        },
        {
            type: "graph",
            name: "Memory Usage",
            x: 6, y: 0,
            width: 6, height: 4,
            fields: [{
                type: "graph",
                value: "Memory utilization"
            }]
        },
        {
            type: "problems",
            name: "Current Problems",
            x: 0, y: 4,
            width: 12, height: 4,
            fields: [{
                type: "severities",
                value: [3, 4, 5] // Average and above
            }]
        },
        {
            type: "map",
            name: "Network Map",
            x: 0, y: 8,
            width: 6, height: 6,
            fields: [{
                type: "sysmapid",
                value: 1
            }]
        },
        {
            type: "plain_text",
            name: "Top Processes",
            x: 6, y: 8,
            width: 6, height: 6,
            fields: [{
                type: "items",
                value: ["custom.top_processes"]
            }]
        }
    ]
};
# API script to create the dashboard
cat > /usr/local/bin/create-dashboard.py << 'EOF'
#!/usr/bin/env python3
import requests
import json

ZABBIX_URL = "http://localhost/api_jsonrpc.php"
USERNAME = "Admin"
PASSWORD = "zabbix"

# Authenticate (user.login takes "username" in current Zabbix API versions)
auth_data = {
    "jsonrpc": "2.0",
    "method": "user.login",
    "params": {
        "username": USERNAME,
        "password": PASSWORD
    },
    "id": 1
}
response = requests.post(ZABBIX_URL, json=auth_data)
auth_token = response.json()["result"]

# Create dashboard
dashboard_data = {
    "jsonrpc": "2.0",
    "method": "dashboard.create",
    "params": {
        "name": "Infrastructure Overview",
        "pages": [{
            "widgets": [
                {
                    "type": "systemstatus",
                    "x": 0,
                    "y": 0,
                    "width": 12,
                    "height": 5
                }
            ]
        }]
    },
    "auth": auth_token,
    "id": 2
}
response = requests.post(ZABBIX_URL, json=dashboard_data)
print("Dashboard created:", response.json())
EOF
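Run it once and check the response for an "error" key (assumes the default Admin/zabbix credentials above are still valid and python3-requests is installed):
sudo dnf install -y python3-requests
chmod +x /usr/local/bin/create-dashboard.py
python3 /usr/local/bin/create-dashboard.py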
Example 2: Application Performance Monitoring
# Application monitoring template
cat > /etc/zabbix/zabbix_agentd.d/app-monitor.conf << 'EOF'
# Response time monitoring
UserParameter=app.response_time[*],curl -o /dev/null -s -w '%{time_total}' http://$1$2
# API endpoint monitoring
UserParameter=app.api.status[*],curl -s http://$1/api/health | jq -r '.status'
UserParameter=app.api.response[*],curl -o /dev/null -s -w '%{http_code}' http://$1/api/$2
# Database query performance
UserParameter=app.db.slow_queries,mysql -u monitor -p'pass' -e "SELECT COUNT(*) FROM performance_schema.events_statements_summary_by_digest WHERE AVG_TIMER_WAIT > 1000000000" | tail -1
# Redis monitoring
UserParameter=redis.connected_clients,redis-cli info clients | grep connected_clients | cut -d: -f2
UserParameter=redis.used_memory,redis-cli info memory | grep used_memory_human | cut -d: -f2
UserParameter=redis.ops_per_sec,redis-cli info stats | grep instantaneous_ops_per_sec | cut -d: -f2
# Queue monitoring
UserParameter=queue.size[*],redis-cli llen $1
UserParameter=queue.processing_time[*],redis-cli get queue:$1:avg_time
# Error rate monitoring
UserParameter=app.error_rate,tail -1000 /var/log/app/error.log | grep -c ERROR
UserParameter=app.warning_rate,tail -1000 /var/log/app/app.log | grep -c WARNING
# User session monitoring
UserParameter=app.active_sessions,redis-cli keys "session:*" | wc -l
UserParameter=app.new_users_today,mysql -u monitor -p'pass' -e "SELECT COUNT(*) FROM users WHERE DATE(created_at) = CURDATE()" | tail -1
EOF
# Create performance test script
cat > /usr/local/bin/app-performance-test.sh << 'EOF'
#!/bin/bash
URL="http://localhost"
ITERATIONS=100
echo "Running performance test..."
total_time=0
min_time=999999
max_time=0
for i in $(seq 1 $ITERATIONS); do
    response_time=$(curl -o /dev/null -s -w '%{time_total}' $URL)
    response_ms=$(echo "$response_time * 1000" | bc)
    total_time=$(echo "$total_time + $response_ms" | bc)
    if (( $(echo "$response_ms < $min_time" | bc -l) )); then
        min_time=$response_ms
    fi
    if (( $(echo "$response_ms > $max_time" | bc -l) )); then
        max_time=$response_ms
    fi
done
avg_time=$(echo "scale=2; $total_time / $ITERATIONS" | bc)
echo "Average: ${avg_time}ms"
echo "Min: ${min_time}ms"
echo "Max: ${max_time}ms"
# Send to Zabbix
zabbix_sender -z localhost -s "$(hostname)" -k app.perf.avg -o $avg_time
zabbix_sender -z localhost -s "$(hostname)" -k app.perf.min -o $min_time
zabbix_sender -z localhost -s "$(hostname)" -k app.perf.max -o $max_time
EOF
chmod +x /usr/local/bin/app-performance-test.sh
# Schedule performance tests
echo "*/5 * * * * root /usr/local/bin/app-performance-test.sh" >> /etc/crontab
Example 3: Predictive Analytics
# Predictive monitoring with machine learning
cat > /usr/local/bin/zabbix-predict.py << 'EOF'
#!/usr/bin/env python3
import numpy as np
from sklearn.linear_model import LinearRegression
import pymysql
import json
from datetime import datetime, timedelta

class ZabbixPredictor:
    def __init__(self):
        self.conn = pymysql.connect(
            host='localhost',
            user='zabbix',
            password='ZabbixPass123!',
            database='zabbix'
        )

    def get_historical_data(self, itemid, hours=168):
        """Get historical data for analysis"""
        cursor = self.conn.cursor()
        timestamp = int((datetime.now() - timedelta(hours=hours)).timestamp())
        query = """
            SELECT clock, value
            FROM history
            WHERE itemid = %s AND clock > %s
            ORDER BY clock
        """
        cursor.execute(query, (itemid, timestamp))
        return cursor.fetchall()

    def predict_trend(self, data, future_hours=24):
        """Predict future values using linear regression"""
        if len(data) < 10:
            return None
        # Prepare data
        X = np.array([i for i in range(len(data))]).reshape(-1, 1)
        y = np.array([float(d[1]) for d in data])
        # Train model
        model = LinearRegression()
        model.fit(X, y)
        # Predict future
        future_X = np.array([len(data) + i for i in range(future_hours)]).reshape(-1, 1)
        predictions = model.predict(future_X)
        return {
            'current': float(y[-1]),
            'predicted': float(predictions[-1]),
            'trend': 'increasing' if model.coef_[0] > 0 else 'decreasing',
            'rate': float(model.coef_[0])
        }

    def check_disk_space(self):
        """Predict when disks will be full"""
        cursor = self.conn.cursor()
        # Get disk usage items
        query = """
            SELECT itemid, name, key_
            FROM items
            WHERE key_ LIKE 'vfs.fs.size[%%,pused]'
        """
        cursor.execute(query)
        items = cursor.fetchall()
        alerts = []
        for itemid, name, key in items:
            data = self.get_historical_data(itemid)
            prediction = self.predict_trend(data)
            if prediction and prediction['predicted'] > 90 and prediction['rate'] > 0:
                days_until_full = (100 - prediction['current']) / (prediction['rate'] * 24)
                if days_until_full < 7:
                    alerts.append({
                        'item': name,
                        'current_usage': prediction['current'],
                        'days_until_full': days_until_full,
                        'severity': 'critical' if days_until_full < 3 else 'warning'
                    })
        return alerts

    def detect_anomalies(self, itemid, threshold=3):
        """Detect anomalies using standard deviation"""
        data = self.get_historical_data(itemid, hours=24)
        if len(data) < 10:
            return []
        values = [float(d[1]) for d in data]
        mean = np.mean(values)
        std = np.std(values)
        if std == 0:
            return []
        anomalies = []
        for clock, value in data:
            z_score = abs((float(value) - mean) / std)
            if z_score > threshold:
                anomalies.append({
                    'timestamp': datetime.fromtimestamp(clock),
                    'value': value,
                    'z_score': z_score,
                    'severity': 'high' if z_score > 4 else 'medium'
                })
        return anomalies

    def capacity_planning(self):
        """Predict resource needs"""
        predictions = {}
        # CPU prediction
        cpu_items = self.conn.cursor()
        cpu_items.execute("SELECT itemid FROM items WHERE key_ = 'system.cpu.util'")
        for (itemid,) in cpu_items.fetchall():
            data = self.get_historical_data(itemid, hours=720)  # 30 days
            pred = self.predict_trend(data, future_hours=720)  # 30 days ahead
            if pred:
                predictions['cpu'] = {
                    'current': pred['current'],
                    '30_days': pred['predicted'],
                    'recommendation': 'Upgrade needed' if pred['predicted'] > 80 else 'Adequate'
                }
        return predictions

if __name__ == "__main__":
    predictor = ZabbixPredictor()
    # Check disk space predictions
    disk_alerts = predictor.check_disk_space()
    for alert in disk_alerts:
        print(f"WARNING: {alert['item']} will be full in {alert['days_until_full']:.1f} days!")
    # Capacity planning
    capacity = predictor.capacity_planning()
    print(f"Capacity Planning: {json.dumps(capacity, indent=2)}")
EOF
chmod +x /usr/local/bin/zabbix-predict.py
# Schedule predictions
echo "0 6 * * * root /usr/local/bin/zabbix-predict.py | mail -s 'Zabbix Predictions' [email protected]" >> /etc/crontab
Fix Common Problems
Problem 1: Agent Unreachable
Can't connect to agent?
# Check agent status
sudo systemctl status zabbix-agent
# Test connectivity
telnet agent-ip 10050
# Check firewall
sudo firewall-cmd --list-ports
# Verify configuration
grep ^Server /etc/zabbix/zabbix_agentd.conf
# Check logs
tail -f /var/log/zabbix/zabbix_agentd.log
Problem 2: Database Growing Too Fast
Running out of disk space?
# Check database size
mysql -u root -p -e "SELECT table_schema, SUM(data_length + index_length) / 1024 / 1024 AS 'Size (MB)' FROM information_schema.tables WHERE table_schema = 'zabbix' GROUP BY table_schema;"
# Housekeeping settings in Zabbix
# Administration -> Housekeeping
# Enable override for items
# History: 7 days
# Trends: 365 days
# Manual cleanup
mysql -u zabbix -p zabbix << EOF
DELETE FROM history WHERE clock < UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY);
DELETE FROM history_uint WHERE clock < UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY);
OPTIMIZE TABLE history;
OPTIMIZE TABLE history_uint;
EOF
Problem 3: False Alerts
Too many unnecessary alerts?
# Tune trigger thresholds
# In Zabbix Web UI:
# Configuration -> Hosts -> Triggers
# Add dependencies
# Trigger depends on: Host availability
# Use hysteresis
# Problem: CPU > 80%
# Recovery: CPU < 70%
# Maintenance windows
# Configuration -> Maintenance
# Add maintenance periods for planned work
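For the hysteresis example above, the pattern in current trigger expression syntax looks roughly like this (a sketch assuming a host named web01 and the standard system.cpu.util item):
# Problem expression (trigger fires)
avg(/web01/system.cpu.util,5m)>80
# Recovery expression (trigger closes only once load drops well below the threshold)
avg(/web01/system.cpu.util,5m)<70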
Problem 4: Slow Web Interface
Dashboard loading slowly?
# Increase PHP memory
sudo nano /etc/php-fpm.d/zabbix.conf
php_value[memory_limit] = 256M
# Optimize MySQL
sudo nano /etc/my.cnf
[mysqld]
innodb_buffer_pool_size = 2G
innodb_log_file_size = 256M
# Increase Zabbix cache
sudo nano /etc/zabbix/zabbix_server.conf
CacheSize=512M
HistoryCacheSize=256M
sudo systemctl restart zabbix-server php-fpm mariadb
Simple Commands Summary
Task | Command |
---|---|
Check server | systemctl status zabbix-server |
Check agent | systemctl status zabbix-agent |
Test item | zabbix_agentd -t item.key |
Send value | zabbix_sender -z server -s host -k key -o value |
Server log | tail -f /var/log/zabbix/zabbix_server.log |
Agent log | tail -f /var/log/zabbix/zabbix_agentd.log |
Restart all | systemctl restart zabbix-server zabbix-agent |
Web UI | http://server-ip |
Tips for Success
- Start Small - Monitor critical items first
- Use Templates - Don't reinvent the wheel
- Baseline First - Know what's normal
- Tune Alerts - Quality over quantity
- Document Triggers - Why each alert matters
- Test Recovery - Alerts must be actionable
Pro tip: Create a "noise" dashboard for alerts that fire too often. Review weekly and tune thresholds!
What You Learned
You're now a monitoring master! You can:
- Install and configure Zabbix
- Deploy agents everywhere
- Create custom monitoring
- Build dashboards
- Configure smart alerts
- Implement auto-remediation
- Predict problems
Why This Matters
Proper monitoring provides:
- Complete visibility
- Early warning system
- Performance insights
- Predictive capabilities
- Cost optimization
- Better sleep
Last Christmas, our e-commerce site stayed up during 10x normal traffic. Zabbix-triggered automation scaled our infrastructure before we even noticed the spike. Sales record broken, zero downtime!
Remember: If you're not monitoring it, it's already broken!
Happy monitoring! May your alerts be meaningful and your dashboards green!