
๐Ÿ“Š Monitoring with Zabbix on AlmaLinux: See Everything, Miss Nothing

Published Aug 22, 2025

Deploy Zabbix monitoring on AlmaLinux. Install server and agents, configure monitoring, create dashboards, set up alerts, and implement proactive monitoring with examples.


Server down at 3 AM? ๐Ÿ˜ฑ Not anymore! I used to get angry calls about outages I didnโ€™t even know about. Then Zabbix changed everything - now I know about problems before users do! Last month, Zabbix predicted a disk failure 3 days early. Saved us from disaster! Today Iโ€™m showing you how to build an all-seeing monitoring system with Zabbix on AlmaLinux. Never be surprised again! ๐Ÿ‘๏ธ

๐Ÿค” Why Zabbix is the Monitoring King

Zabbix isnโ€™t just monitoring - itโ€™s omniscience! Hereโ€™s why it rules:

  • ๐Ÿ‘๏ธ Monitor everything - Servers, network, cloud, IoT, anything!
  • ๐Ÿ“Š Beautiful dashboards - Real-time visualizations
  • ๐Ÿšจ Smart alerting - Problems, not noise
  • ๐Ÿ”ฎ Predictive analysis - See problems before they happen
  • ๐ŸŒ Massive scale - Monitor 100,000+ devices
  • ๐Ÿ’ฐ 100% free - No license limits ever

True story: We replaced $30,000/year monitoring tools with Zabbix. Better features, zero cost, happier team! ๐Ÿ’ช

๐ŸŽฏ What You Need

Before we monitor everything, ensure you have:

  • โœ… AlmaLinux server (4GB+ RAM for server)
  • โœ… MySQL/PostgreSQL database
  • โœ… Systems to monitor
  • โœ… Root or sudo access
  • โœ… 60 minutes to see everything
  • โœ… Coffee (monitoring needs alertness! โ˜•)

๐Ÿ“ Step 1: Install Zabbix Server

Letโ€™s build your monitoring command center!

Install Database (MariaDB)

# Install MariaDB
sudo dnf install -y mariadb-server mariadb

# Start and enable MariaDB
sudo systemctl enable --now mariadb

# Secure MariaDB installation
sudo mysql_secure_installation
# Set root password: ZabbixDB123!
# Remove anonymous users: Y
# Disallow root login remotely: Y
# Remove test database: Y
# Reload privileges: Y

# Create Zabbix database
mysql -u root -p << EOF
CREATE DATABASE zabbix CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
CREATE USER 'zabbix'@'localhost' IDENTIFIED BY 'ZabbixPass123!';
GRANT ALL PRIVILEGES ON zabbix.* TO 'zabbix'@'localhost';
SET GLOBAL log_bin_trust_function_creators = 1;
FLUSH PRIVILEGES;
EOF
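
Quick sanity check before moving on: make sure the zabbix user can actually reach its new database. The password is the example one set above - use your own.

# Verify the database and the zabbix user
mysql -u zabbix -p'ZabbixPass123!' -e "SHOW DATABASES LIKE 'zabbix';"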

Install Zabbix Server

# Install Zabbix repository
sudo rpm -Uvh https://repo.zabbix.com/zabbix/6.4/rhel/8/x86_64/zabbix-release-6.4-1.el8.noarch.rpm
sudo dnf clean all

# Install Zabbix server, frontend, agent
sudo dnf install -y zabbix-server-mysql zabbix-web-mysql zabbix-nginx-conf zabbix-sql-scripts zabbix-selinux-policy zabbix-agent

# Import initial schema
sudo zcat /usr/share/zabbix-sql-scripts/mysql/server.sql.gz | mysql -u zabbix -p'ZabbixPass123!' zabbix

# Disable log_bin_trust_function_creators
mysql -u root -p -e "SET GLOBAL log_bin_trust_function_creators = 0;"

# Configure Zabbix server
sudo nano /etc/zabbix/zabbix_server.conf

# Essential settings:
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=ZabbixPass123!
StartPollers=10
StartPollersUnreachable=5
StartPingers=5
StartDiscoverers=5
StartHTTPPollers=5
StartTimers=5
StartEscalators=2
CacheSize=256M
HistoryCacheSize=128M
TrendCacheSize=128M
ValueCacheSize=128M
Timeout=30
LogSlowQueries=3000

Configure Nginx and PHP

# Configure Nginx for Zabbix
sudo nano /etc/nginx/conf.d/zabbix.conf

server {
    listen 80;
    server_name zabbix.example.com;
    root /usr/share/zabbix;
    
    index index.php;
    
    location = /favicon.ico {
        log_not_found off;
    }
    
    location / {
        try_files $uri $uri/ =404;
    }
    
    location /assets {
        access_log off;
        expires 10d;
    }
    
    location ~ /\.ht {
        deny all;
    }
    
    location ~ /(api\/|conf[^\.]|include|locale) {
        deny all;
        return 404;
    }
    
    location ~ [^/]\.php(/|$) {
        fastcgi_pass unix:/run/php-fpm/zabbix.sock;
        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_index index.php;
        
        fastcgi_param DOCUMENT_ROOT /usr/share/zabbix;
        fastcgi_param SCRIPT_FILENAME /usr/share/zabbix$fastcgi_script_name;
        fastcgi_param PATH_TRANSLATED /usr/share/zabbix$fastcgi_script_name;
        
        include fastcgi_params;
        fastcgi_param QUERY_STRING $query_string;
        fastcgi_param REQUEST_METHOD $request_method;
        fastcgi_param CONTENT_TYPE $content_type;
        fastcgi_param CONTENT_LENGTH $content_length;
        
        fastcgi_intercept_errors on;
        fastcgi_ignore_client_abort off;
        fastcgi_connect_timeout 60;
        fastcgi_send_timeout 180;
        fastcgi_read_timeout 180;
        fastcgi_buffer_size 128k;
        fastcgi_buffers 4 256k;
        fastcgi_busy_buffers_size 256k;
        fastcgi_temp_file_write_size 256k;
    }
}

# Configure PHP
sudo nano /etc/php-fpm.d/zabbix.conf

user = apache
group = apache
listen = /run/php-fpm/zabbix.sock
listen.acl_users = apache,nginx
listen.allowed_clients = 127.0.0.1

pm = dynamic
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 35
pm.max_requests = 200

php_value[memory_limit] = 128M
php_value[post_max_size] = 16M
php_value[upload_max_filesize] = 2M
php_value[max_execution_time] = 300
php_value[max_input_time] = 300
php_value[max_input_vars] = 10000
php_value[date.timezone] = America/New_York

# Start services
sudo systemctl restart zabbix-server zabbix-agent nginx php-fpm
sudo systemctl enable zabbix-server zabbix-agent nginx php-fpm

# Configure firewall
sudo firewall-cmd --permanent --add-service=http
sudo firewall-cmd --permanent --add-port=10051/tcp
sudo firewall-cmd --permanent --add-port=10050/tcp
sudo firewall-cmd --reload

# Access web interface
# http://your-server-ip
# Default login: Admin / zabbix
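
Before touching agents, it's worth a quick check that the server really came up cleanly. These are standard commands; adjust paths if your layout differs.

# Confirm the server started and is listening
sudo systemctl status zabbix-server --no-pager
sudo ss -tlnp | grep -E '10050|10051'
sudo tail -n 20 /var/log/zabbix/zabbix_server.log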

๐Ÿ”ง Step 2: Configure Zabbix Agents

Monitor all your systems! ๐Ÿ–ฅ๏ธ

Install Zabbix Agent on Linux

# On each Linux system to monitor
sudo rpm -Uvh https://repo.zabbix.com/zabbix/6.4/rhel/8/x86_64/zabbix-release-6.4-1.el8.noarch.rpm
sudo dnf install -y zabbix-agent

# Configure agent
sudo nano /etc/zabbix/zabbix_agentd.conf

# Essential settings:
Server=zabbix-server-ip
ServerActive=zabbix-server-ip
Hostname=client-hostname
EnableRemoteCommands=1
LogRemoteCommands=1
# For active checks:
RefreshActiveChecks=60
BufferSend=5
BufferSize=100
MaxLinesPerSecond=20

# Custom monitoring scripts directory
Include=/etc/zabbix/zabbix_agentd.d/*.conf

# Start agent
sudo systemctl enable --now zabbix-agent

# Open firewall
sudo firewall-cmd --permanent --add-port=10050/tcp
sudo firewall-cmd --reload

# Test connection
zabbix_agentd -t system.cpu.load[all,avg1]
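
From the Zabbix server you can also poll the agent remotely with zabbix_get (from the zabbix-get package) - a quick way to prove the firewall and Server= setting are right. Replace agent-ip with your host.

# On the Zabbix server
sudo dnf install -y zabbix-get
zabbix_get -s agent-ip -p 10050 -k 'agent.version'
zabbix_get -s agent-ip -k 'system.cpu.load[all,avg1]'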

Windows Agent Installation

# Download Zabbix agent for Windows
# https://www.zabbix.com/download_agents

# Install as Administrator
msiexec /i zabbix_agent-6.4.0-windows-amd64-openssl.msi ^
  SERVER=zabbix-server-ip ^
  SERVERACTIVE=zabbix-server-ip ^
  HOSTNAME=windows-host

# Or configure manually
# Edit: C:\Program Files\Zabbix Agent\zabbix_agentd.conf

# Start service
net start "Zabbix Agent"

# Windows Firewall rule
netsh advfirewall firewall add rule name="Zabbix Agent" ^
  dir=in action=allow protocol=TCP localport=10050

Docker Container Monitoring

# Deploy Zabbix agent as container
docker run --name zabbix-agent \
  --network host \
  --privileged \
  -e ZBX_HOSTNAME="docker-host" \
  -e ZBX_SERVER_HOST="zabbix-server-ip" \
  -e ZBX_SERVER_PORT="10051" \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
  -d zabbix/zabbix-agent:alpine-6.4-latest

# For Docker monitoring
cat > /etc/zabbix/zabbix_agentd.d/docker.conf << 'EOF'
UserParameter=docker.discovery,/usr/local/bin/docker-discovery.sh
UserParameter=docker.stats[*],docker stats --no-stream --format "{{json .}}" $1
UserParameter=docker.inspect[*],docker inspect $1
EOF
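
The docker.discovery parameter above points at /usr/local/bin/docker-discovery.sh, which doesn't ship with anything - here is a minimal sketch of what such a script could look like, printing Zabbix low-level discovery JSON with one {#CONTAINER} macro per running container. Treat it as a starting point, not an official template.

#!/bin/bash
# Hypothetical docker-discovery.sh: running containers as LLD JSON
containers=$(docker ps --format '{{.Names}}')
first=1
printf '{"data":['
for name in $containers; do
    [ $first -eq 0 ] && printf ','
    printf '{"{#CONTAINER}":"%s"}' "$name"
    first=0
done
printf ']}\n'

Save it as /usr/local/bin/docker-discovery.sh and make it executable with chmod +x.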

๐ŸŒŸ Step 3: Create Monitoring Templates

Build comprehensive monitoring! ๐Ÿ“‹

Custom Application Template

# Create custom monitoring items
cat > /etc/zabbix/zabbix_agentd.d/custom.conf << 'EOF'
# Web application monitoring
UserParameter=webapp.users,curl -s http://localhost/api/users/count
UserParameter=webapp.response_time,curl -o /dev/null -s -w '%{time_total}' http://localhost
UserParameter=webapp.status,curl -s -o /dev/null -w "%{http_code}" http://localhost

# Database monitoring
UserParameter=mysql.connections,mysql -u monitor -p'MonitorPass' -e "SHOW STATUS LIKE 'Threads_connected';" | tail -1 | awk '{print $2}'
UserParameter=mysql.queries,mysql -u monitor -p'MonitorPass' -e "SHOW STATUS LIKE 'Questions';" | tail -1 | awk '{print $2}'
UserParameter=mysql.slow_queries,mysql -u monitor -p'MonitorPass' -e "SHOW STATUS LIKE 'Slow_queries';" | tail -1 | awk '{print $2}'

# Service monitoring
UserParameter=service.status[*],systemctl is-active $1 | grep -c '^active$'
UserParameter=service.memory[*],systemctl show $1 --property=MemoryCurrent | cut -d= -f2

# Log monitoring
UserParameter=log.errors[*],grep -c ERROR /var/log/$1 2>/dev/null || echo 0
UserParameter=log.warnings[*],grep -c WARNING /var/log/$1 2>/dev/null || echo 0

# Security monitoring
UserParameter=security.failed_logins,grep "Failed password" /var/log/secure | wc -l
UserParameter=security.ssh_sessions,who | wc -l
EOF

# Restart agent
sudo systemctl restart zabbix-agent
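
The MySQL items above assume a monitor account with password MonitorPass already exists. If yours doesn't, something like this creates one with read-only status privileges (the name and password are just the examples used above - pick your own):

mysql -u root -p << 'SQL'
CREATE USER 'monitor'@'localhost' IDENTIFIED BY 'MonitorPass';
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'monitor'@'localhost';
FLUSH PRIVILEGES;
SQL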

Advanced Monitoring Scripts

# Advanced monitoring collector (Python script installed via the heredoc below)

cat > /usr/local/bin/zabbix-collector.py << 'EOF'
#!/usr/bin/env python3

import json
import psutil
import subprocess
import sys

def get_disk_io():
    """Get disk I/O statistics"""
    io = psutil.disk_io_counters()
    return {
        "read_bytes": io.read_bytes,
        "write_bytes": io.write_bytes,
        "read_time": io.read_time,
        "write_time": io.write_time
    }

def get_network_connections():
    """Get network connection stats"""
    connections = psutil.net_connections()
    stats = {
        "ESTABLISHED": 0,
        "TIME_WAIT": 0,
        "CLOSE_WAIT": 0,
        "LISTEN": 0
    }
    
    for conn in connections:
        if conn.status in stats:
            stats[conn.status] += 1
    
    return stats

def get_process_info():
    """Get top processes by CPU and memory"""
    processes = []
    for proc in psutil.process_iter(['pid', 'name', 'cpu_percent', 'memory_percent']):
        try:
            processes.append(proc.info)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    
    # Sort by CPU usage
    top_cpu = sorted(processes, key=lambda x: x['cpu_percent'], reverse=True)[:5]
    
    # Sort by memory usage
    top_mem = sorted(processes, key=lambda x: x['memory_percent'], reverse=True)[:5]
    
    return {
        "top_cpu": top_cpu,
        "top_memory": top_mem
    }

def discover_services():
    """Discover systemd services for monitoring"""
    services = []
    result = subprocess.run(['systemctl', 'list-units', '--type=service', '--state=running', '--no-pager', '--no-legend'],
                          capture_output=True, text=True)
    
    for line in result.stdout.strip().split('\n'):
        if line:
            service = line.split()[0]
            services.append({"{#SERVICE}": service})
    
    return {"data": services}

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: zabbix-collector.py [disk_io|network|processes|discover_services]")
        sys.exit(1)
    
    command = sys.argv[1]
    
    if command == "disk_io":
        print(json.dumps(get_disk_io()))
    elif command == "network":
        print(json.dumps(get_network_connections()))
    elif command == "processes":
        print(json.dumps(get_process_info()))
    elif command == "discover_services":
        print(json.dumps(discover_services()))
    else:
        print(f"Unknown command: {command}")
        sys.exit(1)
EOF

chmod +x /usr/local/bin/zabbix-collector.py

# Add to Zabbix agent config
echo "UserParameter=custom.disk_io,/usr/local/bin/zabbix-collector.py disk_io" >> /etc/zabbix/zabbix_agentd.d/custom.conf
echo "UserParameter=custom.network_conn,/usr/local/bin/zabbix-collector.py network" >> /etc/zabbix/zabbix_agentd.d/custom.conf
echo "UserParameter=custom.top_processes,/usr/local/bin/zabbix-collector.py processes" >> /etc/zabbix/zabbix_agentd.d/custom.conf
echo "UserParameter=service.discovery,/usr/local/bin/zabbix-collector.py discover_services" >> /etc/zabbix/zabbix_agentd.d/custom.conf

โœ… Step 4: Alerting and Automation

Never miss critical issues! ๐Ÿšจ

Configure Email Alerts

# Install mail utilities
sudo dnf install -y mailx postfix

# Configure Postfix
sudo systemctl enable --now postfix

# In Zabbix Web UI:
# Administration -> Media types -> Email
# SMTP server: localhost
# SMTP port: 25
# From: [email protected]

# Create alert script
cat > /usr/lib/zabbix/alertscripts/custom-alert.sh << 'EOF'
#!/bin/bash

TO=$1
SUBJECT=$2
MESSAGE=$3

# Send email
echo "$MESSAGE" | mail -s "$SUBJECT" "$TO"

# Send to Slack
curl -X POST -H 'Content-type: application/json' \
  --data "{\"text\":\"$SUBJECT\n$MESSAGE\"}" \
  YOUR_SLACK_WEBHOOK_URL

# Send to Telegram
curl -X POST "https://api.telegram.org/botYOUR_BOT_TOKEN/sendMessage" \
  -d "chat_id=YOUR_CHAT_ID" \
  -d "text=$SUBJECT%0A$MESSAGE"

# Log alert
echo "$(date): Alert sent to $TO - $SUBJECT" >> /var/log/zabbix/alerts.log
EOF

chmod +x /usr/lib/zabbix/alertscripts/custom-alert.sh
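
Zabbix only uses this once it's registered as a script media type (Administration -> Media types, with {ALERT.SENDTO}, {ALERT.SUBJECT} and {ALERT.MESSAGE} as script parameters), but you can dry-run it by hand first - the webhook and token placeholders above are obviously meant to be replaced.

# Manual dry run as the zabbix user
sudo -u zabbix /usr/lib/zabbix/alertscripts/custom-alert.sh \
  "[email protected]" "Test alert" "This is only a test"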

Auto-remediation Scripts

#!/bin/bash
# Automatic problem resolution

cat > /usr/lib/zabbix/alertscripts/auto-fix.sh << 'EOF'
#!/bin/bash

PROBLEM=$1
HOST=$2
ITEM=$3
VALUE=$4

case "$PROBLEM" in
    "High disk usage")
        echo "Cleaning up disk on $HOST..."
        ssh $HOST "find /tmp -type f -mtime +7 -delete"
        ssh $HOST "journalctl --vacuum-time=7d"
        ssh $HOST "apt-get clean || yum clean all"
        ;;
        
    "Service down")
        SERVICE=$(echo $ITEM | cut -d'[' -f2 | cut -d']' -f1)
        echo "Restarting $SERVICE on $HOST..."
        ssh $HOST "systemctl restart $SERVICE"
        sleep 10
        ssh $HOST "systemctl status $SERVICE"
        ;;
        
    "High memory usage")
        echo "Clearing memory cache on $HOST..."
        ssh $HOST "sync && echo 3 > /proc/sys/vm/drop_caches"
        ssh $HOST "systemctl restart php-fpm nginx"
        ;;
        
    "Too many connections")
        echo "Optimizing connections on $HOST..."
        ssh $HOST "netstat -ant | grep TIME_WAIT | wc -l"
        ssh $HOST "sysctl -w net.ipv4.tcp_fin_timeout=30"
        ;;
        
    "Backup failed")
        echo "Retrying backup on $HOST..."
        ssh $HOST "/usr/local/bin/backup-script.sh"
        ;;
        
    *)
        echo "No auto-fix available for: $PROBLEM"
        exit 1
        ;;
esac

echo "Auto-fix completed for $PROBLEM on $HOST"
EOF

chmod +x /usr/lib/zabbix/alertscripts/auto-fix.sh
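
These fixes run over SSH as the zabbix service account, so that account needs passwordless key access (and sufficient privileges) on the target hosts. A rough example, assuming the packaged zabbix user with its usual /var/lib/zabbix home - adapt it to your security policy:

sudo -u zabbix mkdir -p /var/lib/zabbix/.ssh
sudo chmod 700 /var/lib/zabbix/.ssh
sudo -u zabbix ssh-keygen -t ed25519 -N '' -f /var/lib/zabbix/.ssh/id_ed25519
sudo -u zabbix ssh-copy-id -i /var/lib/zabbix/.ssh/id_ed25519.pub root@target-host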

๐ŸŽฎ Quick Examples

Example 1: Complete Infrastructure Dashboard ๐Ÿข

// Zabbix dashboard configuration
// Save as dashboard-config.js and import via API

const dashboardConfig = {
    name: "Infrastructure Overview",
    widgets: [
        {
            type: "graph",
            name: "CPU Usage",
            x: 0, y: 0,
            width: 6, height: 4,
            fields: [{
                type: "graph",
                value: "CPU utilization"
            }]
        },
        {
            type: "graph",
            name: "Memory Usage",
            x: 6, y: 0,
            width: 6, height: 4,
            fields: [{
                type: "graph",
                value: "Memory utilization"
            }]
        },
        {
            type: "problems",
            name: "Current Problems",
            x: 0, y: 4,
            width: 12, height: 4,
            fields: [{
                type: "severities",
                value: [3, 4, 5] // Average and above
            }]
        },
        {
            type: "map",
            name: "Network Map",
            x: 0, y: 8,
            width: 6, height: 6,
            fields: [{
                type: "sysmapid",
                value: 1
            }]
        },
        {
            type: "plain_text",
            name: "Top Processes",
            x: 6, y: 8,
            width: 6, height: 6,
            fields: [{
                type: "items",
                value: ["custom.top_processes"]
            }]
        }
    ]
};

// API script to create dashboard
cat > /usr/local/bin/create-dashboard.py << 'EOF'
#!/usr/bin/env python3

import requests
import json

ZABBIX_URL = "http://localhost/api_jsonrpc.php"
USERNAME = "Admin"
PASSWORD = "zabbix"

# Authenticate
auth_data = {
    "jsonrpc": "2.0",
    "method": "user.login",
    "params": {
        "user": USERNAME,
        "password": PASSWORD
    },
    "id": 1
}

response = requests.post(ZABBIX_URL, json=auth_data)
auth_token = response.json()["result"]

# Create dashboard
dashboard_data = {
    "jsonrpc": "2.0",
    "method": "dashboard.create",
    "params": {
        "name": "Infrastructure Overview",
        "pages": [{
            "widgets": [
                {
                    "type": "systemstatus",
                    "x": 0,
                    "y": 0,
                    "width": 12,
                    "height": 5
                }
            ]
        }]
    },
    "auth": auth_token,
    "id": 2
}

response = requests.post(ZABBIX_URL, json=dashboard_data)
print("Dashboard created:", response.json())
EOF
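
The script needs the requests library and your real Admin password instead of the default. A minimal run looks like this:

# requests is assumed; install it if the import fails
pip3 install requests
python3 /usr/local/bin/create-dashboard.py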

Example 2: Application Performance Monitoring ๐Ÿš€

#!/bin/bash
# Application monitoring template

cat > /etc/zabbix/zabbix_agentd.d/app-monitor.conf << 'EOF'
# Response time monitoring
UserParameter=app.response_time[*],curl -o /dev/null -s -w '%{time_total}' http://$1$2

# API endpoint monitoring
UserParameter=app.api.status[*],curl -s http://$1/api/health | jq -r '.status'
UserParameter=app.api.response[*],curl -o /dev/null -s -w '%{http_code}' http://$1/api/$2

# Database query performance
UserParameter=app.db.slow_queries,mysql -u monitor -p'pass' -e "SELECT COUNT(*) FROM performance_schema.events_statements_summary_by_digest WHERE AVG_TIMER_WAIT > 1000000000" | tail -1

# Redis monitoring
UserParameter=redis.connected_clients,redis-cli info clients | grep connected_clients | cut -d: -f2
UserParameter=redis.used_memory,redis-cli info memory | grep used_memory_human | cut -d: -f2
UserParameter=redis.ops_per_sec,redis-cli info stats | grep instantaneous_ops_per_sec | cut -d: -f2

# Queue monitoring
UserParameter=queue.size[*],redis-cli llen $1
UserParameter=queue.processing_time[*],redis-cli get queue:$1:avg_time

# Error rate monitoring
UserParameter=app.error_rate,tail -1000 /var/log/app/error.log | grep -c ERROR
UserParameter=app.warning_rate,tail -1000 /var/log/app/app.log | grep -c WARNING

# User session monitoring
UserParameter=app.active_sessions,redis-cli keys "session:*" | wc -l
UserParameter=app.new_users_today,mysql -u monitor -p'pass' -e "SELECT COUNT(*) FROM users WHERE DATE(created_at) = CURDATE()" | tail -1
EOF
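
# Note: the items above assume curl, jq, redis-cli and the 'monitor' MySQL
# account exist on this host (package names may differ on your repos).
sudo dnf install -y jq redis
sudo systemctl restart zabbix-agent
zabbix_agentd -t app.error_rate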

# Create performance test script
cat > /usr/local/bin/app-performance-test.sh << 'EOF'
#!/bin/bash

URL="http://localhost"
ITERATIONS=100

echo "Running performance test..."

total_time=0
min_time=999999
max_time=0

for i in $(seq 1 $ITERATIONS); do
    response_time=$(curl -o /dev/null -s -w '%{time_total}' $URL)
    response_ms=$(echo "$response_time * 1000" | bc)
    
    total_time=$(echo "$total_time + $response_ms" | bc)
    
    if (( $(echo "$response_ms < $min_time" | bc -l) )); then
        min_time=$response_ms
    fi
    
    if (( $(echo "$response_ms > $max_time" | bc -l) )); then
        max_time=$response_ms
    fi
done

avg_time=$(echo "scale=2; $total_time / $ITERATIONS" | bc)

echo "Average: ${avg_time}ms"
echo "Min: ${min_time}ms"
echo "Max: ${max_time}ms"

# Send to Zabbix
zabbix_sender -z localhost -s "$(hostname)" -k app.perf.avg -o $avg_time
zabbix_sender -z localhost -s "$(hostname)" -k app.perf.min -o $min_time
zabbix_sender -z localhost -s "$(hostname)" -k app.perf.max -o $max_time
EOF

chmod +x /usr/local/bin/app-performance-test.sh

# Schedule performance tests
echo "*/5 * * * * root /usr/local/bin/app-performance-test.sh" >> /etc/crontab

Example 3: Predictive Analytics ๐Ÿ“ˆ

# Predictive monitoring with machine learning (Python script installed via the heredoc below)

cat > /usr/local/bin/zabbix-predict.py << 'EOF'
#!/usr/bin/env python3

import numpy as np
from sklearn.linear_model import LinearRegression
import pymysql
import json
from datetime import datetime, timedelta

class ZabbixPredictor:
    def __init__(self):
        self.conn = pymysql.connect(
            host='localhost',
            user='zabbix',
            password='ZabbixPass123!',
            database='zabbix'
        )
        
    def get_historical_data(self, itemid, hours=168):
        """Get historical data for analysis"""
        cursor = self.conn.cursor()
        
        timestamp = int((datetime.now() - timedelta(hours=hours)).timestamp())
        
        query = """
            SELECT clock, value 
            FROM history 
            WHERE itemid = %s AND clock > %s
            ORDER BY clock
        """
        
        cursor.execute(query, (itemid, timestamp))
        return cursor.fetchall()
    
    def predict_trend(self, data, future_hours=24):
        """Predict future values using linear regression"""
        if len(data) < 10:
            return None
        
        # Prepare data
        X = np.array([i for i in range(len(data))]).reshape(-1, 1)
        y = np.array([float(d[1]) for d in data])
        
        # Train model
        model = LinearRegression()
        model.fit(X, y)
        
        # Predict future
        future_X = np.array([len(data) + i for i in range(future_hours)]).reshape(-1, 1)
        predictions = model.predict(future_X)
        
        return {
            'current': y[-1],
            'predicted': predictions[-1],
            'trend': 'increasing' if model.coef_[0] > 0 else 'decreasing',
            'rate': model.coef_[0]
        }
    
    def check_disk_space(self):
        """Predict when disk will be full"""
        cursor = self.conn.cursor()
        
        # Get disk usage items
        query = """
            SELECT itemid, name, key_ 
            FROM items 
            WHERE key_ LIKE 'vfs.fs.size[%%,pused]'
        """
        
        cursor.execute(query)
        items = cursor.fetchall()
        
        alerts = []
        
        for itemid, name, key in items:
            data = self.get_historical_data(itemid)
            prediction = self.predict_trend(data)
            
            if prediction and prediction['predicted'] > 90 and prediction['rate'] > 0:
                # rate is per sample; history samples are assumed to be roughly hourly
                days_until_full = (100 - prediction['current']) / (prediction['rate'] * 24)
                
                if days_until_full < 7:
                    alerts.append({
                        'item': name,
                        'current_usage': prediction['current'],
                        'days_until_full': days_until_full,
                        'severity': 'critical' if days_until_full < 3 else 'warning'
                    })
        
        return alerts
    
    def detect_anomalies(self, itemid, threshold=3):
        """Detect anomalies using standard deviation"""
        data = self.get_historical_data(itemid, hours=24)
        
        if len(data) < 10:
            return []
        
        values = [float(d[1]) for d in data]
        mean = np.mean(values)
        std = np.std(values)
        
        # A flat series has no deviation to measure against
        if std == 0:
            return []
        
        anomalies = []
        
        for clock, value in data:
            z_score = abs((float(value) - mean) / std)
            
            if z_score > threshold:
                anomalies.append({
                    'timestamp': datetime.fromtimestamp(clock),
                    'value': value,
                    'z_score': z_score,
                    'severity': 'high' if z_score > 4 else 'medium'
                })
        
        return anomalies
    
    def capacity_planning(self):
        """Predict resource needs"""
        predictions = {}
        
        # CPU prediction
        cpu_items = self.conn.cursor()
        cpu_items.execute("SELECT itemid FROM items WHERE key_ = 'system.cpu.util'")
        
        for (itemid,) in cpu_items.fetchall():
            data = self.get_historical_data(itemid, hours=720)  # 30 days
            pred = self.predict_trend(data, future_hours=720)  # 30 days ahead
            
            if pred:
                predictions['cpu'] = {
                    'current': pred['current'],
                    '30_days': pred['predicted'],
                    'recommendation': 'Upgrade needed' if pred['predicted'] > 80 else 'Adequate'
                }
        
        return predictions

if __name__ == "__main__":
    predictor = ZabbixPredictor()
    
    # Check disk space predictions
    disk_alerts = predictor.check_disk_space()
    for alert in disk_alerts:
        print(f"โš ๏ธ {alert['item']} will be full in {alert['days_until_full']:.1f} days!")
    
    # Capacity planning
    capacity = predictor.capacity_planning()
    print(f"๐Ÿ“Š Capacity Planning: {json.dumps(capacity, indent=2)}")
EOF

chmod +x /usr/local/bin/zabbix-predict.py

# Schedule predictions
echo "0 6 * * * root /usr/local/bin/zabbix-predict.py | mail -s 'Zabbix Predictions' [email protected]" >> /etc/crontab

๐Ÿšจ Fix Common Problems

Problem 1: Agent Unreachable โŒ

Canโ€™t connect to agent?

# Check agent status
sudo systemctl status zabbix-agent

# Test connectivity
telnet agent-ip 10050

# Check firewall
sudo firewall-cmd --list-ports

# Verify configuration
grep ^Server /etc/zabbix/zabbix_agentd.conf

# Check logs
tail -f /var/log/zabbix/zabbix_agentd.log

Problem 2: Database Growing Too Fast โŒ

Running out of disk space?

# Check database size
mysql -u root -p -e "SELECT table_schema, SUM(data_length + index_length) / 1024 / 1024 AS 'Size (MB)' FROM information_schema.tables WHERE table_schema = 'zabbix' GROUP BY table_schema;"

# Housekeeping settings in Zabbix
# Administration -> Housekeeping
# Enable override for items
# History: 7 days
# Trends: 365 days

# Manual cleanup
mysql -u zabbix -p zabbix << EOF
DELETE FROM history WHERE clock < UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY);
DELETE FROM history_uint WHERE clock < UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY);
OPTIMIZE TABLE history;
OPTIMIZE TABLE history_uint;
EOF
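
You can also kick the built-in housekeeper on demand instead of waiting for its next cycle - handy right after tightening the retention settings:

# Trigger housekeeping immediately and check the result in the log
sudo zabbix_server -R housekeeper_execute
grep -i housekeeper /var/log/zabbix/zabbix_server.log | tail -5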

Problem 3: False Alerts โŒ

Too many unnecessary alerts?

# Tune trigger thresholds
# In Zabbix Web UI:
# Configuration -> Hosts -> Triggers

# Add dependencies
# Trigger depends on: Host availability

# Use hysteresis
# Problem: CPU > 80%
# Recovery: CPU < 70%

# Maintenance windows
# Configuration -> Maintenance
# Add maintenance periods for planned work

Problem 4: Slow Web Interface โŒ

Dashboard loading slowly?

# Increase PHP memory
sudo nano /etc/php-fpm.d/zabbix.conf
php_value[memory_limit] = 256M

# Optimize MySQL
sudo nano /etc/my.cnf
[mysqld]
innodb_buffer_pool_size = 2G
innodb_log_file_size = 256M

# Increase Zabbix cache
sudo nano /etc/zabbix/zabbix_server.conf
CacheSize=512M
HistoryCacheSize=256M

sudo systemctl restart zabbix-server php-fpm mariadb

๐Ÿ“‹ Simple Commands Summary

Task | Command
🔍 Check server | systemctl status zabbix-server
🔍 Check agent | systemctl status zabbix-agent
📊 Test item | zabbix_agentd -t item.key
📤 Send value | zabbix_sender -z server -s host -k key -o value
📝 Server log | tail -f /var/log/zabbix/zabbix_server.log
📝 Agent log | tail -f /var/log/zabbix/zabbix_agentd.log
🔄 Restart all | systemctl restart zabbix-server zabbix-agent
🌐 Web UI | http://server-ip

๐Ÿ’ก Tips for Success

  1. Start Small ๐ŸŽฏ - Monitor critical items first
  2. Use Templates ๐Ÿ“‹ - Donโ€™t reinvent the wheel
  3. Baseline First ๐Ÿ“Š - Know whatโ€™s normal
  4. Tune Alerts ๐Ÿ”” - Quality over quantity
  5. Document Triggers ๐Ÿ“ - Why each alert matters
  6. Test Recovery ๐Ÿงช - Alerts must be actionable

Pro tip: Create a โ€œnoiseโ€ dashboard for alerts that fire too often. Review weekly and tune thresholds! ๐ŸŽ›๏ธ

๐Ÿ† What You Learned

Youโ€™re now a monitoring master! You can:

  • โœ… Install and configure Zabbix
  • โœ… Deploy agents everywhere
  • โœ… Create custom monitoring
  • โœ… Build dashboards
  • โœ… Configure smart alerts
  • โœ… Implement auto-remediation
  • โœ… Predict problems

๐ŸŽฏ Why This Matters

Proper monitoring provides:

  • ๐Ÿ‘๏ธ Complete visibility
  • ๐Ÿšจ Early warning system
  • ๐Ÿ“Š Performance insights
  • ๐Ÿ”ฎ Predictive capabilities
  • ๐Ÿ’ฐ Cost optimization
  • ๐Ÿ˜ด Better sleep

Last Christmas, our e-commerce site stayed up during 10x normal traffic. Zabbix alerts kicked off our scaling scripts before we even noticed the spike. Sales record broken, zero downtime! 🎄

Remember: If youโ€™re not monitoring it, itโ€™s already broken! ๐Ÿ‘๏ธ

Happy monitoring! May your alerts be meaningful and your dashboards green! ๐Ÿ“Šโœจ