📊 AlmaLinux Monitoring: Complete Prometheus & Grafana Guide for Real-Time Insights

Hey there, monitoring maestro! 🎉 Ready to transform your AlmaLinux system into an all-seeing eye that catches problems before they become disasters? Today we’re building a complete monitoring stack with Prometheus and Grafana that will give you superpowers to see everything happening in your infrastructure! 🚀

Whether you’re monitoring a single server or an entire fleet, this guide will turn your AlmaLinux system into a monitoring powerhouse that provides real-time insights and beautiful visualizations! 💪

🤔 Why is Monitoring Important?

Imagine driving a car with no dashboard – no speedometer, no fuel gauge, no warning lights! 😱 That’s what running servers without monitoring is like. You’re flying blind until something crashes!

Here’s why Prometheus & Grafana on AlmaLinux is absolutely essential:

📈 Real-Time Insights - See what’s happening right now, not yesterday
🚨 Proactive Alerting - Fix problems before users notice them
📊 Beautiful Dashboards - Visualize complex data at a glance
🔍 Historical Analysis - Understand trends and patterns over time
⚡ Performance Optimization - Identify bottlenecks and inefficiencies
💾 Capacity Planning - Know when you’ll need more resources
🛡️ Security Monitoring - Detect suspicious activities immediately
📱 24/7 Awareness - Get alerts on your phone when things go wrong

🎯 What You Need

Before we start building your monitoring empire, let’s make sure you have everything ready:

✅ AlmaLinux 9.x system (with 2+ GB RAM)
✅ Root or sudo access for installation
✅ Internet connection for downloading packages
✅ Basic Linux knowledge (files, services, networking)
✅ Systems to monitor (we’ll start with localhost)
✅ Web browser for accessing dashboards
✅ Coffee ready ☕ (this is going to be fun!)
✅ Excitement about data and visualizations! 📊

📝 Step 1: Install Prometheus Server

Let’s start by setting up Prometheus to collect all those juicy metrics! 🎯

# Create prometheus user
sudo useradd --no-create-home --shell /bin/false prometheus

# Create directories for Prometheus
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus

# Download Prometheus (check for latest version at prometheus.io)
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.47.0/prometheus-2.47.0.linux-amd64.tar.gz

# Extract the archive
tar xvf prometheus-2.47.0.linux-amd64.tar.gz
cd prometheus-2.47.0.linux-amd64/

# Copy binaries to system locations
sudo cp prometheus /usr/local/bin/
sudo cp promtool /usr/local/bin/

# Copy configuration files
sudo cp -r consoles/ /etc/prometheus/
sudo cp -r console_libraries/ /etc/prometheus/

# Set ownership
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool

# Verify installation
prometheus --version

Expected output:

prometheus, version 2.47.0 (branch: HEAD, revision: xxx)
  build user:       root@xxx
  build date:       20XX-XX-XX
  go version:       go1.20.X

Great! Prometheus is installed! 🎉
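
Before copying binaries from a downloaded tarball, it can be worth checking the archive against the project's published checksums. Here is a minimal sketch, assuming you have also saved a `sha256sum`-style checksum file (lines of the form `<hex digest>  <filename>`) from the same release page; the function names are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(sums_text, filename):
    """Find the digest listed for `filename` in sha256sum-style text."""
    for line in sums_text.splitlines():
        parts = line.split()
        # sha256sum marks binary-mode entries with a leading "*" on the name.
        if len(parts) == 2 and parts[1].lstrip("*") == filename:
            return parts[0].lower()
    return None


def verify(tarball, sums_file):
    """True only if the tarball's digest matches its entry in the checksum file."""
    wanted = expected_digest(Path(sums_file).read_text(), Path(tarball).name)
    return wanted is not None and wanted == sha256_of(tarball)
```

For example, `verify("/tmp/prometheus-2.47.0.linux-amd64.tar.gz", "/tmp/sha256sums.txt")` should return `True` before you continue; the checksum file path here is an assumption about where you saved it.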

🔧 Step 2: Configure Prometheus

Now let’s configure Prometheus to start collecting metrics:

# Create Prometheus configuration file
sudo tee /etc/prometheus/prometheus.yml << 'EOF'
# Global configuration
global:
  scrape_interval: 15s       # How often to scrape targets
  evaluation_interval: 15s   # How often to evaluate rules
  scrape_timeout: 10s        # Timeout for scraping

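# Alertmanager configuration (Alertmanager is not installed in this guide;
# alerts are only delivered once one is listening on port 9093)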
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - localhost:9093

# Load rules once and periodically evaluate them
rule_files:
  - "/etc/prometheus/rules/*.yml"

# Scrape configurations
scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: 'prometheus-server'

  # Scrape node exporter for system metrics
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: 'almalinux-host'

  # Scrape application metrics (the sample app in Example 1 listens on port 8000)
  - job_name: 'applications'
    static_configs:
      - targets: ['localhost:8000']
        labels:
          app: 'my-app'
          env: 'production'

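  # Scrape Docker containers (only if Docker is installed and its metrics endpoint
  # is enabled on port 9323; otherwise remove this job, since the target will show as DOWN)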
  - job_name: 'docker'
    static_configs:
      - targets: ['localhost:9323']
        labels:
          instance: 'docker-host'
EOF

# Create systemd service file for Prometheus
sudo tee /etc/systemd/system/prometheus.service << 'EOF'
[Unit]
Description=Prometheus Monitoring System
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --storage.tsdb.retention.time=30d \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.enable-lifecycle

ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd and start Prometheus
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus

# Check status
sudo systemctl status prometheus

Perfect! Prometheus is running on port 9090! 🌟
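
One detail of the global settings is easy to get wrong: Prometheus rejects a configuration whose `scrape_timeout` is longer than its `scrape_interval`. The sketch below parses Prometheus-style durations and checks that relationship. It is a simplified illustration, not the parser Prometheus itself uses, and the helper names are invented for this example.

```python
import re

# Seconds per unit for durations such as "15s", "5m" or "1h30m".
_UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600,
                 "d": 86400, "w": 604800, "y": 31536000}
_PART = re.compile(r"(\d+)(ms|[smhdwy])")


def parse_duration(text):
    """Convert a duration like "15s", "5m" or "1h30m" to seconds."""
    pos, total = 0, 0.0
    for match in _PART.finditer(text):
        if match.start() != pos:
            raise ValueError(f"unexpected text in duration {text!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration {text!r}")
    return total


def check_scrape_timing(interval, timeout):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    if parse_duration(timeout) > parse_duration(interval):
        problems.append(
            f"scrape_timeout {timeout} exceeds scrape_interval {interval}")
    return problems


# Values from the prometheus.yml above.
print(check_scrape_timing("15s", "10s"))  # prints []
```

For the real file, `promtool check config /etc/prometheus/prometheus.yml` remains the authoritative check.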

🌟 Step 3: Install Node Exporter for System Metrics

Let’s add Node Exporter to collect detailed system metrics:

# Download Node Exporter
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz

# Extract and install
tar xvf node_exporter-1.6.1.linux-amd64.tar.gz
sudo cp node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/node_exporter

# Create systemd service for Node Exporter
sudo tee /etc/systemd/system/node_exporter.service << 'EOF'
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/node_exporter \
    --collector.filesystem.mount-points-exclude="^/(dev|proc|sys|run)($|/)" \
    --collector.netclass.ignored-devices="^(veth.*|br.*|docker.*|virbr.*)" \
    --collector.systemd \
    --collector.processes

Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

# Start Node Exporter
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter

# Verify it's working
curl -s http://localhost:9100/metrics | grep "^node_" | head -n 20

Excellent! Node Exporter is collecting system metrics! 📈
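
The `/metrics` output is plain text: comment lines start with `#`, and every other line is a metric name, optional `{label="value"}` pairs, and a number. As a rough sketch of how that format can be read, here is a small parser. It skips edge cases such as unescaping label values, and the sample text is made up to resemble Node Exporter output.

```python
import re

_SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_LINE.match(line)
        if match is None:
            continue  # not a sample line this simple parser understands
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        yield name, labels, float(value)


# Made-up excerpt in the style of Node Exporter output.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_cpu_seconds_total{cpu="0",mode="user"} 56.7
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```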

✅ Step 4: Install and Configure Grafana

Now for the fun part – beautiful dashboards with Grafana!

# Add Grafana repository
sudo tee /etc/yum.repos.d/grafana.repo << 'EOF'
[grafana]
name=grafana
baseurl=https://rpm.grafana.com
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://rpm.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF

# Install Grafana
sudo dnf install -y grafana

# Start and enable Grafana
sudo systemctl enable grafana-server
sudo systemctl start grafana-server

# Check status
sudo systemctl status grafana-server

# Configure firewall
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --permanent --add-port=9100/tcp
sudo firewall-cmd --reload

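Access Grafana at http://your-server:3000 (default login: admin/admin). Grafana asks you to change this password at first login; if you do, use the new password wherever later examples pass admin:admin.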

Now let’s create some awesome dashboards:

# Create dashboard configuration
cat > /tmp/system-dashboard.json << 'EOF'
{
  "dashboard": {
    "title": "System Monitoring Dashboard",
    "panels": [
      {
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
        "title": "CPU Usage",
        "targets": [
          {
            "expr": "100 - (avg(irate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
            "legendFormat": "CPU Usage %"
          }
        ],
        "type": "graph"
      },
      {
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
        "title": "Memory Usage",
        "targets": [
          {
            "expr": "(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100",
            "legendFormat": "Memory Usage %"
          }
        ],
        "type": "graph"
      },
      {
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 8},
        "title": "Disk I/O",
        "targets": [
          {
            "expr": "rate(node_disk_io_time_seconds_total[5m]) * 100",
            "legendFormat": "{{device}}"
          }
        ],
        "type": "graph"
      },
      {
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 8},
        "title": "Network Traffic",
        "targets": [
          {
            "expr": "rate(node_network_receive_bytes_total[5m])",
            "legendFormat": "RX {{device}}"
          },
          {
            "expr": "rate(node_network_transmit_bytes_total[5m])",
            "legendFormat": "TX {{device}}"
          }
        ],
        "type": "graph"
      }
    ]
  }
}
EOF

# Import dashboard via API (after setting up Grafana)
# You'll need an API token from the Grafana UI first
# (recent Grafana releases issue these as service account tokens)

Fantastic! Your dashboards are ready! 🎯
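
The JSON file above still has to be sent to Grafana. Below is a minimal sketch of that import step using only the Python standard library. It assumes token authentication via a `Bearer` header and targets the same `/api/dashboards/db` endpoint used later in this guide; the function names and the token value are placeholders.

```python
import json
import urllib.request


def build_import_request(grafana_url, token, document, overwrite=True):
    """Build (but do not send) a POST to Grafana's dashboard import endpoint."""
    payload = {
        # Accept a bare dashboard or the {"dashboard": {...}} wrapper used above.
        "dashboard": dict(document.get("dashboard", document)),
        "overwrite": overwrite,
    }
    # A null id asks Grafana to create a new dashboard.
    payload["dashboard"].setdefault("id", None)
    return urllib.request.Request(
        url=grafana_url.rstrip("/") + "/api/dashboards/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )


def import_dashboard(grafana_url, token, path):
    """Send the request; urllib raises HTTPError on a non-2xx response."""
    with open(path) as handle:
        request = build_import_request(grafana_url, token, json.load(handle))
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

A call would look like `import_dashboard("http://localhost:3000", "<your-token>", "/tmp/system-dashboard.json")`, where the token is one you created in the Grafana UI.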

🎮 Quick Examples

Example 1: Custom Application Metrics

# Create a Python app with Prometheus metrics
cat > app_with_metrics.py << 'EOF'
#!/usr/bin/env python3
from prometheus_client import start_http_server, Counter, Histogram, Gauge
import time
import random

# Define metrics
request_count = Counter('app_requests_total', 'Total requests', ['method', 'endpoint'])
request_duration = Histogram('app_request_duration_seconds', 'Request duration', ['method', 'endpoint'])
active_users = Gauge('app_active_users', 'Number of active users')
error_count = Counter('app_errors_total', 'Total errors', ['type'])

def process_request():
    """Simulate processing a request"""
    method = random.choice(['GET', 'POST', 'PUT', 'DELETE'])
    endpoint = random.choice(['/api/users', '/api/products', '/api/orders'])

    # Track request
    request_count.labels(method=method, endpoint=endpoint).inc()

    # Simulate processing time
    with request_duration.labels(method=method, endpoint=endpoint).time():
        time.sleep(random.uniform(0.1, 1.0))

    # Randomly generate errors
    if random.random() < 0.1:
        error_count.labels(type='server_error').inc()

    # Update active users
    active_users.set(random.randint(10, 100))

if __name__ == '__main__':
    # Start Prometheus metrics server
    start_http_server(8000)
    print("🎯 Metrics server started on port 8000")

    # Continuously generate metrics
    while True:
        process_request()
        time.sleep(1)
EOF

# Install Python Prometheus client
# (if pip3 is missing, install it first with: sudo dnf install -y python3-pip)
pip3 install prometheus-client

# Run the application
python3 app_with_metrics.py &

This exposes custom application metrics! 📊
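
Counters such as `app_requests_total` only ever go up, so what you usually graph is their per-second rate. The sketch below shows the basic idea behind that calculation, including the case where the counter drops back because the app restarted. It is deliberately simpler than PromQL's `rate()`, which also extrapolates to the edges of the time window, and the scrape values are invented.

```python
def counter_rate(samples):
    """Per-second increase across (timestamp, value) samples of one counter.

    A drop in value is treated as a counter reset (for example an app
    restart); the new value then counts as the increase since the reset.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        increase += (current - previous) if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Invented scrapes taken 15 s apart; the app restarts before t=45.
scrapes = [(0, 100), (15, 115), (30, 130), (45, 5), (60, 20)]
print(counter_rate(scrapes))
```

Here the increases are 15, 15, 5 (after the reset) and 15, so 50 requests over 60 seconds.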

Example 2: Alert Rules Configuration

# Create alert rules
sudo mkdir -p /etc/prometheus/rules
sudo tee /etc/prometheus/rules/alerts.yml << 'EOF'
groups:
  - name: system_alerts
    interval: 30s
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is above 80% (current value: {{ $value }}%)"

      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage detected"
          description: "Memory usage is above 90% (current value: {{ $value }}%)"

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space"
          description: "Less than 10% disk space remaining on root partition"

      - alert: ServiceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service is down"
          description: "{{ $labels.job }} on {{ $labels.instance }} is down"

      - alert: HighNetworkTraffic
        expr: rate(node_network_receive_bytes_total[5m]) + rate(node_network_transmit_bytes_total[5m]) > 100000000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High network traffic"
          description: "Network traffic exceeds 100MB/s"
EOF

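# Validate the rules, then reload Prometheus through the HTTP endpoint
# enabled by --web.enable-lifecycle in the unit file
promtool check rules /etc/prometheus/rules/alerts.yml
curl -X POST http://localhost:9090/-/reload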

This sets up comprehensive alerting! 🚨
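
The `for:` clause in each rule is what keeps short spikes from paging you: an alert is first "pending" and only becomes "firing" once its expression has stayed true for the whole duration. The following sketch simulates that behavior for a list of evaluations. It is an illustration of the idea, not Prometheus code, and the evaluation data is invented.

```python
def alert_states(evaluations, hold_seconds):
    """Map (timestamp, condition_is_true) evaluations to alert states.

    An alert is "pending" from the first evaluation where the condition
    holds and "firing" once it has held without a break for hold_seconds.
    """
    states, active_since = [], None
    for timestamp, condition in evaluations:
        if not condition:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held = timestamp - active_since
        states.append("firing" if held >= hold_seconds else "pending")
    return states


# Evaluations every 30 s with `for: 1m`, as in the ServiceDown rule above.
checks = [(0, False), (30, True), (60, True), (90, True), (120, False), (150, True)]
print(alert_states(checks, hold_seconds=60))
# prints ['inactive', 'pending', 'pending', 'firing', 'inactive', 'pending']
```

Note how the single false evaluation at t=120 sends the alert back to inactive, so the timer starts over at t=150.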

Example 3: Advanced Grafana Dashboard

# Create advanced monitoring script
cat > setup_advanced_dashboard.sh << 'EOF'
#!/bin/bash
# Setup advanced Grafana dashboard

# Set Grafana API credentials
GRAFANA_URL="http://localhost:3000"
GRAFANA_USER="admin"
GRAFANA_PASS="admin"

# Create data source for Prometheus
curl -X POST \
  -H "Content-Type: application/json" \
  -u "${GRAFANA_USER}:${GRAFANA_PASS}" \
  -d '{
    "name": "Prometheus",
    "type": "prometheus",
    "url": "http://localhost:9090",
    "access": "proxy",
    "isDefault": true
  }' \
  "${GRAFANA_URL}/api/datasources"

# Create folder for dashboards
curl -X POST \
  -H "Content-Type: application/json" \
  -u "${GRAFANA_USER}:${GRAFANA_PASS}" \
  -d '{
    "title": "System Monitoring"
  }' \
  "${GRAFANA_URL}/api/folders"

# Import comprehensive dashboard
cat > comprehensive_dashboard.json << 'JSON'
{
  "dashboard": {
    "title": "Comprehensive System Monitoring",
    "tags": ["system", "monitoring"],
    "timezone": "browser",
    "panels": [
      {
        "title": "System Overview",
        "type": "stat",
        "targets": [
          {"expr": "up", "legendFormat": "Services Up"}
        ]
      },
      {
        "title": "CPU by Core",
        "type": "graph",
        "targets": [
          {"expr": "irate(node_cpu_seconds_total[5m])", "legendFormat": "CPU {{cpu}} - {{mode}}"}
        ]
      },
      {
        "title": "Memory Breakdown",
        "type": "piechart",
        "targets": [
          {"expr": "node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes", "legendFormat": "Used"},
          {"expr": "node_memory_MemAvailable_bytes", "legendFormat": "Available"}
        ]
      },
      {
        "title": "Disk Usage by Mount",
        "type": "bargauge",
        "targets": [
          {"expr": "(node_filesystem_size_bytes - node_filesystem_avail_bytes) / node_filesystem_size_bytes * 100", "legendFormat": "{{mountpoint}}"}
        ]
      }
    ]
  },
  "overwrite": true
}
JSON

# Import the dashboard
curl -X POST \
  -H "Content-Type: application/json" \
  -u "${GRAFANA_USER}:${GRAFANA_PASS}" \
  -d @comprehensive_dashboard.json \
  "${GRAFANA_URL}/api/dashboards/db"

echo "✅ Advanced dashboard created successfully!"
EOF

chmod +x setup_advanced_dashboard.sh
./setup_advanced_dashboard.sh

This creates professional-grade dashboards! 📈
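
If the PromQL in these panels looks cryptic, it helps to redo the arithmetic by hand. The sketch below mirrors the shape of the CPU, memory and disk expressions used in this guide with plain numbers. The readings are invented, and the CPU helper uses a simple average rate where the dashboards use `irate`.

```python
def cpu_usage_percent(idle_before, idle_after, elapsed_seconds):
    """Mirror 100 - avg(idle rate) * 100 for per-core idle counters (seconds)."""
    idle_fractions = [
        (after - before) / elapsed_seconds
        for before, after in zip(idle_before, idle_after)
    ]
    return 100 - (sum(idle_fractions) / len(idle_fractions)) * 100


def used_percent(total_bytes, available_bytes):
    """Mirror (total - available) / total * 100, as in the memory and disk panels."""
    return (total_bytes - available_bytes) / total_bytes * 100


# Invented readings: two cores sampled 60 s apart, and an 8 GiB host
# with 2 GiB still available.
print(cpu_usage_percent([1000.0, 2000.0], [1045.0, 2030.0], 60))  # prints 37.5
print(used_percent(8 * 1024**3, 2 * 1024**3))                     # prints 75.0
```

The cores were idle for 45 s and 30 s of the 60 s window, so they averaged 62.5% idle and 37.5% busy.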

🚨 Fix Common Problems

Problem 1: Prometheus Can’t Scrape Targets

Symptoms: Targets show as “DOWN” in Prometheus

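# Check target status (the JSON API is easier to read in a shell than the /targets web page)
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, health, lastError}'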

# Debug connectivity
curl http://localhost:9100/metrics  # Node exporter
curl http://localhost:3000/metrics  # Grafana

# Check firewall rules
sudo firewall-cmd --list-all

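# Rule out SELinux: look for recent denials before changing anything
sudo ausearch -m avc -ts recent
sudo setenforce 0  # Temporary test only; switch back with: sudo setenforce 1
# If this fixes scraping, create a proper SELinux policy instead of staying permissive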

# Check Prometheus logs
sudo journalctl -u prometheus -n 50 --no-pager

# Verify configuration
promtool check config /etc/prometheus/prometheus.yml
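
When you have more than a handful of targets, it is handy to filter the `/api/v1/targets` response down to the ones that are failing. Here is a small sketch of that filter. The sample response is trimmed and invented (real responses carry more fields per target), and `down_targets` is a name made up for this example.

```python
import json


def down_targets(payload):
    """Return (job, instance, last_error) for every active target that is not up."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (t["labels"].get("job"), t["labels"].get("instance"), t.get("lastError", ""))
        for t in targets
        if t.get("health") != "up"
    ]


# Trimmed, invented response in the shape returned by /api/v1/targets.
SAMPLE = json.loads("""
{"status": "success", "data": {"activeTargets": [
  {"labels": {"job": "node-exporter", "instance": "almalinux-host"},
   "health": "up", "lastError": ""},
  {"labels": {"job": "docker", "instance": "docker-host"},
   "health": "down", "lastError": "connection refused"}
]}}
""")

for job, instance, error in down_targets(SAMPLE):
    print(f"{job} on {instance}: {error}")
```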

Problem 2: High Memory Usage by Prometheus

Symptoms: Prometheus consuming too much RAM

# Check current memory usage
ps aux | grep prometheus

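# Limit Go runtime resources with a systemd drop-in
# (appending to the unit file would put these lines under [Install], where they are ignored)
sudo mkdir -p /etc/systemd/system/prometheus.service.d
sudo tee /etc/systemd/system/prometheus.service.d/limits.conf << 'EOF'
[Service]
Environment="GOMAXPROCS=2"
Environment="GOMEMLIMIT=1GiB"
EOF

# Reduce retention time
sudo sed -i 's/retention.time=30d/retention.time=7d/' /etc/systemd/system/prometheus.service

# Apply both changes
sudo systemctl daemon-reload
sudo systemctl restart prometheus

# Last resort: wipe ALL stored metrics (the TSDB lives directly in /var/lib/prometheus)
sudo systemctl stop prometheus
sudo find /var/lib/prometheus -mindepth 1 -delete
sudo systemctl start prometheus

# WAL compression is already on by default since Prometheus 2.20
# (controlled by the --storage.tsdb.wal-compression flag). There is no
# compression setting in prometheus.yml, and adding one makes the config invalid.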
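
To judge how much a shorter retention actually saves, a back-of-the-envelope estimate helps. The sketch below multiplies retention time by samples ingested per second by bytes per sample. The default of 2 bytes per sample is an assumption taken from the upper end of the commonly quoted 1 to 2 byte range, and the series count is invented. Note that this estimates disk usage; memory depends mostly on the number of active series rather than on retention.

```python
def estimate_disk_bytes(retention_days, active_series,
                        scrape_interval_seconds, bytes_per_sample=2.0):
    """Rough TSDB size: retention seconds x samples per second x bytes per sample."""
    samples_per_second = active_series / scrape_interval_seconds
    return retention_days * 86400 * samples_per_second * bytes_per_sample


# Invented host exposing 1,500 series, scraped every 15 s as configured in this guide.
for days in (30, 7):
    gib = estimate_disk_bytes(days, 1500, 15) / 1024**3
    print(f"{days}d retention: about {gib:.2f} GiB")
```

You can read your own series count from the `prometheus_tsdb_head_series` metric and plug it in.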

Problem 3: Grafana Dashboards Not Loading

Symptoms: Dashboards show “No Data” or errors

# Check Grafana data source configuration
curl -u admin:admin http://localhost:3000/api/datasources

# Test Prometheus connectivity through Grafana's data source proxy
# (assumes the Prometheus data source has id 1; check the output of the previous command)
curl -s -u admin:admin 'http://localhost:3000/api/datasources/proxy/1/api/v1/query?query=up'

# Check time synchronization
timedatectl status
sudo systemctl restart chronyd

# Restart services
sudo systemctl restart prometheus
sudo systemctl restart grafana-server

# Check logs
sudo journalctl -u grafana-server -n 50 --no-pager
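
A "No Data" panel usually comes down to one of three things: the query fails, it succeeds but matches no series, or the clocks disagree so the data falls outside the visible time range. The sketch below turns an instant-query response into a one-line diagnosis along those lines. It assumes a vector result like the one returned for `up`; the 30-second skew threshold and the sample payload are arbitrary choices for illustration.

```python
def diagnose_query(payload, local_now, max_skew_seconds=30):
    """Turn a Prometheus instant-query response into a short diagnosis string."""
    if payload.get("status") != "success":
        return f"query failed: {payload.get('error', 'unknown error')}"
    result = payload.get("data", {}).get("result", [])
    if not result:
        return "query succeeded but returned no series"
    # Each instant-vector sample is [unix_timestamp, "value"].
    skew = abs(local_now - float(result[0]["value"][0]))
    if skew > max_skew_seconds:
        return f"{len(result)} series, but clocks differ by about {skew:.0f}s"
    return f"{len(result)} series returned"


# Invented response in the shape returned by /api/v1/query?query=up.
SAMPLE = {
    "status": "success",
    "data": {"resultType": "vector", "result": [
        {"metric": {"__name__": "up", "job": "prometheus",
                    "instance": "prometheus-server"},
         "value": [1700000000.0, "1"]},
    ]},
}

print(diagnose_query(SAMPLE, local_now=1700000002.0))  # prints 1 series returned
print(diagnose_query(SAMPLE, local_now=1700000300.0))
```

In real use you would pass `time.time()` as `local_now` on the machine whose clock you want to compare with the Prometheus server.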

Problem 4: Alerts Not Firing

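Symptoms: Alert conditions met but no notifications. Prometheus only evaluates the rules; notifications are sent by Alertmanager, which this guide does not install, so first confirm that one is listening on localhost:9093.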

# Check alert rules syntax
promtool check rules /etc/prometheus/rules/alerts.yml

# Verify alerts are loaded
curl http://localhost:9090/api/v1/rules | jq .

# Check alert status
curl http://localhost:9090/api/v1/alerts | jq .

# Send a test alert to Alertmanager (only if Alertmanager is installed and running)
amtool alert add alertname=test severity=critical --alertmanager.url=http://localhost:9093

# Force reload rules
sudo kill -HUP $(pidof prometheus)

# Check Prometheus logs for rule evaluation
sudo journalctl -u prometheus | grep -i "rule\|alert"

📋 Simple Commands Summary

Command	Purpose
`sudo systemctl status prometheus`	Check Prometheus status
`curl http://localhost:9090/metrics`	View Prometheus metrics
`curl http://localhost:9100/metrics`	View Node Exporter metrics
`promtool check config /etc/prometheus/prometheus.yml`	Validate Prometheus config
`sudo systemctl restart grafana-server`	Restart Grafana
`curl -u admin:admin http://localhost:3000/api/datasources`	Check Grafana data sources
`sudo journalctl -u prometheus -f`	View Prometheus logs
`curl -s http://localhost:9090/api/v1/targets`	Check scrape targets
`promtool query instant http://localhost:9090 'up'`	Query Prometheus
`sudo firewall-cmd --add-port=9090/tcp --permanent`	Open Prometheus port

💡 Tips for Success

🎯 Start Small: Begin with basic metrics before adding complexity

🔍 Label Everything: Use descriptive labels for better organization

📊 Dashboard Design: Keep dashboards focused and avoid clutter

🛡️ Security First: Use authentication and HTTPS in production

🚀 Performance Tune: Adjust scrape intervals based on your needs

📝 Document Queries: Save important PromQL queries for reuse

🔄 Regular Maintenance: Clean up old data and optimize storage

⚡ Alert Wisely: Avoid alert fatigue with well-tuned thresholds

🏆 What You Learned

Congratulations! You’ve successfully mastered monitoring with Prometheus and Grafana on AlmaLinux! 🎉

✅ Installed Prometheus for metrics collection
✅ Configured Node Exporter for system metrics
✅ Set up Grafana for beautiful visualizations
✅ Created custom dashboards for your specific needs
✅ Implemented alerting rules for proactive monitoring
✅ Added application metrics with custom exporters
✅ Troubleshot common issues and optimized performance
✅ Built comprehensive monitoring infrastructure

🎯 Why This Matters

Monitoring is the foundation of reliable infrastructure! 🌟 With your AlmaLinux monitoring stack, you now have:

Complete visibility into system and application performance
Early warning system for potential problems
Data-driven insights for optimization decisions
Professional dashboards that impress stakeholders
Foundation for SRE practices and reliability engineering

You’re now equipped to run production systems with confidence, knowing you’ll spot issues before they impact users! Your monitoring skills put you in the league of professional DevOps engineers and SREs! 🚀

Keep monitoring, keep improving, and remember – what gets measured gets managed! You’ve got this! ⭐🙌
