📊 AlmaLinux Monitoring: Complete Prometheus & Grafana Guide for Real-Time Insights
Published Sep 18, 2025

Master system monitoring on AlmaLinux! Learn Prometheus metrics collection, Grafana dashboards, alerting, and performance optimization. Complete guide with real examples and best practices.

Hey there, monitoring maestro! 🎉 Ready to transform your AlmaLinux system into an all-seeing eye that catches problems before they become disasters? Today we're building a complete monitoring stack with Prometheus and Grafana that will give you superpowers to see everything happening in your infrastructure! 🚀

Whether you're monitoring a single server or an entire fleet, this guide will turn your AlmaLinux system into a monitoring powerhouse that provides real-time insights and beautiful visualizations! 💪

🤔 Why is Monitoring Important?

Imagine driving a car with no dashboard: no speedometer, no fuel gauge, no warning lights! 😱 That's what running servers without monitoring is like. You're flying blind until something crashes!

Here's why Prometheus & Grafana on AlmaLinux is absolutely essential:

  • 📈 Real-Time Insights - See what's happening right now, not yesterday
  • 🚨 Proactive Alerting - Fix problems before users notice them
  • 📊 Beautiful Dashboards - Visualize complex data at a glance
  • 🔍 Historical Analysis - Understand trends and patterns over time
  • ⚡ Performance Optimization - Identify bottlenecks and inefficiencies
  • 💾 Capacity Planning - Know when you'll need more resources
  • 🛡️ Security Monitoring - Detect suspicious activities immediately
  • 📱 24/7 Awareness - Get alerts on your phone when things go wrong

🎯 What You Need

Before we start building your monitoring empire, let's make sure you have everything ready:

✅ AlmaLinux 9.x system (with 2+ GB RAM)
✅ Root or sudo access for installation
✅ Internet connection for downloading packages
✅ Basic Linux knowledge (files, services, networking)
✅ Systems to monitor (we'll start with localhost)
✅ Web browser for accessing dashboards
✅ Coffee ready ☕ (this is going to be fun!)
✅ Excitement about data and visualizations! 📊

📝 Step 1: Install Prometheus Server

Let's start by setting up Prometheus to collect all those juicy metrics! 🎯

# Create prometheus user
sudo useradd --no-create-home --shell /bin/false prometheus

# Create directories for Prometheus
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus

# Download Prometheus (check for latest version at prometheus.io)
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.47.0/prometheus-2.47.0.linux-amd64.tar.gz

# Extract the archive
tar xvf prometheus-2.47.0.linux-amd64.tar.gz
cd prometheus-2.47.0.linux-amd64/

# Copy binaries to system locations
sudo cp prometheus /usr/local/bin/
sudo cp promtool /usr/local/bin/

# Copy configuration files
sudo cp -r consoles/ /etc/prometheus/
sudo cp -r console_libraries/ /etc/prometheus/

# Set ownership
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool

# Verify installation
prometheus --version

Expected output:

prometheus, version 2.47.0 (branch: HEAD, revision: xxx)
  build user:       root@xxx
  build date:       20XX-XX-XX
  go version:       go1.20.X

Great! Prometheus is installed! 🎉

🔧 Step 2: Configure Prometheus

Now let's configure Prometheus to start collecting metrics:

# Create Prometheus configuration file
sudo tee /etc/prometheus/prometheus.yml << 'EOF'
# Global configuration
global:
  scrape_interval: 15s       # How often to scrape targets
  evaluation_interval: 15s   # How often to evaluate rules
  scrape_timeout: 10s        # Timeout for scraping

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          - localhost:9093

# Load rules once and periodically evaluate them
rule_files:
  - "/etc/prometheus/rules/*.yml"

# Scrape configurations
scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: 'prometheus-server'

  # Scrape node exporter for system metrics
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: 'almalinux-host'

  # Scrape application metrics
  - job_name: 'applications'
    static_configs:
      - targets: ['localhost:8000']
        labels:
          app: 'my-app'
          env: 'production'

  # Scrape Docker containers (if Docker is installed)
  - job_name: 'docker'
    static_configs:
      - targets: ['localhost:9323']
        labels:
          instance: 'docker-host'
EOF

# Create systemd service file for Prometheus
sudo tee /etc/systemd/system/prometheus.service << 'EOF'
[Unit]
Description=Prometheus Monitoring System
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file /etc/prometheus/prometheus.yml \
    --storage.tsdb.path /var/lib/prometheus/ \
    --storage.tsdb.retention.time=30d \
    --web.console.templates=/etc/prometheus/consoles \
    --web.console.libraries=/etc/prometheus/console_libraries \
    --web.enable-lifecycle

Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd and start Prometheus
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus

# Check status
sudo systemctl status prometheus

Perfect! Prometheus is running on port 9090! 🌟
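Once the server is up, a quick way to confirm it is actually scraping is to ask its HTTP API for the `up` metric. The following is a minimal sketch, assuming Prometheus listens on `localhost:9090` as configured above; the helper names (`build_query_url`, `extract_values`, `query`) are made up for illustration, and the demo runs on a hand-written sample response so it works without a live server.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # assumption: default port used in this guide


def build_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def extract_values(payload: dict) -> dict:
    """Map each series' 'job' label to its float value (one series per job assumed)."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        series["metric"].get("job", ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


def query(expr: str) -> dict:
    """Run the query against a live server (requires Prometheus to be running)."""
    with urllib.request.urlopen(build_query_url(expr), timeout=5) as resp:
        return extract_values(json.load(resp))


# Offline demonstration with a hand-written response in the API's JSON shape
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "prometheus"}, "value": [1700000000, "1"]},
            {"metric": {"__name__": "up", "job": "node-exporter"}, "value": [1700000000, "0"]},
        ],
    },
}
print(build_query_url("up"))
print(extract_values(sample))
```

A value of 1 means the last scrape of that job succeeded, and 0 means it failed. Against a real server, `query("up")` should return the same kind of mapping.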

🌟 Step 3: Install Node Exporter for System Metrics

Let's add Node Exporter to collect detailed system metrics:

# Download Node Exporter
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz

# Extract and install
tar xvf node_exporter-1.6.1.linux-amd64.tar.gz
sudo cp node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/
sudo chown prometheus:prometheus /usr/local/bin/node_exporter

# Create systemd service for Node Exporter
sudo tee /etc/systemd/system/node_exporter.service << 'EOF'
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/node_exporter \
    --collector.filesystem.mount-points-exclude="^/(dev|proc|sys|run)($|/)" \
    --collector.netclass.ignored-devices="^(veth.*|br.*|docker.*|virbr.*)" \
    --collector.systemd \
    --collector.processes

Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

# Start Node Exporter
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter

# Verify it's working
curl http://localhost:9100/metrics | grep "node_"

Excellent! Node Exporter is collecting system metrics! 📈
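The `/metrics` page that `curl` returns is plain text: comment lines start with `#`, and every other line is a series name (optionally with labels) followed by a value. Here is a rough sketch of reading that format, assuming lines carry no trailing timestamp; `parse_metrics` is an illustrative helper, not part of any Prometheus library, and the sample values are invented.

```python
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_memory_MemTotal_bytes 4.1e+09
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
"""


def parse_metrics(text: str) -> dict:
    """Parse 'series value' lines, skipping comments and blank lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the LAST space so label values containing spaces stay intact
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


metrics = parse_metrics(SAMPLE)
for series, value in metrics.items():
    print(series, value)
```

For real work the official client libraries ship a proper parser; this only shows what the format looks like.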

✅ Step 4: Install and Configure Grafana

Now for the fun part: beautiful dashboards with Grafana!

# Add Grafana repository
sudo tee /etc/yum.repos.d/grafana.repo << 'EOF'
[grafana]
name=grafana
baseurl=https://rpm.grafana.com
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://rpm.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF

# Install Grafana
sudo dnf install -y grafana

# Start and enable Grafana
sudo systemctl enable grafana-server
sudo systemctl start grafana-server

# Check status
sudo systemctl status grafana-server

# Configure firewall
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --permanent --add-port=9100/tcp
sudo firewall-cmd --reload

Access Grafana at http://your-server:3000 (default login: admin/admin)

Now let's create some awesome dashboards:

# Create dashboard configuration
cat > /tmp/system-dashboard.json << 'EOF'
{
  "dashboard": {
    "title": "System Monitoring Dashboard",
    "panels": [
      {
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0},
        "title": "CPU Usage",
        "targets": [
          {
            "expr": "100 - (avg(irate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)",
            "legendFormat": "CPU Usage %"
          }
        ],
        "type": "graph"
      },
      {
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0},
        "title": "Memory Usage",
        "targets": [
          {
            "expr": "(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100",
            "legendFormat": "Memory Usage %"
          }
        ],
        "type": "graph"
      },
      {
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 8},
        "title": "Disk I/O",
        "targets": [
          {
            "expr": "rate(node_disk_io_time_seconds_total[5m]) * 100",
            "legendFormat": "{{device}}"
          }
        ],
        "type": "graph"
      },
      {
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 8},
        "title": "Network Traffic",
        "targets": [
          {
            "expr": "rate(node_network_receive_bytes_total[5m])",
            "legendFormat": "RX {{device}}"
          },
          {
            "expr": "rate(node_network_transmit_bytes_total[5m])",
            "legendFormat": "TX {{device}}"
          }
        ],
        "type": "graph"
      }
    ]
  }
}
EOF

# Import dashboard via API (after setting up Grafana)
# You'll need to get an API key from Grafana UI first

Fantastic! Your dashboards are ready! 🎯
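The import step itself is only hinted at in the comments above, so here is one possible sketch of it. It assumes Grafana runs on `localhost:3000` and that you have created an API or service account token in the Grafana UI; `build_import_request` and the `YOUR_TOKEN` placeholder are hypothetical. The code only prepares the request, so it is safe to run without a Grafana instance.

```python
import json
import urllib.request

GRAFANA_URL = "http://localhost:3000"  # assumption: Grafana on its default port


def build_import_request(dashboard_json: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to Grafana's dashboard import endpoint."""
    payload = json.loads(dashboard_json)
    payload["dashboard"].setdefault("id", None)  # a null id asks Grafana to create a new dashboard
    payload["overwrite"] = True                  # replace an existing dashboard with the same title or uid
    return urllib.request.Request(
        f"{GRAFANA_URL}/api/dashboards/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Offline demonstration with a tiny stand-in for /tmp/system-dashboard.json
sample_json = '{"dashboard": {"title": "System Monitoring Dashboard", "panels": []}}'
request = build_import_request(sample_json, token="YOUR_TOKEN")  # placeholder token
print(request.get_method(), request.full_url)
# To send it to a live Grafana: urllib.request.urlopen(request, timeout=5)
```

In practice you would pass the contents of `/tmp/system-dashboard.json` instead of the sample string.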

🎮 Quick Examples

Example 1: Custom Application Metrics

# Create a Python app with Prometheus metrics
cat > app_with_metrics.py << 'EOF'
#!/usr/bin/env python3
from prometheus_client import start_http_server, Counter, Histogram, Gauge
import time
import random

# Define metrics
request_count = Counter('app_requests_total', 'Total requests', ['method', 'endpoint'])
request_duration = Histogram('app_request_duration_seconds', 'Request duration', ['method', 'endpoint'])
active_users = Gauge('app_active_users', 'Number of active users')
error_count = Counter('app_errors_total', 'Total errors', ['type'])

def process_request():
    """Simulate processing a request"""
    method = random.choice(['GET', 'POST', 'PUT', 'DELETE'])
    endpoint = random.choice(['/api/users', '/api/products', '/api/orders'])

    # Track request
    request_count.labels(method=method, endpoint=endpoint).inc()

    # Simulate processing time
    with request_duration.labels(method=method, endpoint=endpoint).time():
        time.sleep(random.uniform(0.1, 1.0))

    # Randomly generate errors
    if random.random() < 0.1:
        error_count.labels(type='server_error').inc()

    # Update active users
    active_users.set(random.randint(10, 100))

if __name__ == '__main__':
    # Start Prometheus metrics server
    start_http_server(8000)
    print("🎯 Metrics server started on port 8000")

    # Continuously generate metrics
    while True:
        process_request()
        time.sleep(1)
EOF

# Install Python Prometheus client
pip3 install prometheus-client

# Run the application
python3 app_with_metrics.py &

This exposes custom application metrics! 📊
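The `Histogram` in that app is the least obvious of the metric types: it exposes cumulative bucket counters (each bucket counts every observation at or below its upper bound) plus a running count and sum. The sketch below imitates that bookkeeping with the standard library only. The bucket bounds are hypothetical and are not the client library's defaults.

```python
import math

# Hypothetical upper bounds in seconds; the real client ships its own defaults
BUCKETS = (0.1, 0.25, 0.5, 1.0, math.inf)


def summarize(durations):
    """Return cumulative bucket counts, observation count, and sum of observations."""
    buckets = {le: sum(1 for d in durations if d <= le) for le in BUCKETS}
    return buckets, len(durations), sum(durations)


# Five simulated request durations in seconds
buckets, count, total = summarize([0.05, 0.2, 0.3, 0.8, 1.5])
print(buckets)
print(count, total)
```

Because the buckets are cumulative, the last one (infinity) always equals the total count, which is what lets PromQL estimate quantiles from them.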

Example 2: Alert Rules Configuration

# Create alert rules
sudo mkdir -p /etc/prometheus/rules
sudo tee /etc/prometheus/rules/alerts.yml << 'EOF'
groups:
  - name: system_alerts
    interval: 30s
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is above 80% (current value: {{ $value }}%)"

      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage detected"
          description: "Memory usage is above 90% (current value: {{ $value }}%)"

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space"
          description: "Less than 10% disk space remaining on root partition"

      - alert: ServiceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Service is down"
          description: "{{ $labels.job }} on {{ $labels.instance }} is down"

      - alert: HighNetworkTraffic
        expr: rate(node_network_receive_bytes_total[5m]) + rate(node_network_transmit_bytes_total[5m]) > 100000000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High network traffic"
          description: "Network traffic exceeds 100MB/s"
EOF

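# Reload Prometheus configuration
# (the unit file defines no ExecReload, so use the HTTP endpoint enabled by --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload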

This sets up comprehensive alerting! 🚨
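If the `HighCPUUsage` expression looks cryptic, it helps to redo the arithmetic by hand. `irate` takes the last two samples of each idle counter and divides the increase by the elapsed time, `avg` averages those per-CPU idle fractions, and the rest converts "idle" into "busy". The numbers below are invented and `cpu_usage_percent` is an illustrative helper, so treat this as a sketch of the math.

```python
def cpu_usage_percent(samples: dict) -> float:
    """samples maps cpu -> ((t0, idle0), (t1, idle1)): the last two idle-counter samples."""
    # Per-CPU idle fraction: idle seconds gained per wall-clock second
    idle_rates = [(v1 - v0) / (t1 - t0) for (t0, v0), (t1, v1) in samples.values()]
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100


# Two CPUs scraped 15 seconds apart (made-up counter values)
samples = {
    "0": ((0, 1000.0), (15, 1012.0)),  # idle for 12 of 15 seconds
    "1": ((0, 2000.0), (15, 2003.0)),  # idle for 3 of 15 seconds
}
usage = cpu_usage_percent(samples)
print(f"CPU usage: {usage:.1f}%")
print("HighCPUUsage condition met:", usage > 80)
```

One thing to keep in mind: without a `by (instance)` clause, `avg` averages across every host Prometheus scrapes, which is fine for a single server but blurs the picture for a fleet.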

Example 3: Advanced Grafana Dashboard

# Create advanced monitoring script
cat > setup_advanced_dashboard.sh << 'EOF'
#!/bin/bash
# Setup advanced Grafana dashboard

# Set Grafana API credentials
GRAFANA_URL="http://localhost:3000"
GRAFANA_USER="admin"
GRAFANA_PASS="admin"

# Create data source for Prometheus
curl -X POST \
  -H "Content-Type: application/json" \
  -u "${GRAFANA_USER}:${GRAFANA_PASS}" \
  -d '{
    "name": "Prometheus",
    "type": "prometheus",
    "url": "http://localhost:9090",
    "access": "proxy",
    "isDefault": true
  }' \
  "${GRAFANA_URL}/api/datasources"

# Create folder for dashboards
curl -X POST \
  -H "Content-Type: application/json" \
  -u "${GRAFANA_USER}:${GRAFANA_PASS}" \
  -d '{
    "title": "System Monitoring"
  }' \
  "${GRAFANA_URL}/api/folders"

# Import comprehensive dashboard
cat > comprehensive_dashboard.json << 'JSON'
{
  "dashboard": {
    "title": "Comprehensive System Monitoring",
    "tags": ["system", "monitoring"],
    "timezone": "browser",
    "panels": [
      {
        "title": "System Overview",
        "type": "stat",
        "targets": [
          {"expr": "up", "legendFormat": "Services Up"}
        ]
      },
      {
        "title": "CPU by Core",
        "type": "graph",
        "targets": [
          {"expr": "irate(node_cpu_seconds_total[5m])", "legendFormat": "CPU {{cpu}} - {{mode}}"}
        ]
      },
      {
        "title": "Memory Breakdown",
        "type": "piechart",
        "targets": [
          {"expr": "node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes", "legendFormat": "Used"},
          {"expr": "node_memory_MemAvailable_bytes", "legendFormat": "Available"}
        ]
      },
      {
        "title": "Disk Usage by Mount",
        "type": "bargauge",
        "targets": [
          {"expr": "(node_filesystem_size_bytes - node_filesystem_avail_bytes) / node_filesystem_size_bytes * 100", "legendFormat": "{{mountpoint}}"}
        ]
      }
    ]
  },
  "overwrite": true
}
JSON

# Import the dashboard
curl -X POST \
  -H "Content-Type: application/json" \
  -u "${GRAFANA_USER}:${GRAFANA_PASS}" \
  -d @comprehensive_dashboard.json \
  "${GRAFANA_URL}/api/dashboards/db"

echo "✅ Advanced dashboard created successfully!"
EOF

chmod +x setup_advanced_dashboard.sh
./setup_advanced_dashboard.sh

This creates professional-grade dashboards! 📈

🚨 Fix Common Problems

Problem 1: Prometheus Can't Scrape Targets

Symptoms: Targets show as "DOWN" in Prometheus

# Check target status (the /targets page is the web UI; the API returns JSON)
curl -s http://localhost:9090/api/v1/targets | jq .

# Debug connectivity
curl http://localhost:9100/metrics  # Node exporter
curl http://localhost:3000/metrics  # Grafana

# Check firewall rules
sudo firewall-cmd --list-all

# Fix SELinux if needed
sudo setenforce 0  # Temporary
# Or create proper SELinux policy

# Check Prometheus logs
sudo journalctl -u prometheus -n 50 --no-pager

# Verify configuration
promtool check config /etc/prometheus/prometheus.yml
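The targets API returns a lot of JSON, so it can help to filter it down to just the unhealthy entries and their last error. A small sketch, assuming the response shape of `/api/v1/targets`; `down_targets` is an illustrative helper and the sample data (including the error text) is made up.

```python
def down_targets(payload: dict) -> list:
    """Return (job, scrape URL, last error) for every active target that is not healthy."""
    return [
        (t["labels"].get("job", ""), t["scrapeUrl"], t.get("lastError", ""))
        for t in payload["data"]["activeTargets"]
        if t["health"] != "up"
    ]


# Hand-written sample in the shape of the /api/v1/targets response
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "node-exporter", "instance": "almalinux-host"},
             "scrapeUrl": "http://localhost:9100/metrics",
             "health": "up", "lastError": ""},
            {"labels": {"job": "docker", "instance": "docker-host"},
             "scrapeUrl": "http://localhost:9323/metrics",
             "health": "down", "lastError": "connection refused"},
        ],
        "droppedTargets": [],
    },
}
for job, url, error in down_targets(sample):
    print(f"{job}: {url} -> {error}")
```

The `lastError` text usually points straight at the cause, such as a closed port, a firewall rule, or a wrong path.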

Problem 2: High Memory Usage by Prometheus

Symptoms: Prometheus consuming too much RAM

# Check current memory usage
ps aux | grep prometheus

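# Limit Go runtime resources with a systemd drop-in
# (appending to the unit file would put these lines under [Install], where they are ignored)
sudo mkdir -p /etc/systemd/system/prometheus.service.d
sudo tee /etc/systemd/system/prometheus.service.d/limits.conf << 'EOF'
[Service]
Environment="GOMAXPROCS=2"
Environment="GOMEMLIMIT=1GiB"
EOF

# Reduce retention time
sudo sed -i 's/retention.time=30d/retention.time=7d/' /etc/systemd/system/prometheus.service
sudo systemctl daemon-reload

# Clean up old data (WARNING: this deletes ALL stored metrics;
# the unit file sets the storage path to /var/lib/prometheus/)
sudo systemctl stop prometheus
sudo rm -rf /var/lib/prometheus/*
sudo systemctl start prometheus

# Compression: prometheus.yml has no compression setting. WAL compression is
# controlled by the --storage.tsdb.wal-compression flag (on by default since 2.20)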
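Before cutting retention, it is worth estimating what it actually buys you. The Prometheus storage documentation gives a rule of thumb: disk needed is roughly retention time in seconds, times ingested samples per second, times bytes per sample (about 1 to 2 bytes). The sketch below applies it; the figure of 3000 active series is a made-up assumption, so plug in your own numbers.

```python
def estimate_disk_bytes(retention_days: float, active_series: int,
                        scrape_interval_s: float, bytes_per_sample: float = 2.0) -> float:
    """Rule-of-thumb TSDB size: retention seconds * samples per second * bytes per sample."""
    samples_per_second = active_series / scrape_interval_s
    return retention_days * 86400 * samples_per_second * bytes_per_sample


# Hypothetical host: 3000 active series scraped every 15 seconds
for days in (30, 7):
    size = estimate_disk_bytes(days, active_series=3000, scrape_interval_s=15)
    print(f"{days}d retention: about {size / 1e9:.2f} GB")
```

Note that retention mostly affects disk usage. Memory tends to follow the number of active series, so dropping unneeded targets or high-cardinality labels is usually the more effective lever for RAM.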

Problem 3: Grafana Dashboards Not Loading

Symptoms: Dashboards show "No Data" or errors

# Check Grafana data source configuration
curl -u admin:admin http://localhost:3000/api/datasources

# Test Prometheus connectivity through Grafana's data source proxy
# (assumes the Prometheus data source has id 1)
curl -u admin:admin \
  "http://localhost:3000/api/datasources/proxy/1/api/v1/query?query=up"

# Check time synchronization
timedatectl status
sudo systemctl restart chronyd

# Restart services
sudo systemctl restart prometheus
sudo systemctl restart grafana-server

# Check logs
sudo journalctl -u grafana-server -n 50 --no-pager

Problem 4: Alerts Not Firing

Symptoms: Alert conditions met but no notifications

# Check alert rules syntax
promtool check rules /etc/prometheus/rules/alerts.yml

# Verify alerts are loaded
curl http://localhost:9090/api/v1/rules | jq .

# Check alert status
curl http://localhost:9090/api/v1/alerts | jq .

# Test Alertmanager (only if it is installed and listening on port 9093)
amtool alert add alertname=test severity=critical --alertmanager.url=http://localhost:9093

# Force reload rules
sudo kill -HUP $(pidof prometheus)

# Check Prometheus logs for rule evaluation
sudo journalctl -u prometheus | grep -i "rule\|alert"
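A common reason for "condition met but nothing fired" is the `for:` clause: an alert sits in the pending state until its expression has been true on every evaluation for the whole duration, and a single false evaluation resets the clock. The simulation below sketches that behaviour as a simplified model of what Prometheus does; `alert_states` is an illustrative helper.

```python
def alert_states(condition_by_eval, eval_interval_s: int, for_s: int) -> list:
    """Return the alert state after each rule evaluation."""
    states = []
    active_since = None  # time at which the condition first became true
    for i, condition in enumerate(condition_by_eval):
        now = i * eval_interval_s
        if not condition:
            active_since = None  # any false evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        states.append("firing" if now - active_since >= for_s else "pending")
    return states


# ServiceDown uses `for: 1m`; the system_alerts group evaluates every 30s
states = alert_states([True, True, False, True, True, True],
                      eval_interval_s=30, for_s=60)
print(states)
```

Also remember that Prometheus only evaluates alerts. Notifications are sent by Alertmanager, which the configuration above points to at `localhost:9093` but which this guide does not install.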

📋 Simple Commands Summary

| Command | Purpose |
| --- | --- |
| sudo systemctl status prometheus | Check Prometheus status |
| curl http://localhost:9090/metrics | View Prometheus metrics |
| curl http://localhost:9100/metrics | View Node Exporter metrics |
| promtool check config /etc/prometheus/prometheus.yml | Validate Prometheus config |
| sudo systemctl restart grafana-server | Restart Grafana |
| curl -u admin:admin http://localhost:3000/api/datasources | Check Grafana data sources |
| sudo journalctl -u prometheus -f | View Prometheus logs |
| curl http://localhost:9090/api/v1/targets | Check scrape targets |
| promtool query instant http://localhost:9090 'up' | Query Prometheus |
| sudo firewall-cmd --add-port=9090/tcp --permanent | Open Prometheus port |

💡 Tips for Success

🎯 Start Small: Begin with basic metrics before adding complexity

🔍 Label Everything: Use descriptive labels for better organization

📊 Dashboard Design: Keep dashboards focused and avoid clutter

🛡️ Security First: Use authentication and HTTPS in production

🚀 Performance Tune: Adjust scrape intervals based on your needs

📝 Document Queries: Save important PromQL queries for reuse

🔄 Regular Maintenance: Clean up old data and optimize storage

⚡ Alert Wisely: Avoid alert fatigue with well-tuned thresholds

🏆 What You Learned

Congratulations! You've successfully mastered monitoring with Prometheus and Grafana on AlmaLinux! 🎉

✅ Installed Prometheus for metrics collection
✅ Configured Node Exporter for system metrics
✅ Set up Grafana for beautiful visualizations
✅ Created custom dashboards for your specific needs
✅ Implemented alerting rules for proactive monitoring
✅ Added application metrics with custom exporters
✅ Troubleshot common issues and optimized performance
✅ Built comprehensive monitoring infrastructure

🎯 Why This Matters

Monitoring is the foundation of reliable infrastructure! 🌟 With your AlmaLinux monitoring stack, you now have:

  • Complete visibility into system and application performance
  • Early warning system for potential problems
  • Data-driven insights for optimization decisions
  • Professional dashboards that impress stakeholders
  • Foundation for SRE practices and reliability engineering

You're now equipped to run production systems with confidence, knowing you'll spot issues before they impact users! Your monitoring skills put you in the league of professional DevOps engineers and SREs! 🚀

Keep monitoring, keep improving, and remember: what gets measured gets managed! You've got this! ⭐🙌