Prometheus Metrics Monitoring on AlmaLinux: Real-Time System Insights
Welcome to the world of modern monitoring! Ready to see everything happening on your servers like never before? Prometheus is the superhero of monitoring systems: it watches over your infrastructure 24/7 and can alert you when trouble is brewing, often before users notice. Think of it as having X-ray vision for your servers!
Why is Prometheus Important?
Prometheus transforms blind server management into informed decision-making! Here's why it's amazing:
- Real-Time Metrics - See what's happening RIGHT NOW on your servers
- Powerful Queries - Ask complex questions about your data with PromQL
- Smart Alerting - Get notified when thresholds are crossed, ideally BEFORE disasters strike
- Beautiful Dashboards - Visualize data with Grafana integration
- Service Discovery - Automatically finds what to monitor
- Lightning Fast - A single server can handle millions of time series
It's like having a team of experts watching your servers 24/7, but better and cheaper!
What You Need
Before diving into monitoring paradise, ensure you have:
- AlmaLinux server (8 or 9)
- Root or sudo privileges
- At least 2GB RAM (more is better!)
- 10GB free disk space
- Basic terminal comfort
- Excitement to learn!
Step 1: Installing Prometheus - Your Monitoring Engine!
Let's get Prometheus installed and ready to rock!
First, create a dedicated user for Prometheus (security first!):
# Create prometheus user (no login shell for security)
sudo useradd --no-create-home --shell /bin/false prometheus
# Create directories for Prometheus
sudo mkdir /etc/prometheus
sudo mkdir /var/lib/prometheus
# Set ownership
sudo chown prometheus:prometheus /etc/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus
Now download and install Prometheus:
# Download latest Prometheus (check for newer versions!)
cd /tmp
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz
# Extract the archive
tar -xvf prometheus-2.45.0.linux-amd64.tar.gz
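# Move binaries to proper location
sudo cp prometheus-2.45.0.linux-amd64/prometheus /usr/local/bin/
sudo cp prometheus-2.45.0.linux-amd64/promtool /usr/local/bin/
# Copy the console templates and libraries that the service file in Step 3 points to
# (these directories ship in the 2.x archives)
sudo cp -r prometheus-2.45.0.linux-amd64/consoles /etc/prometheus/
sudo cp -r prometheus-2.45.0.linux-amd64/console_libraries /etc/prometheus/
# Set ownership
sudo chown prometheus:prometheus /usr/local/bin/prometheus
sudo chown prometheus:prometheus /usr/local/bin/promtool
sudo chown -R prometheus:prometheus /etc/prometheus/consoles /etc/prometheus/console_libraries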
# Verify installation
prometheus --version
You should see:
prometheus, version 2.45.0 (branch: HEAD, revision: ...)
Excellent! Prometheus is installed!
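Before trusting a downloaded archive, it can be worth comparing its SHA-256 digest with the value published next to the release (the Prometheus release pages list a sha256sums.txt file). The sketch below shows one way to do that in Python. The function names are hypothetical helpers, not part of any Prometheus tooling, and the expected digest has to be copied from the release page yourself.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_checksum('/tmp/prometheus-2.45.0.linux-amd64.tar.gz', expected)` returns True only when the digests agree, where `expected` is the hex string you copied.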
Step 2: Configuring Prometheus - Setting Up Your Monitoring Hub!
Time to tell Prometheus what to monitor!
Create the main configuration file:
# Create Prometheus configuration
sudo nano /etc/prometheus/prometheus.yml
Add this configuration (I'll explain each part!):
# Global settings that apply to all jobs
global:
  scrape_interval: 15s # How often to collect metrics
  evaluation_interval: 15s # How often to evaluate rules
  # Labels attached to series and alerts when talking to external systems
  # (federation, remote storage, Alertmanager); useful for multi-cluster setups
  external_labels:
    monitor: 'almalinux-monitor'
# Alertmanager configuration (empty for now; add Alertmanager targets here to send notifications)
alerting:
  alertmanagers:
    - static_configs:
        - targets: []
# Load rules once and periodically evaluate them
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# What to monitor - this is where the magic happens!
scrape_configs:
  # Monitor Prometheus itself (meta!)
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  # Monitor this Linux server
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
Set proper permissions:
# Secure the configuration
sudo chown prometheus:prometheus /etc/prometheus/prometheus.yml
sudo chmod 644 /etc/prometheus/prometheus.yml
Step 3: Creating Systemd Service - Making Prometheus Persistent!
Let's make Prometheus start automatically!
Create a systemd service file:
# Create service file
sudo nano /etc/systemd/system/prometheus.service
Add this configuration:
[Unit]
Description=Prometheus Monitoring System
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus/ \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
Start Prometheus:
# Reload systemd
sudo systemctl daemon-reload
# Start Prometheus
sudo systemctl start prometheus
# Enable on boot
sudo systemctl enable prometheus
# Check status
sudo systemctl status prometheus
Open firewall port:
# Allow Prometheus web interface
sudo firewall-cmd --permanent --add-port=9090/tcp
sudo firewall-cmd --reload
Visit http://your-server-ip:9090 - Prometheus is alive!
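Once the web UI responds, the same server also answers instant queries over its HTTP API at /api/v1/query. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the base URL is an assumption you would replace with your own server address.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def instant_query(base_url, promql, timeout=5):
    """Run an instant query and return the list found under data.result."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        payload = json.load(resp)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]
```

Calling `instant_query("http://localhost:9090", "up")` on the server itself should return one entry per scrape target, with a value of 1 for targets that are reachable.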
Step 4: Installing Node Exporter - Server Metrics Collector!
Node Exporter collects all the juicy server metrics!
# Download Node Exporter
cd /tmp
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
# Extract
tar -xvf node_exporter-1.6.1.linux-amd64.tar.gz
# Move binary
sudo cp node_exporter-1.6.1.linux-amd64/node_exporter /usr/local/bin/
# Create user
sudo useradd --no-create-home --shell /bin/false node_exporter
# Set ownership
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
Create systemd service:
# Create service file
sudo nano /etc/systemd/system/node_exporter.service
Add:
[Unit]
Description=Node Exporter
After=network.target
[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter
[Install]
WantedBy=multi-user.target
Start Node Exporter:
# Start and enable
sudo systemctl daemon-reload
sudo systemctl start node_exporter
sudo systemctl enable node_exporter
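# Open firewall port (only needed if a Prometheus server on another host scrapes this machine;
# the localhost:9100 target from Step 2 works without it)
sudo firewall-cmd --permanent --add-port=9100/tcp
sudo firewall-cmd --reload
Perfect! Now Prometheus can see your server metrics!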
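Node Exporter serves its metrics as plain text at /metrics on port 9100, one sample per line with `# HELP` and `# TYPE` comments in between. To make that format concrete, here is a small sketch of a parser. The `parse_metrics` helper is hypothetical, the sample text is made up for illustration, and the code assumes lines carry no trailing timestamp (which is what Node Exporter normally emits).

```python
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
"""


def parse_metrics(text):
    """Map each series (name plus labels) to its float value.

    Assumes the value is the last whitespace-separated field on the line,
    i.e. there is no timestamp after it.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


parsed = parse_metrics(SAMPLE)
```

Feeding it the output of `curl http://localhost:9100/metrics` gives a dictionary you can inspect by hand when checking what a target actually exposes.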
Step 5: Writing PromQL Queries - Speaking Prometheus Language!
PromQL is Prometheus's query language - let's learn some magic spells!
Go to http://your-server-ip:9090 and try these queries:
# CPU usage percentage
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
# Memory usage percentage
(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100
# Disk usage percentage
100 - ((node_filesystem_avail_bytes{mountpoint="/"} * 100) / node_filesystem_size_bytes{mountpoint="/"})
# Network received bytes per second
rate(node_network_receive_bytes_total[5m])
# System load average
node_load1
Each query tells a story about your server!
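The CPU query is the least obvious of these, so here is a rough sketch of the arithmetic behind it. `node_cpu_seconds_total{mode="idle"}` is a counter of idle seconds per core; `rate(...[5m])` turns it into idle seconds per second, and the query averages that across cores and subtracts from 100. The functions below are simplified stand-ins written for illustration: the real `rate()` also handles counter resets and extrapolates to the window edges, which this sketch ignores.

```python
def per_second_rate(first, last):
    """Per-second increase between two (timestamp_seconds, counter_value) samples."""
    (t0, v0), (t1, v1) = first, last
    return (v1 - v0) / (t1 - t0)


def cpu_usage_percent(idle_samples_by_cpu):
    """Mirror 100 - avg(rate(idle)) * 100 for a dict of cpu -> (first, last) samples."""
    rates = [per_second_rate(first, last) for first, last in idle_samples_by_cpu.values()]
    return 100 - (sum(rates) / len(rates)) * 100


# Two cores observed over a 300-second (5m) window, with made-up counter values:
# core 0 was idle for 270 s, core 1 for 210 s.
window = {
    "0": ((0, 1000.0), (300, 1270.0)),
    "1": ((0, 1000.0), (300, 1210.0)),
}
usage = cpu_usage_percent(window)
```

With these numbers the cores were idle 90% and 70% of the time, so the average idle fraction is 0.8 and the computed usage is about 20%.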
Step 6: Setting Up Alerts - Your Early Warning System!
Let's create alerts that notify you before disasters!
Create alert rules:
# Create alerts file
sudo nano /etc/prometheus/alerts.yml
Add these smart alerts:
groups:
  - name: server_alerts
    interval: 30s
    rules:
      # High CPU usage alert
      - alert: HighCPUUsage
        expr: 100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is above 80% (current value: {{ $value }}%)"
      # High memory usage alert
      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage detected"
          description: "Memory usage is above 90% (current value: {{ $value }}%)"
      # Disk space alert
      - alert: LowDiskSpace
        expr: 100 - ((node_filesystem_avail_bytes{mountpoint="/"} * 100) / node_filesystem_size_bytes{mountpoint="/"}) > 85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space"
          description: "Disk usage is above 85% (current value: {{ $value }}%)"
Update Prometheus config to include alerts:
# Edit prometheus.yml
sudo nano /etc/prometheus/prometheus.yml
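# Replace the commented-out rule_files section with:
rule_files:
  - "alerts.yml"
Validate the rules, then restart Prometheus:
promtool check rules /etc/prometheus/alerts.yml
sudo systemctl restart prometheus
Your alerts are now active! They show up on the Alerts page of the web UI; sending notifications additionally requires an Alertmanager target in the alerting section of prometheus.yml.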
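The `for:` clause in each rule is what keeps short spikes from paging you: an alert first becomes pending when its expression is true, and only fires once it has stayed true for the stated duration. The sketch below models that state machine in Python. It is a simplified illustration with a hypothetical `alert_states` helper and made-up sample values, not Prometheus's actual implementation.

```python
def alert_states(samples, threshold, for_seconds):
    """Return 'inactive', 'pending' or 'firing' for each evaluation.

    samples is a list of (timestamp_seconds, value) pairs in evaluation order.
    """
    states = []
    active_since = None  # time the condition first became true, if it still is
    for timestamp, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = timestamp
            held_for = timestamp - active_since
            states.append("firing" if held_for >= for_seconds else "pending")
        else:
            active_since = None  # condition cleared, so the timer resets
            states.append("inactive")
    return states


# Evaluations every 30 s against an 80% threshold with a 60 s hold time
history = [(0, 85), (30, 90), (60, 70), (90, 95), (120, 95), (150, 95)]
states = alert_states(history, threshold=80, for_seconds=60)
```

The dip at 60 s resets the timer, so the alert only reaches firing at 150 s, after the condition has held continuously since 90 s.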
Quick Examples
Example 1: Monitoring Multiple Servers
Want to monitor more servers? Easy peasy!
# In prometheus.yml, add more targets to the existing 'node' job
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets:
          - 'localhost:9100'
          - 'server2.example.com:9100'
          - 'server3.example.com:9100'
        labels:
          environment: 'production'
Example 2: Custom Application Metrics
Monitor your own applications:
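# Python example with prometheus_client (pip install prometheus-client)
from prometheus_client import start_http_server, Counter
import time
# Create a metric
REQUEST_COUNT = Counter('app_requests_total', 'Total requests')
# Start the metrics server first so /metrics is served on port 8000
start_http_server(8000)
# In your application: increment the counter for each request
while True:
    REQUEST_COUNT.inc()  # Increment counter
    time.sleep(1)  # Placeholder for real request handling; also keeps the process alive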
Add to Prometheus:
  - job_name: 'my_app'
    static_configs:
      - targets: ['localhost:8000']
Example 3: Service Discovery
Automatically discover what to monitor:
# File-based service discovery
scrape_configs:
  - job_name: 'dynamic'
    file_sd_configs:
      - files:
          - '/etc/prometheus/targets/*.yml'
        refresh_interval: 30s
Create targets file:
# /etc/prometheus/targets/servers.yml
- targets:
    - 'web1.example.com:9100'
    - 'web2.example.com:9100'
  labels:
    service: 'web'
    env: 'prod'
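File-based discovery pays off when the target files are generated by a script or inventory tool rather than edited by hand. Prometheus accepts these files in JSON as well as YAML, so a generator can stick to the Python standard library. The sketch below assumes hypothetical helper names and the same example hosts as above. Note that a file written as `servers.json` would only be picked up if the `files` glob also matched `*.json`.

```python
import json


def file_sd_entries(hosts, port, labels):
    """Build one file_sd target group for the given hosts and labels."""
    return [{
        "targets": [f"{host}:{port}" for host in hosts],
        "labels": dict(labels),
    }]


def render_file_sd(entries):
    """Serialize target groups as JSON text suitable for a file_sd file."""
    return json.dumps(entries, indent=2) + "\n"


entries = file_sd_entries(
    ["web1.example.com", "web2.example.com"],
    9100,
    {"service": "web", "env": "prod"},
)
document = render_file_sd(entries)
```

Writing `document` to a temporary file and renaming it into the targets directory avoids Prometheus reading a half-written file during a refresh.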
Fix Common Problems
Problem 1: Prometheus Won't Start
Symptom: Service fails to start
Fix:
# Check logs
sudo journalctl -u prometheus -n 50
# Validate configuration
promtool check config /etc/prometheus/prometheus.yml
# Check permissions
ls -la /etc/prometheus/
ls -la /var/lib/prometheus/
# Fix permissions if needed
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown -R prometheus:prometheus /var/lib/prometheus
Problem 2: Targets Show as "Down"
Symptom: Can't scrape metrics from targets
Fix:
# Check if exporters are running
sudo systemctl status node_exporter
# Test connectivity
curl http://localhost:9100/metrics
# Check firewall
sudo firewall-cmd --list-all
# Check Prometheus targets page
# Go to http://your-server:9090/targets
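Problem 3: High Memory Usage
Symptom: Prometheus eating all RAM!
Fix:
# Retention flags mainly cap disk usage; memory tends to follow the number of active series,
# so also consider dropping unneeded targets or lengthening scrape intervals
# Reduce retention time in service file
sudo nano /etc/systemd/system/prometheus.service
# Add these flags to ExecStart (end the preceding line with a backslash).
# 15d is already the default retention time; lower it to shrink storage.
  --storage.tsdb.retention.time=15d \
  --storage.tsdb.retention.size=10GB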
# Restart
sudo systemctl daemon-reload
sudo systemctl restart prometheus
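To pick sensible retention values, it helps to estimate how much disk a given setup needs. The Prometheus storage documentation gives a rule of thumb: needed disk space is roughly retention time in seconds, times ingested samples per second, times bytes per sample, with samples averaging about 1 to 2 bytes each. The sketch below applies that formula. The function names and the example series count are assumptions for illustration.

```python
SECONDS_PER_DAY = 86400


def ingested_samples_per_second(active_series, scrape_interval_seconds):
    """Each active series contributes one sample per scrape interval."""
    return active_series / scrape_interval_seconds


def estimated_disk_bytes(retention_days, samples_per_second, bytes_per_sample=2.0):
    """Rule-of-thumb disk estimate: retention * ingest rate * bytes per sample."""
    return retention_days * SECONDS_PER_DAY * samples_per_second * bytes_per_sample


# Assumed example: 1,500 active series scraped every 15 s, kept for 15 days
rate = ingested_samples_per_second(1500, 15)
needed = estimated_disk_bytes(15, rate)
```

In this example the ingest rate is 100 samples per second, which works out to roughly 259 MB for 15 days at 2 bytes per sample, so a single small server sits far below the 10GB size cap.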
Simple Commands Summary
| Command | What It Does | When to Use |
|---|---|---|
| promtool check config /etc/prometheus/prometheus.yml | Validate config | Before restart |
| curl localhost:9090/-/healthy | Check health | Troubleshooting |
| curl localhost:9090/metrics | View Prometheus metrics | Self-monitoring |
| promtool query instant http://localhost:9090 'up' | Run queries | Testing PromQL |
| systemctl status prometheus | Check service | Verify running |
| journalctl -u prometheus -f | Follow logs | Debug issues |
| curl localhost:9100/metrics | Node exporter metrics | Check collection |
| promtool tsdb analyze /var/lib/prometheus | Analyze storage | Optimize disk |
| prometheus --version | Check version | Updates |
| systemctl restart prometheus | Restart service | Apply changes |
Tips for Success
Performance Optimization
Make Prometheus blazing fast:
# Optional kernel tuning (general Linux VM settings, not Prometheus-specific; test before relying on them)
echo "vm.swappiness=10" | sudo tee -a /etc/sysctl.conf
echo "vm.vfs_cache_pressure=50" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# Use SSD for storage if possible
# Mount SSD to /var/lib/prometheus
# Adjust scrape intervals based on needs
# Longer intervals = less resource usage
Essential Metrics to Monitor
Never miss these critical metrics:
- CPU: node_cpu_seconds_total - Watch for sustained high usage
- Memory: node_memory_MemAvailable_bytes - Prevent OOM kills
- Disk: node_filesystem_free_bytes - Avoid full disks
- Network: node_network_receive_bytes_total - Spot traffic spikes
- Load: node_load1 - Overall system health
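Grafana Integration
Want beautiful dashboards? Install Grafana:
# Install Grafana (should be available from the AppStream repo on AlmaLinux 8/9;
# add Grafana's own repo first if you need a newer release)
sudo dnf install -y grafana
# Start Grafana
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
# Open the Grafana port if you will browse from another machine
sudo firewall-cmd --permanent --add-port=3000/tcp
sudo firewall-cmd --reload
# Access at http://your-server:3000
# Default login: admin/admin (change the password when prompted on first login)
# Add Prometheus as data source (URL: http://localhost:9090)
# Use dashboard ID 1860 for Node Exporter Full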
What You Learned
You're now a Prometheus master! You've successfully:
- Installed Prometheus monitoring system
- Configured metrics collection
- Set up Node Exporter for system metrics
- Written PromQL queries
- Created smart alerts
- Learned troubleshooting techniques
- Mastered service discovery
You have eyes on your entire infrastructure!
Why This Matters
Prometheus gives you superpowers! With your monitoring system, you can:
- Predict problems - See issues before users complain!
- Optimize performance - Find and fix bottlenecks!
- Save money - Right-size your infrastructure!
- Sleep better - Alerts watch while you rest!
- Make data-driven decisions - Numbers don't lie!
You're not just monitoring servers - you're ensuring reliability, performance, and user satisfaction! Your infrastructure is now observable, measurable, and manageable!
Keep monitoring, keep improving, and remember - with Prometheus, you see everything!
May your metrics be insightful and your alerts be few!