Remote monitoring is like having eyes everywhere on your systems! It lets you track performance, spot problems, and get alerts from anywhere in the world. Let's set up a complete remote monitoring solution on Alpine Linux that will keep you informed about your infrastructure's health!
What is Remote Monitoring?
Remote monitoring includes:
- Metrics collection - Gathering system data
- Data storage - Keeping historical information
- Visualization - Creating beautiful dashboards
- Alerting - Getting notified of problems
- Remote access - Monitoring from anywhere
Think of it as a health tracking system for your servers!
Installing Prometheus
Let's start with Prometheus for metrics collection:
# Update package list
sudo apk update
# Install Prometheus
sudo apk add prometheus prometheus-node-exporter
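# Create the Prometheus group, user, and directories
# (the package may already have created the account; "in use" errors are harmless)
sudo addgroup -S prometheus
sudo adduser -S -G prometheus prometheus
sudo mkdir -p /etc/prometheus /var/lib/prometheus
sudo chown prometheus:prometheus /var/lib/prometheus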
Configuring Prometheus
Set up Prometheus configuration:
# Create Prometheus config
# (sudo does not apply to a shell redirect, so write the file through tee)
sudo tee /etc/prometheus/prometheus.yml > /dev/null << 'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets: []
rule_files:
  - "alerts.yml"
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'alpine-servers'
    static_configs:
      - targets:
          - 'server1.example.com:9100'
          - 'server2.example.com:9100'
          - 'server3.example.com:9100'
EOF
# Set permissions
sudo chown -R prometheus:prometheus /etc/prometheus
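Before starting the service, it can help to validate the file you just wrote. The sketch below wraps `promtool check config`, which ships with Prometheus. The function name `validate_prometheus_config` is made up for this example, and the default path simply assumes the config location used above. Because the config references `alerts.yml`, the check is most useful once that file exists.

```shell
#!/bin/sh
# Sketch: validate a Prometheus config file before (re)starting the service.
# validate_prometheus_config is a hypothetical helper name, not part of Prometheus.
validate_prometheus_config() {
    config_file="${1:-/etc/prometheus/prometheus.yml}"
    if ! command -v promtool >/dev/null 2>&1; then
        echo "promtool not found; is the prometheus package installed?" >&2
        return 127
    fi
    # promtool reports the problems it finds and exits non-zero on failure
    promtool check config "$config_file"
}

# Usage:
#   validate_prometheus_config && sudo rc-service prometheus restart
```

Chaining the restart with `&&` means a broken config never reaches the running service.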
Setting Up Node Exporter
Configure system metrics collection:
# Configure node exporter
sudo rc-service prometheus-node-exporter start
sudo rc-update add prometheus-node-exporter
# Create an OpenRC service script for Prometheus
# (if the package already installed /etc/init.d/prometheus, this overwrites it)
sudo tee /etc/init.d/prometheus > /dev/null << 'EOF'
#!/sbin/openrc-run
name="prometheus"
description="Prometheus Server"
command="/usr/bin/prometheus"
command_args="--config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/ \
    --web.console.templates=/usr/share/prometheus/consoles \
    --web.console.libraries=/usr/share/prometheus/console_libraries"
command_user="prometheus:prometheus"
pidfile="/run/${RC_SVCNAME}.pid"
command_background=true
depend() {
    need net
}
EOF
sudo chmod +x /etc/init.d/prometheus
sudo rc-service prometheus start
sudo rc-update add prometheus
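Once both services are up, a quick way to confirm the exporter is really serving data is to pull one metric out of its text output. This is a minimal sketch: `metric_value` is a hypothetical helper, and it assumes the exporter emits samples without trailing timestamps (so the value is the last field on the line).

```shell
#!/bin/sh
# metric_value is a hypothetical helper: it reads Prometheus text-format
# metrics on stdin and prints the value of the first sample whose metric
# name matches the argument exactly.
metric_value() {
    awk -v name="$1" '
        /^#/ { next }                    # skip HELP and TYPE comment lines
        {
            metric = $1
            sub(/[{].*/, "", metric)     # drop the label set, if any
            if (metric == name) { print $NF; exit }
        }
    '
}

# Usage (node exporter on its default port, as configured above):
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```

An empty result means the metric was not found, which usually points at the exporter not running or a typo in the metric name.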
Installing Grafana
Set up Grafana for visualization:
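# Add the edge community repository under the @community tag
# (appending to /etc/apk/repositories needs root, so use tee -a)
echo "@community https://dl-cdn.alpinelinux.org/alpine/edge/community" | sudo tee -a /etc/apk/repositories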
# Install Grafana
sudo apk update
sudo apk add grafana@community
# Start Grafana
sudo rc-service grafana start
sudo rc-update add grafana
# Access Grafana at http://localhost:3000
# Default login: admin/admin
Creating Monitoring Dashboards
Build custom dashboards:
# Create dashboard config via API
cat > ~/system_dashboard.json << 'EOF'
{
  "dashboard": {
    "title": "Alpine Linux System Metrics",
    "panels": [
      {
        "title": "CPU Usage",
        "targets": [
          {
            "expr": "100 - (avg(rate(node_cpu_seconds_total{mode=\"idle\"}[5m])) * 100)"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 0}
      },
      {
        "title": "Memory Usage",
        "targets": [
          {
            "expr": "(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 0}
      },
      {
        "title": "Disk Usage",
        "targets": [
          {
            "expr": "100 - (node_filesystem_avail_bytes{mountpoint=\"/\"} / node_filesystem_size_bytes{mountpoint=\"/\"} * 100)"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": 8}
      },
      {
        "title": "Network Traffic",
        "targets": [
          {
            "expr": "rate(node_network_receive_bytes_total[5m])"
          }
        ],
        "gridPos": {"h": 8, "w": 12, "x": 12, "y": 8}
      }
    ]
  }
}
EOF
# Import dashboard via API
# (the shell does not expand "~" after "@", so use $HOME)
curl -X POST http://admin:admin@localhost:3000/api/dashboards/db \
  -H "Content-Type: application/json" \
  -d "@$HOME/system_dashboard.json"
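The CPU Usage expression can look cryptic at first. It takes the per-second growth of the idle-time counter and subtracts that share from 100. The sketch below redoes the same arithmetic by hand for a single CPU from two counter readings; `cpu_busy_percent` and the sample numbers are purely illustrative.

```shell
#!/bin/sh
# cpu_busy_percent is a hypothetical helper that mirrors the panel expression
#   100 - (rate(idle seconds) * 100)
# for one CPU, given two readings of the idle counter.
# Arguments: idle_start idle_end elapsed_seconds
cpu_busy_percent() {
    awk -v i0="$1" -v i1="$2" -v dt="$3" 'BEGIN {
        idle_rate = (i1 - i0) / dt      # idle seconds accumulated per second
        printf "%.1f\n", 100 - idle_rate * 100
    }'
}

# Example: the idle counter grew by 240 s over a 300 s window,
# so the CPU was idle 80% of the time.
cpu_busy_percent 1000 1240 300   # prints 20.0
```

Prometheus does the same thing per CPU over the `[5m]` window, and `avg(...)` then averages the idle share across CPUs before the subtraction.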
Setting Up Remote Access
Configure secure remote access:
# Install nginx for reverse proxy
sudo apk add nginx
# Configure nginx for Grafana
# (recent Alpine nginx packages may include /etc/nginx/http.d/*.conf instead;
# check the include line in /etc/nginx/nginx.conf and adjust the path if needed)
sudo tee /etc/nginx/conf.d/grafana.conf > /dev/null << 'EOF'
server {
    listen 80;
    server_name monitoring.example.com;
    location / {
        proxy_pass http://localhost:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
server {
    listen 80;
    server_name prometheus.example.com;
    location / {
        proxy_pass http://localhost:9090;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
EOF
sudo nginx -t
sudo rc-service nginx start
sudo rc-update add nginx
Creating Alert Rules
Set up monitoring alerts:
# Create alert rules
# (written through tee because sudo does not apply to a shell redirect)
sudo tee /etc/prometheus/alerts.yml > /dev/null << 'EOF'
groups:
  - name: system_alerts
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"
          description: "CPU usage is above 80% (current value: {{ $value }}%)"
      - alert: HighMemoryUsage
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 90
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage detected"
          description: "Memory usage is above 90% (current value: {{ $value }}%)"
      - alert: DiskSpaceLow
        expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space"
          description: "Less than 10% disk space remaining"
      - alert: SystemDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "System is down"
          description: "{{ $labels.instance }} is not responding"
EOF
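To get a feel for the thresholds in these rules, here is a small sketch that applies the same comparisons to numbers you pass in by hand. `firing_alerts` is a hypothetical helper, not a Prometheus tool, and it deliberately ignores the `for:` duration, so it only shows which conditions are true at a single instant.

```shell
#!/bin/sh
# firing_alerts is a hypothetical helper. Arguments are percentages:
#   cpu_used  mem_used  disk_free
# It prints the name of each rule above whose threshold is crossed.
firing_alerts() {
    awk -v cpu="$1" -v mem="$2" -v disk="$3" 'BEGIN {
        if (cpu > 80)  print "HighCPUUsage"      # same bound as the CPU rule
        if (mem > 90)  print "HighMemoryUsage"   # same bound as the memory rule
        if (disk < 10) print "DiskSpaceLow"      # rule compares free space, not used
    }'
}

# Example: 85% CPU, 50% memory, 5% free disk
#   firing_alerts 85 50 5
```

In Prometheus itself a condition has to stay true for the whole `for:` period before the alert moves from pending to firing, which filters out short spikes.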
Installing Alertmanager
Set up alert notifications:
# Install Alertmanager
sudo apk add alertmanager
# Configure Alertmanager
# (written through tee because sudo does not apply to a shell redirect)
sudo tee /etc/alertmanager/alertmanager.yml > /dev/null << 'EOF'
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'default-receiver'
receivers:
  - name: 'default-receiver'
    email_configs:
      - to: '[email protected]'
        from: '[email protected]'
        smarthost: 'smtp.example.com:587'
        auth_username: '[email protected]'
        auth_password: 'password'
    webhook_configs:
      - url: 'http://localhost:5001/webhook'
        send_resolved: true
EOF
# Start Alertmanager
sudo rc-service alertmanager start
sudo rc-update add alertmanager
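It is worth checking that notifications actually arrive before you rely on them. One way is to hand Alertmanager a test alert directly through its v2 API. The sketch below only builds the JSON body; `test_alert_payload` and the label values are made up for illustration, and the usage line assumes Alertmanager is listening on its default port 9093.

```shell
#!/bin/sh
# test_alert_payload is a hypothetical helper that prints a one-alert JSON
# array in the shape accepted by Alertmanager's POST /api/v2/alerts endpoint.
# Arguments (both optional): alertname severity
test_alert_payload() {
    alertname="${1:-TestAlert}"
    severity="${2:-warning}"
    printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"Manual test alert"}}]\n' \
        "$alertname" "$severity"
}

# Usage (assumes Alertmanager on its default port 9093):
#   test_alert_payload DiskTest critical |
#       curl -s -H 'Content-Type: application/json' \
#           --data-binary @- http://localhost:9093/api/v2/alerts
```

If the receiver is set up correctly, the test alert should show up in your inbox or webhook after the configured `group_wait`.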
Custom Monitoring Scripts
Create specialized monitoring tools:
# System health check script
cat > ~/monitor_health.sh << 'EOF'
#!/bin/sh
# System Health Monitor
# Prometheus Pushgateway endpoint (assumes a Pushgateway is already running
# on port 9091; the earlier steps do not install one)
PUSHGATEWAY="http://localhost:9091"
JOB="system_health"
INSTANCE="$(hostname)"
# Collect metrics
# BusyBox top prints a "CPU:" summary line rather than "Cpu(s)";
# take the percentage in front of "idle" and subtract it from 100
CPU_USAGE=$(top -bn1 | awk '/^CPU:/ {for (i = 2; i <= NF; i++) if ($i == "idle") {v = $(i-1); sub("%", "", v); print 100 - v; exit}}')
MEMORY_USAGE=$(free | grep Mem | awk '{print ($2-$7)/$2 * 100}')
DISK_USAGE=$(df -h / | tail -1 | awk '{print $5}' | sed 's/%//')
LOAD_AVG=$(uptime | awk -F'load average:' '{print $2}' | awk '{print $1}' | sed 's/,//')
# Push to Prometheus
cat << METRICS | curl --data-binary @- "${PUSHGATEWAY}/metrics/job/${JOB}/instance/${INSTANCE}"
# TYPE cpu_usage gauge
cpu_usage ${CPU_USAGE}
# TYPE memory_usage gauge
memory_usage ${MEMORY_USAGE}
# TYPE disk_usage gauge
disk_usage ${DISK_USAGE}
# TYPE load_average gauge
load_average ${LOAD_AVG}
METRICS
echo "Metrics pushed successfully!"
EOF
chmod +x ~/monitor_health.sh
# Add to crontab without overwriting existing entries
(crontab -l 2>/dev/null; echo "*/5 * * * * $HOME/monitor_health.sh") | crontab -
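A script like this pushes whatever the parsing pipelines produce, so an empty or odd value turns into a malformed metric line. As a hedged sketch of one way to guard against that, the hypothetical `format_gauge` helper below only prints a gauge when the value is a plain non-negative decimal number.

```shell
#!/bin/sh
# format_gauge is a hypothetical helper: it prints one gauge in the Prometheus
# text format, or returns 1 if the value is not a plain decimal number.
# Arguments: metric_name value
format_gauge() {
    name="$1"
    value="$2"
    case "$value" in
        ''|*[!0-9.]*|*.*.*|.)
            # empty, contains a non-digit, has two dots, or is just "."
            echo "skipping $name: '$value' is not numeric" >&2
            return 1
            ;;
    esac
    printf '# TYPE %s gauge\n%s %s\n' "$name" "$name" "$value"
}

# Usage: build the payload from validated lines only, then push it, e.g.
#   { format_gauge cpu_usage "$CPU_USAGE"; format_gauge disk_usage "$DISK_USAGE"; } |
#       curl --data-binary @- "$PUSHGATEWAY/metrics/job/$JOB/instance/$INSTANCE"
```

Metrics that fail validation are simply left out of the push, and the reason goes to stderr where cron can mail or log it.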
Remote Log Aggregation
Collect logs from multiple systems:
# Install Loki for log aggregation
wget https://github.com/grafana/loki/releases/download/v2.9.0/loki-linux-amd64.zip
unzip loki-linux-amd64.zip
sudo mv loki-linux-amd64 /usr/local/bin/loki
# Create the config and data directories
sudo mkdir -p /etc/loki /var/lib/loki
# Configure Loki
# (written through tee because /etc/loki is owned by root)
sudo tee /etc/loki/config.yml > /dev/null << 'EOF'
auth_enabled: false
server:
  http_listen_port: 3100
ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /var/lib/loki/boltdb-shipper-active
    cache_location: /var/lib/loki/boltdb-shipper-cache
    shared_store: filesystem
  filesystem:
    directory: /var/lib/loki/chunks
limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
EOF
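With Loki listening on port 3100, you can send it a log line by hand to check that ingestion works. Loki's push endpoint expects a JSON body with one or more labelled streams and timestamps in nanoseconds. The sketch below builds such a body; `loki_push_payload` and the `job` label are illustrative, and the helper does no JSON escaping, so keep the message free of double quotes and backslashes.

```shell
#!/bin/sh
# loki_push_payload is a hypothetical helper that prints a JSON body for
# Loki's POST /loki/api/v1/push endpoint.
# Arguments: job  message  timestamp_ns
# The message must not contain double quotes or backslashes (no escaping here).
loki_push_payload() {
    printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' \
        "$1" "$3" "$2"
}

# Usage (appends nine zeros to the epoch seconds to get nanoseconds):
#   loki_push_payload health "disk check ok" "$(date +%s)000000000" |
#       curl -s -H 'Content-Type: application/json' \
#           --data-binary @- http://localhost:3100/loki/api/v1/push
```

For real log shipping you would normally run an agent on each server instead of pushing lines by hand; this is only a connectivity check.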
Mobile Monitoring Access
Set up mobile-friendly monitoring:
# Create mobile dashboard
cat > ~/mobile_dashboard.html << 'EOF'
<!DOCTYPE html>
<html>
<head>
  <title>System Monitor</title>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <style>
    body { font-family: Arial; margin: 20px; }
    .metric { background: #f0f0f0; padding: 15px; margin: 10px 0; border-radius: 5px; }
    .value { font-size: 2em; font-weight: bold; }
    .label { color: #666; }
    .critical { color: red; }
    .warning { color: orange; }
    .ok { color: green; }
  </style>
</head>
<body>
  <h1>System Status</h1>
  <div id="metrics"></div>
  <script>
    async function fetchMetrics() {
      const response = await fetch('/api/v1/query?query=up');
      const data = await response.json();
      updateDisplay(data);
    }
    function updateDisplay(data) {
      // Update metric displays
      document.getElementById('metrics').innerHTML =
        data.data.result.map(m =>
          `<div class="metric">
            <div class="label">${m.metric.job}</div>
            <div class="value ${m.value[1] == '1' ? 'ok' : 'critical'}">
              ${m.value[1] == '1' ? 'UP' : 'DOWN'}
            </div>
          </div>`
        ).join('');
    }
    // Refresh every 30 seconds
    setInterval(fetchMetrics, 30000);
    fetchMetrics();
  </script>
</body>
</html>
EOF
Monitoring Best Practices
- Set reasonable intervals - Don't overload systems
- Store data efficiently - Use retention policies
- Secure access - Use HTTPS and authentication
- Test alerts - Ensure notifications work
- Document dashboards - Explain what metrics mean
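Intervals and retention together decide how much disk Prometheus needs. The Prometheus documentation gives a rough rule: retention time in seconds, times ingested samples per second, times bytes per sample (typically around 1 to 2). The sketch below applies that rule; `estimate_tsdb_bytes` is a hypothetical helper, and the series count in the example is an invented figure, not a measurement.

```shell
#!/bin/sh
# estimate_tsdb_bytes is a hypothetical helper applying the rule of thumb
#   disk = retention_seconds * samples_per_second * bytes_per_sample
# where samples_per_second = active_series / scrape_interval_seconds.
# Arguments: active_series scrape_interval_seconds retention_days [bytes_per_sample]
estimate_tsdb_bytes() {
    awk -v series="$1" -v interval="$2" -v days="$3" -v bps="${4:-2}" 'BEGIN {
        retention_seconds = days * 86400
        printf "%.0f\n", series * retention_seconds * bps / interval
    }'
}

# Example: 1000 series (an assumed figure), 15s scrape interval,
# 15 days of retention, 2 bytes per sample
estimate_tsdb_bytes 1000 15 15 2   # prints 172800000 (about 173 MB)
```

Halving the scrape interval doubles the estimate, which is a handy way to sanity-check an interval before rolling it out to every server.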
Troubleshooting
Prometheus Not Collecting Data
# Check targets
curl http://localhost:9090/api/v1/targets
# Verify exporters are running
ps aux | grep exporter
# Check firewall rules
sudo iptables -L -n
Grafana Connection Issues
# Test data source (quote the URL so the shell leaves "?" alone)
curl 'http://localhost:9090/api/v1/query?query=up'
# Check Grafana logs
sudo tail -f /var/log/grafana/grafana.log
Quick Commands
# Check Prometheus status
curl http://localhost:9090/-/healthy
# Reload Prometheus config
curl -X POST http://localhost:9090/-/reload
# Test alert rules
promtool check rules /etc/prometheus/alerts.yml
# Export Grafana dashboards
curl http://admin:admin@localhost:3000/api/dashboards/uid/xyz > dashboard.json
Conclusion
You now have a complete remote monitoring solution on Alpine Linux! With Prometheus collecting metrics, Grafana displaying beautiful dashboards, and alerts keeping you informed, you can monitor your infrastructure from anywhere. Remember to customize dashboards for your specific needs. Happy monitoring!