PostgreSQL Database Setup and Optimization on AlmaLinux: Enterprise Power
MySQL too basic for your needs? I thought so too until our startup hit complex queries that MySQL couldn't handle! We switched to PostgreSQL and suddenly our 30-second reports ran in 200ms! Advanced features like JSON support, full-text search, and real ACID compliance changed everything. Today I'm showing you how to unleash PostgreSQL's full power on AlmaLinux. Welcome to the big leagues!
Why PostgreSQL Rules the Enterprise
PostgreSQL isn't just another database - it's a powerhouse! Here's why giants use it:
- True ACID compliance - your data is SAFE
- Advanced data types - JSON, arrays, ranges, UUID
- Full-text search - built-in search engine
- GIS support - maps and location data
- MVCC - no read locks ever!
- Extensions - PostGIS, TimescaleDB, pg_stat_statements
True story: Instagram stores 500+ TB in PostgreSQL. If it scales for them with billions of users...
What You Need
Before we database like pros, ensure you have:
- AlmaLinux server with 4GB+ RAM
- Root or sudo access
- 20GB+ free disk space
- Basic SQL knowledge (helpful)
- 45 minutes to master PostgreSQL
- Coffee (database tuning needs focus!)
Step 1: Install PostgreSQL
Let's get the elephant in the room!
Install PostgreSQL 15
# Install PostgreSQL 15 repository
sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-8-x86_64/pgdg-redhat-repo-latest.noarch.rpm
# (On AlmaLinux 9, use the EL-9-x86_64 path instead)
# Disable built-in PostgreSQL module
sudo dnf -qy module disable postgresql
# Install PostgreSQL 15
sudo dnf install -y postgresql15-server postgresql15 postgresql15-contrib
# Install additional tools
sudo dnf install -y postgresql15-devel postgresql15-libs
# Optional GUI: pgAdmin 4 is distributed from its own repository (see pgadmin.org) and can be installed separately
# Check version
/usr/pgsql-15/bin/postgres --version
# Initialize database
sudo /usr/pgsql-15/bin/postgresql-15-setup initdb
# Enable and start PostgreSQL
sudo systemctl enable --now postgresql-15
sudo systemctl status postgresql-15
# Configure firewall
sudo firewall-cmd --permanent --add-service=postgresql
sudo firewall-cmd --reload
Initial Configuration
# Switch to postgres user
sudo -i -u postgres
# Access PostgreSQL prompt
psql
-- Set postgres user password
ALTER USER postgres PASSWORD 'StrongPassword123!';
-- Create a database
CREATE DATABASE production;
-- Create a user
CREATE USER appuser WITH ENCRYPTED PASSWORD 'AppPassword123!';
-- Grant privileges
GRANT ALL PRIVILEGES ON DATABASE production TO appuser;
-- Enable extensions
\c production
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- Exit
\q
exit
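A quick sanity check is worth doing before moving on. The commands below are a minimal sketch: PostgreSQL 15 no longer gives every role CREATE rights on the public schema, so the application role usually needs an explicit schema grant, and connecting over TCP as appuser confirms the password works.
# Grant schema rights to the application role (needed since PostgreSQL 15)
sudo -u postgres psql -d production -c "GRANT ALL ON SCHEMA public TO appuser;"
# Verify the role can log in over TCP and list installed extensions
psql -h 127.0.0.1 -U appuser -d production -c "SELECT current_user;"
sudo -u postgres psql -d production -c "\dx"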
Step 2: Configure PostgreSQL
Time to tune for performance!
Edit PostgreSQL Configuration
# Main configuration file
sudo nano /var/lib/pgsql/15/data/postgresql.conf
# Performance settings (adjust based on your RAM)
# For 8GB RAM server:
# Connection Settings
listen_addresses = '*' # Listen on all interfaces
max_connections = 200 # Maximum concurrent connections
superuser_reserved_connections = 3 # Reserved for superuser
# Memory Settings
shared_buffers = 2GB # 25% of RAM
effective_cache_size = 6GB # 75% of RAM
maintenance_work_mem = 512MB # For VACUUM, CREATE INDEX
work_mem = 10MB # Per sort/hash operation
wal_buffers = 16MB # Write-ahead log buffer
# Checkpoint Settings
checkpoint_completion_target = 0.9
checkpoint_timeout = 10min
max_wal_size = 4GB
min_wal_size = 1GB
# Query Planner
random_page_cost = 1.1 # For SSD (4.0 for HDD)
effective_io_concurrency = 200 # For SSD (1 for HDD)
default_statistics_target = 100
# Logging
logging_collector = on
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_rotation_age = 1d
log_rotation_size = 100MB
log_line_prefix = '%m [%p] %q%u@%d '
log_statement = 'all' # Log all statements (very verbose; consider 'ddl' or 'mod' in production)
log_duration = on # Log statement duration
log_min_duration_statement = 100 # Log slow queries (>100ms)
# Statistics
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
track_io_timing = on
track_functions = all
# Autovacuum
autovacuum = on
autovacuum_max_workers = 4
autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.05
# Restart PostgreSQL
sudo systemctl restart postgresql-15
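Before touching authentication, it helps to confirm the restart actually picked up the new values; this is a quick check, not a benchmark.
# Confirm the tuned settings are live and pg_stat_statements was preloaded
sudo -u postgres psql -c "SHOW shared_buffers;"
sudo -u postgres psql -c "SHOW shared_preload_libraries;"
# Anything still waiting for a restart shows up here
sudo -u postgres psql -c "SELECT name, setting FROM pg_settings WHERE pending_restart;"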
Configure Authentication
# Edit pg_hba.conf
sudo nano /var/lib/pgsql/15/data/pg_hba.conf
# TYPE DATABASE USER ADDRESS METHOD
# Local connections
local all all peer
# IPv4 local connections
host all all 127.0.0.1/32 scram-sha-256
# IPv4 remote connections (adjust network as needed)
host all all 192.168.1.0/24 scram-sha-256
host all all 10.0.0.0/8 scram-sha-256
# IPv6 local connections
host all all ::1/128 scram-sha-256
# Replication connections
host replication replicator 192.168.1.0/24 scram-sha-256
# Restart to apply changes
sudo systemctl restart postgresql-15
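With pg_hba.conf in place, test a connection from a client on one of the allowed networks; the host address below is a placeholder for your server's IP. If it fails, the pg_hba.conf rule order and the firewall are the usual suspects.
# From a client machine on 192.168.1.0/24 (replace the host with your server's address)
psql "host=192.168.1.10 port=5432 dbname=production user=appuser"
# On the server, watch authentication attempts in the PostgreSQL log
sudo tail -f /var/lib/pgsql/15/data/log/postgresql-*.log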
Step 3: Performance Optimization
Let's make it fly!
Create Optimized Tables
-- Connect to database
sudo -u postgres psql -d production
-- Create optimized table with indexes
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash TEXT NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
metadata JSONB,
is_active BOOLEAN DEFAULT true,
last_login TIMESTAMP WITH TIME ZONE
);
-- Create indexes for common queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_username ON users(username);
CREATE INDEX idx_users_created_at ON users(created_at DESC);
CREATE INDEX idx_users_active ON users(is_active) WHERE is_active = true;
CREATE INDEX idx_users_metadata ON users USING GIN(metadata);
-- Create partitioned table for time-series data
CREATE TABLE events (
id BIGSERIAL,
event_time TIMESTAMP WITH TIME ZONE NOT NULL,
event_type VARCHAR(50) NOT NULL,
user_id UUID REFERENCES users(id),
event_data JSONB,
ip_address INET,
user_agent TEXT
) PARTITION BY RANGE (event_time);
-- Create monthly partitions
CREATE TABLE events_2024_01 PARTITION OF events
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE events_2024_02 PARTITION OF events
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
-- Add indexes to partitions
CREATE INDEX idx_events_2024_01_time ON events_2024_01(event_time);
CREATE INDEX idx_events_2024_01_type ON events_2024_01(event_type);
CREATE INDEX idx_events_2024_01_user ON events_2024_01(user_id);
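-- Optional sanity check: with partition pruning, a query bounded to January
-- should touch only the events_2024_01 partition in the plan (hypothetical
-- data, so the count itself does not matter)
EXPLAIN (COSTS OFF)
SELECT count(*) FROM events
WHERE event_time >= '2024-01-01' AND event_time < '2024-02-01';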
Query Optimization Script
#!/bin/bash
# PostgreSQL optimization helper
cat > /usr/local/bin/pg-optimize.sh << 'EOF'
#!/bin/bash
DB_NAME="production"
PG_USER="postgres"
optimize_tables() {
echo "Running VACUUM ANALYZE..."
psql -U $PG_USER -d $DB_NAME -c "VACUUM ANALYZE;"
echo "Reindexing database..."
psql -U $PG_USER -d $DB_NAME -c "REINDEX DATABASE $DB_NAME;"
echo "Updating statistics..."
psql -U $PG_USER -d $DB_NAME -c "ANALYZE;"
}
check_bloat() {
echo "Checking table bloat..."
psql -U $PG_USER -d $DB_NAME << SQL
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size,
ROUND(100 * pg_total_relation_size(schemaname||'.'||tablename) /
NULLIF(SUM(pg_total_relation_size(schemaname||'.'||tablename))
OVER (), 0), 2) AS percent
FROM pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 10;
SQL
}
find_slow_queries() {
echo "Finding slow queries..."
psql -U $PG_USER -d $DB_NAME << SQL
SELECT
mean_exec_time,
calls,
total_exec_time,
query
FROM pg_stat_statements
WHERE query NOT LIKE '%pg_stat_statements%'
ORDER BY mean_exec_time DESC
LIMIT 10;
SQL
}
check_indexes() {
echo "Checking unused indexes..."
psql -U $PG_USER -d $DB_NAME << SQL
SELECT
schemaname,
tablename,
indexname,
idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan < 50
ORDER BY pg_relation_size(indexrelid) DESC;
SQL
}
suggest_indexes() {
echo "Suggesting missing indexes..."
psql -U $PG_USER -d $DB_NAME << SQL
SELECT
schemaname,
tablename,
attname,
n_distinct,
correlation
FROM pg_stats
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
AND n_distinct > 100
AND correlation < 0.1
ORDER BY n_distinct DESC
LIMIT 10;
SQL
}
# Menu
case "$1" in
optimize)
optimize_tables
;;
bloat)
check_bloat
;;
slow)
find_slow_queries
;;
indexes)
check_indexes
;;
suggest)
suggest_indexes
;;
all)
optimize_tables
check_bloat
find_slow_queries
check_indexes
suggest_indexes
;;
*)
echo "Usage: $0 {optimize|bloat|slow|indexes|suggest|all}"
;;
esac
EOF
chmod +x /usr/local/bin/pg-optimize.sh
# Run optimization (as the postgres user so peer authentication works)
sudo -u postgres /usr/local/bin/pg-optimize.sh all
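Running the helper once is nice; running it on a schedule is what keeps the database healthy. A weekly entry in the postgres user's crontab is a reasonable sketch (the day and time are arbitrary, and the reindex step takes locks, so pick a quiet window):
# Run weekly maintenance on Sunday at 03:00 as the postgres user, keeping existing cron entries
sudo -u postgres bash -c '(crontab -l 2>/dev/null; echo "0 3 * * 0 /usr/local/bin/pg-optimize.sh optimize >> /var/lib/pgsql/pg-optimize.log 2>&1") | crontab -'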
Step 4: Replication and Backup
Never lose data again!
Set Up Streaming Replication
# On PRIMARY server
sudo -u postgres psql
-- Create replication user
CREATE USER replicator WITH REPLICATION ENCRYPTED PASSWORD 'ReplPassword123!';
-- Exit
\q
# Configure primary for replication
sudo nano /var/lib/pgsql/15/data/postgresql.conf
wal_level = replica
max_wal_senders = 3
wal_keep_size = 1GB
hot_standby = on
archive_mode = on
archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
# Create archive directory
sudo mkdir -p /archive
sudo chown postgres:postgres /archive
# Restart primary
sudo systemctl restart postgresql-15
# On REPLICA server
# Stop PostgreSQL
sudo systemctl stop postgresql-15
# Remove old data
sudo rm -rf /var/lib/pgsql/15/data/*
# Take base backup from primary
sudo -u postgres pg_basebackup \
-h primary-server-ip \
-D /var/lib/pgsql/15/data \
-U replicator \
-P -v -R -X stream -C -S replica1
# Note: pg_basebackup -R already creates standby.signal and writes
# primary_conninfo to postgresql.auto.conf; create it manually only if you skipped -R
sudo -u postgres touch /var/lib/pgsql/15/data/standby.signal
# Configure replica
sudo nano /var/lib/pgsql/15/data/postgresql.conf
primary_conninfo = 'host=primary-server-ip port=5432 user=replicator password=ReplPassword123!'
primary_slot_name = 'replica1'
hot_standby = on
# Start replica
sudo systemctl start postgresql-15
# Verify replication
sudo -u postgres psql -c "SELECT * FROM pg_stat_replication;"
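On the replica itself you can confirm it is actually replaying WAL and see how far behind it is; a lag of a second or two on a quiet link is normal.
# On the REPLICA: confirm recovery mode and measure replay lag
sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
sudo -u postgres psql -c "SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;"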
Automated Backup Script
#!/bin/bash
# PostgreSQL backup script
cat > /usr/local/bin/pg-backup.sh << 'EOF'
#!/bin/bash
# Configuration
BACKUP_DIR="/backup/postgresql"
DB_NAME="production"
DB_USER="postgres"
S3_BUCKET="s3://my-backups/postgresql"
RETENTION_DAYS=30
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# Create backup directory
mkdir -p $BACKUP_DIR
backup_database() {
echo "Starting backup of $DB_NAME..."
# Full backup with custom format
pg_dump -U $DB_USER -d $DB_NAME -F c -b -v \
-f "$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.backup"
# SQL format backup
pg_dump -U $DB_USER -d $DB_NAME -F p --column-inserts \
-f "$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.sql"
# Compress SQL backup
gzip "$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.sql"
echo "Backup completed!"
}
backup_globals() {
echo "Backing up global objects..."
pg_dumpall -U $DB_USER --globals-only \
-f "$BACKUP_DIR/globals_${TIMESTAMP}.sql"
gzip "$BACKUP_DIR/globals_${TIMESTAMP}.sql"
}
backup_config() {
echo "Backing up configuration..."
tar -czf "$BACKUP_DIR/config_${TIMESTAMP}.tar.gz" \
/var/lib/pgsql/15/data/*.conf
}
verify_backup() {
echo "Verifying backup..."
pg_restore -U $DB_USER --list \
"$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.backup" > /dev/null
if [ $? -eq 0 ]; then
echo "Backup verified successfully!"
else
echo "Backup verification failed!"
exit 1
fi
}
upload_to_s3() {
echo "Uploading to S3..."
aws s3 cp "$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.backup" \
"$S3_BUCKET/${DB_NAME}_${TIMESTAMP}.backup"
}
cleanup_old_backups() {
echo "Cleaning old backups..."
find $BACKUP_DIR -name "*.backup" -mtime +$RETENTION_DAYS -delete
find $BACKUP_DIR -name "*.sql.gz" -mtime +$RETENTION_DAYS -delete
}
# Main execution
backup_database
backup_globals
backup_config
verify_backup
# upload_to_s3 # Uncomment if using S3
cleanup_old_backups
echo "Backup process completed!"
echo "Backup location: $BACKUP_DIR"
echo "Backup size: $(du -sh $BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.backup | cut -f1)"
EOF
chmod +x /usr/local/bin/pg-backup.sh
# Add to the postgres user's crontab (appending, so existing entries are preserved)
sudo -u postgres bash -c '(crontab -l 2>/dev/null; echo "0 2 * * * /usr/local/bin/pg-backup.sh") | crontab -'
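An untested backup is just a hope. A minimal restore drill takes a few minutes: restore the latest custom-format dump into a scratch database, spot-check it, and drop it. The filename below is whatever the script produced, and restore_test is a throwaway name.
# Restore the newest dump into a scratch database and verify it
sudo -u postgres createdb restore_test
sudo -u postgres pg_restore -d restore_test -j 2 /backup/postgresql/production_YYYYMMDD_HHMMSS.backup
sudo -u postgres psql -d restore_test -c "\dt"
sudo -u postgres dropdb restore_test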
Quick Examples
Example 1: High-Performance Application Database
-- Create high-performance schema
CREATE SCHEMA app;
-- User table with optimal settings
CREATE TABLE app.users (
id BIGSERIAL PRIMARY KEY,
uuid UUID DEFAULT uuid_generate_v4() UNIQUE,
username VARCHAR(30) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash CHAR(60) NOT NULL,
profile JSONB DEFAULT '{}',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
) WITH (fillfactor = 90);
-- Session table with automatic expiry
CREATE TABLE app.sessions (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
user_id BIGINT REFERENCES app.users(id) ON DELETE CASCADE,
token TEXT UNIQUE NOT NULL,
ip_address INET,
user_agent TEXT,
expires_at TIMESTAMPTZ NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Create composite index for session lookups
-- (NOW() cannot be used in a partial index predicate because it is not IMMUTABLE,
-- so filter expired sessions at query time instead)
CREATE INDEX idx_sessions_user_expires
ON app.sessions(user_id, expires_at);
-- Products with full-text search
CREATE TABLE app.products (
id BIGSERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
description TEXT,
price DECIMAL(10,2) NOT NULL,
tags TEXT[],
metadata JSONB DEFAULT '{}',
search_vector tsvector,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Full-text search index
CREATE INDEX idx_products_search ON app.products USING GIN(search_vector);
-- Trigger to update search vector
CREATE OR REPLACE FUNCTION app.update_search_vector()
RETURNS TRIGGER AS $$
BEGIN
NEW.search_vector :=
setweight(to_tsvector('english', NEW.name), 'A') ||
setweight(to_tsvector('english', COALESCE(NEW.description, '')), 'B');
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trigger_update_search
BEFORE INSERT OR UPDATE ON app.products
FOR EACH ROW
EXECUTE FUNCTION app.update_search_vector();
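-- Quick check that the trigger and GIN index work (hypothetical product row):
INSERT INTO app.products (name, description, price)
VALUES ('Ergonomic keyboard', 'Mechanical keyboard with a padded wrist rest', 89.99);
SELECT name, ts_rank(search_vector, query) AS rank
FROM app.products, to_tsquery('english', 'keyboard & wrist') AS query
WHERE search_vector @@ query
ORDER BY rank DESC;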
-- Orders with automatic archiving
CREATE TABLE app.orders (
id BIGSERIAL,
user_id BIGINT REFERENCES app.users(id),
order_data JSONB NOT NULL,
total DECIMAL(10,2) NOT NULL,
status VARCHAR(20) DEFAULT 'pending',
created_at TIMESTAMPTZ DEFAULT NOW(),
-- A primary key on a partitioned table must include the partition key
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
-- Create monthly partitions
CREATE TABLE app.orders_2024_01 PARTITION OF app.orders
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
-- Function to automatically create partitions
CREATE OR REPLACE FUNCTION app.create_monthly_partition()
RETURNS void AS $$
DECLARE
start_date DATE;
end_date DATE;
partition_name TEXT;
BEGIN
start_date := DATE_TRUNC('month', CURRENT_DATE + INTERVAL '1 month');
end_date := start_date + INTERVAL '1 month';
partition_name := 'orders_' || TO_CHAR(start_date, 'YYYY_MM');
EXECUTE format('CREATE TABLE IF NOT EXISTS app.%I PARTITION OF app.orders
FOR VALUES FROM (%L) TO (%L)',
partition_name, start_date, end_date);
END;
$$ LANGUAGE plpgsql;
-- Schedule partition creation (requires the pg_cron extension to be installed
-- and added to shared_preload_libraries)
SELECT cron.schedule('create-partitions', '0 0 25 * *',
'SELECT app.create_monthly_partition()');
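-- You can also call the partition function by hand and confirm the new
-- partition exists (useful before relying on the scheduled job):
SELECT app.create_monthly_partition();
SELECT inhrelid::regclass AS partition
FROM pg_inherits
WHERE inhparent = 'app.orders'::regclass;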
Example 2: Real-time Analytics Dashboard
#!/usr/bin/env python3
# pg_analytics.py - Real-time PostgreSQL analytics
import psycopg2
import psycopg2.extras
from datetime import datetime, timedelta
import json
from flask import Flask, jsonify, render_template_string
app = Flask(__name__)
# Database connection
def get_db():
return psycopg2.connect(
host="localhost",
database="production",
user="appuser",
password="AppPassword123!",
cursor_factory=psycopg2.extras.RealDictCursor
)
@app.route('/api/metrics')
def get_metrics():
conn = get_db()
cur = conn.cursor()
metrics = {}
# Database size
cur.execute("""
SELECT pg_database_size('production') as size,
pg_size_pretty(pg_database_size('production')) as pretty_size
""")
metrics['database'] = cur.fetchone()
# Active connections
cur.execute("""
SELECT count(*) as connections,
max(query_start) as latest_query
FROM pg_stat_activity
WHERE state = 'active'
""")
metrics['connections'] = cur.fetchone()
# Slow queries
cur.execute("""
SELECT query,
calls,
mean_exec_time,
total_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 100
ORDER BY mean_exec_time DESC
LIMIT 5
""")
metrics['slow_queries'] = cur.fetchall()
# Cache hit ratio
cur.execute("""
SELECT
sum(heap_blks_hit) * 100.0 / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) as cache_hit_ratio
FROM pg_statio_user_tables
""")
metrics['cache_hit_ratio'] = cur.fetchone()['cache_hit_ratio']
# Table statistics
cur.execute("""
SELECT
schemaname,
tablename,
n_live_tup as rows,
n_dead_tup as dead_rows,
last_vacuum,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC
LIMIT 10
""")
metrics['tables'] = cur.fetchall()
cur.close()
conn.close()
return jsonify(metrics)
@app.route('/api/realtime')
def realtime_activity():
conn = get_db()
cur = conn.cursor()
# Current queries
cur.execute("""
SELECT
pid,
usename,
application_name,
client_addr,
backend_start,
state,
query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY backend_start DESC
""")
activity = cur.fetchall()
cur.close()
conn.close()
return jsonify(activity)
@app.route('/')
def dashboard():
html = '''
<!DOCTYPE html>
<html>
<head>
<title>PostgreSQL Dashboard</title>
<style>
body { font-family: Arial; padding: 20px; background: #f0f0f0; }
.metric {
background: white;
padding: 20px;
margin: 10px;
border-radius: 8px;
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
}
.value { font-size: 36px; color: #2196F3; font-weight: bold; }
.label { color: #666; margin-top: 5px; }
table { width: 100%; border-collapse: collapse; }
th, td { padding: 10px; text-align: left; border-bottom: 1px solid #ddd; }
.slow { color: #ff5722; }
.healthy { color: #4caf50; }
</style>
</head>
<body>
<h1>PostgreSQL Real-time Dashboard</h1>
<div id="metrics"></div>
<script>
function updateDashboard() {
fetch('/api/metrics')
.then(response => response.json())
.then(data => {
let html = '<div class="metric">';
html += '<div class="value">' + data.database.pretty_size + '</div>';
html += '<div class="label">Database Size</div>';
html += '</div>';
html += '<div class="metric">';
html += '<div class="value">' + data.connections.connections + '</div>';
html += '<div class="label">Active Connections</div>';
html += '</div>';
html += '<div class="metric">';
let ratio = parseFloat(data.cache_hit_ratio).toFixed(2);
let colorClass = ratio > 95 ? 'healthy' : 'slow';
html += '<div class="value ' + colorClass + '">' + ratio + '%</div>';
html += '<div class="label">Cache Hit Ratio</div>';
html += '</div>';
html += '<div class="metric"><h3>Top Tables</h3><table>';
html += '<tr><th>Table</th><th>Rows</th><th>Dead Rows</th></tr>';
data.tables.forEach(table => {
html += '<tr>';
html += '<td>' + table.tablename + '</td>';
html += '<td>' + table.rows + '</td>';
html += '<td>' + table.dead_rows + '</td>';
html += '</tr>';
});
html += '</table></div>';
document.getElementById('metrics').innerHTML = html;
});
}
updateDashboard();
setInterval(updateDashboard, 5000);
</script>
</body>
</html>
'''
return render_template_string(html)
if __name__ == '__main__':
app.run(host='0.0.0.0', port=5000)  # not 5432, which PostgreSQL is already using
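To try the dashboard, install the two Python dependencies and start the app; the package names below are the usual ones, so adjust them to your environment. It listens on port 5000 and will not collide with PostgreSQL on 5432.
# Install dependencies and run the dashboard (Flask's development server, not for production)
pip3 install --user flask psycopg2-binary
python3 pg_analytics.py
# Then open http://your-server:5000/ in a browser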
Example 3: Automatic Performance Tuner
#!/bin/bash
# PostgreSQL auto-tuner
cat > /usr/local/bin/pg-autotune.py << 'EOF'
#!/usr/bin/env python3
import psycopg2
import os
import subprocess
class PostgreSQLTuner:
def __init__(self):
self.conn = psycopg2.connect(
host="localhost",
database="postgres",
user="postgres",
password="StrongPassword123!"
)
self.cur = self.conn.cursor()
self.total_ram = self.get_system_ram()
def get_system_ram(self):
"""Get total system RAM in MB"""
with open('/proc/meminfo') as f:
for line in f:
if line.startswith('MemTotal:'):
return int(line.split()[1]) // 1024
return 4096 # Default 4GB
def calculate_settings(self):
"""Calculate optimal settings based on system resources"""
settings = {}
# Memory settings
settings['shared_buffers'] = f"{self.total_ram // 4}MB"
settings['effective_cache_size'] = f"{self.total_ram * 3 // 4}MB"
settings['maintenance_work_mem'] = f"{min(self.total_ram // 16, 2048)}MB"
settings['work_mem'] = f"{self.total_ram // 100}MB"
# Connection settings
cpu_count = os.cpu_count()
settings['max_connections'] = min(cpu_count * 25, 300)
settings['max_worker_processes'] = cpu_count
settings['max_parallel_workers'] = cpu_count
settings['max_parallel_workers_per_gather'] = max(cpu_count // 2, 2)
# WAL settings
settings['wal_buffers'] = '16MB'
settings['checkpoint_completion_target'] = '0.9'
settings['max_wal_size'] = '4GB'
settings['min_wal_size'] = '1GB'
# Storage type detection (crude heuristic based on the root device name)
root_device = subprocess.check_output(['df', '/']).decode().split('\n')[1].split()[0]
if 'nvme' in root_device or 'ssd' in root_device:
settings['random_page_cost'] = '1.1'
settings['effective_io_concurrency'] = '200'
else:
settings['random_page_cost'] = '4.0'
settings['effective_io_concurrency'] = '2'
return settings
def apply_settings(self, settings):
"""Apply settings to PostgreSQL"""
config_file = '/var/lib/pgsql/15/data/postgresql.auto.conf'
with open(config_file, 'w') as f:
f.write("# Auto-generated by pg-autotune\n")
for key, value in settings.items():
f.write(f"{key} = {value}\n")
print("Settings written to postgresql.auto.conf")
print("Restart PostgreSQL to apply: sudo systemctl restart postgresql-15")
def analyze_workload(self):
"""Analyze current workload and suggest optimizations"""
suggestions = []
# Check cache hit ratio
self.cur.execute("""
SELECT
sum(heap_blks_hit) / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) * 100 as ratio
FROM pg_statio_user_tables
""")
cache_ratio = self.cur.fetchone()[0]
if cache_ratio and cache_ratio < 90:
suggestions.append(f"Cache hit ratio is {cache_ratio:.2f}%. Consider increasing shared_buffers.")
# Check for missing indexes
self.cur.execute("""
SELECT
schemaname,
tablename,
seq_scan,
seq_tup_read,
idx_scan,
idx_tup_fetch
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan
AND n_live_tup > 10000
ORDER BY seq_scan - idx_scan DESC
LIMIT 5
""")
tables_need_index = self.cur.fetchall()
if tables_need_index:
suggestions.append("These tables might benefit from indexes:")
for table in tables_need_index:
suggestions.append(f" - {table[0]}.{table[1]} (seq_scans: {table[2]})")
# Check for bloat
self.cur.execute("""
SELECT
schemaname,
tablename,
n_dead_tup,
n_live_tup,
round(n_dead_tup * 100.0 / nullif(n_live_tup, 0)) as dead_percent
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 5
""")
bloated_tables = self.cur.fetchall()
if bloated_tables:
suggestions.append("These tables need VACUUM:")
for table in bloated_tables:
suggestions.append(f" - {table[0]}.{table[1]} ({table[4]}% dead tuples)")
return suggestions
def run(self):
"""Run the auto-tuner"""
print("PostgreSQL Auto-Tuner")
print("=" * 50)
print(f"System RAM: {self.total_ram}MB")
print(f"CPU Cores: {os.cpu_count()}")
print()
print("Calculating optimal settings...")
settings = self.calculate_settings()
print("\nRecommended Settings:")
for key, value in settings.items():
print(f" {key} = {value}")
print("\nAnalyzing workload...")
suggestions = self.analyze_workload()
if suggestions:
print("\nOptimization Suggestions:")
for suggestion in suggestions:
print(suggestion)
print("\n")
response = input("Apply settings? (y/n): ")
if response.lower() == 'y':
self.apply_settings(settings)
self.cur.close()
self.conn.close()
if __name__ == '__main__':
tuner = PostgreSQLTuner()
tuner.run()
EOF
chmod +x /usr/local/bin/pg-autotune.py
# Needs the psycopg2 driver (e.g. pip3 install psycopg2-binary) and write access
# to the data directory, so run it with sudo
sudo python3 /usr/local/bin/pg-autotune.py
Fix Common Problems
Problem 1: Slow Queries
Queries taking forever?
-- Find slow queries
SELECT
query,
calls,
mean_exec_time,
total_exec_time
FROM pg_stat_statements
WHERE mean_exec_time > 1000
ORDER BY mean_exec_time DESC;
-- Explain query plan
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM large_table WHERE my_column = 'value';
-- Create missing index (CONCURRENTLY avoids blocking writes)
CREATE INDEX CONCURRENTLY idx_large_table_my_column ON large_table(my_column);
Problem 2: Connection Refused
Can't connect to PostgreSQL?
# Check if running
sudo systemctl status postgresql-15
# Check listen address
grep listen_addresses /var/lib/pgsql/15/data/postgresql.conf
# Check pg_hba.conf
sudo cat /var/lib/pgsql/15/data/pg_hba.conf
# Test connection
psql -h localhost -U postgres -d postgres
Problem 3: Out of Memory
PostgreSQL using too much RAM?
# Check current memory usage
sudo -u postgres psql -c "SHOW shared_buffers;"
sudo -u postgres psql -c "SHOW work_mem;"
# Reduce memory settings
sudo -u postgres psql -c "ALTER SYSTEM SET shared_buffers = '1GB';"
sudo -u postgres psql -c "ALTER SYSTEM SET work_mem = '4MB';"
# Restart
sudo systemctl restart postgresql-15
Problem 4: Disk Full
Running out of space?
-- Check database sizes
SELECT
datname,
pg_size_pretty(pg_database_size(datname))
FROM pg_database
ORDER BY pg_database_size(datname) DESC;
-- Find large tables
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC
LIMIT 10;
-- Clean up (VACUUM FULL rewrites tables and takes exclusive locks, so schedule a maintenance window)
VACUUM FULL;
REINDEX DATABASE production;
Simple Commands Summary
| Task | Command |
|---|---|
| Connect | psql -U postgres -d database |
| List databases | \l |
| List tables | \dt |
| Describe table | \d tablename |
| Backup | pg_dump -d database > backup.sql |
| Restore | psql -d database < backup.sql |
| Activity | SELECT * FROM pg_stat_activity; |
| Vacuum | VACUUM ANALYZE; |
Tips for Success
- Monitor Everything - pg_stat_statements is gold
- Index Wisely - too many indexes slow down writes
- Partition Large Tables - keep them manageable
- Use Connection Pooling - PgBouncer helps (see the sketch after this list)
- Regular Maintenance - VACUUM and ANALYZE
- Test Backups - an untested backup is no backup
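The connection pooling tip deserves a concrete starting point. The sketch below installs PgBouncer (available from the PGDG repository added earlier) and uses transaction-level pooling; the pool sizes, listen address, and the plaintext userlist entry are illustrative placeholders, not tuned or hardened values.
# Minimal PgBouncer sketch (illustrative values, adjust before production)
sudo dnf install -y pgbouncer
sudo tee /etc/pgbouncer/pgbouncer.ini > /dev/null << 'EOF'
[databases]
production = host=127.0.0.1 port=5432 dbname=production
[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 500
default_pool_size = 20
EOF
echo '"appuser" "AppPassword123!"' | sudo tee /etc/pgbouncer/userlist.txt > /dev/null
sudo systemctl enable --now pgbouncer
# Applications now point at port 6432 instead of 5432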
Pro tip: Enable log_min_duration_statement = 100 to automatically log all queries slower than 100ms. Found so many issues this way!
What You Learned
You're now a PostgreSQL master! You can:
- Install and configure PostgreSQL
- Optimize for performance
- Set up replication
- Implement backup strategies
- Monitor and tune queries
- Use advanced features
- Troubleshoot issues
Why This Matters
PostgreSQL provides:
- Enterprise performance
- Rock-solid reliability
- Advanced features
- Geospatial support
- ACID compliance
- Extensibility
Our startup migrated from MySQL to PostgreSQL when we needed JSON support and window functions. Query performance improved 10x, and we could finally do complex analytics in SQL instead of application code. Best decision ever!
Remember: PostgreSQL isn't just a database, it's a data platform! Master it!
Happy querying! May your JOINs be fast and your data be consistent!