Prerequisites
- Basic understanding of programming concepts
- Python installation (3.8+)
- VS Code or preferred IDE
What you'll learn
- Understand the concept fundamentals
- Apply the concept in real projects
- Debug common issues
- Write clean, Pythonic code
Introduction
Welcome to this exciting tutorial on the ELK Stack with Python! In this guide, we'll explore how to implement powerful centralized logging for your Python applications using Elasticsearch, Logstash, and Kibana.
You'll discover how the ELK Stack can transform your logging experience from scattered log files into a centralized, searchable, and visualizable system. Whether you're building microservices, debugging production issues, or monitoring application health, understanding ELK Stack integration is essential for modern Python development.
By the end of this tutorial, you'll feel confident implementing comprehensive logging solutions in your own projects! Let's dive in!
Understanding the ELK Stack
What is the ELK Stack?
The ELK Stack is like having a super-powered detective agency for your logs. Think of it as a three-person team where each member has a special skill:
- Elasticsearch: The search expert who can find any log instantly
- Logstash: The organizer who collects and processes logs
- Kibana: The artist who creates beautiful visualizations
In Python terms, the ELK Stack helps you:
- Centralize logs from multiple applications
- Search through millions of logs in milliseconds
- Monitor application health in real time
- Create dashboards and alerts
- Debug issues faster with powerful queries
Why Use ELK Stack with Python?
Here's why developers love the ELK Stack for logging:
- Scalability: Handle logs from one app or thousands
- Real-time Processing: See logs as they happen
- Powerful Search: Find specific logs instantly
- Beautiful Dashboards: Visualize trends and patterns
- Alerting: Get notified when things go wrong
Real-world example: Imagine monitoring an e-commerce platform. With the ELK Stack, you can track user actions, system errors, and performance metrics all in one place!
Basic Syntax and Usage
Setting Up Python for ELK
Let's start with a friendly example of sending logs to Elasticsearch:
# Hello, ELK Stack!
import datetime
import logging

from elasticsearch import Elasticsearch

# Create the Elasticsearch connection (adjust the URL for your cluster)
es = Elasticsearch(['http://localhost:9200'])

# Custom handler that ships each record to Elasticsearch
class ElasticsearchHandler(logging.Handler):
    def __init__(self, es_client, index_name='python-logs'):
        super().__init__()
        self.es_client = es_client
        self.index_name = index_name

    def emit(self, record):
        # Convert the log record to a dict
        log_entry = {
            'timestamp': datetime.datetime.utcnow().isoformat(),
            'level': record.levelname,
            'logger': record.name,
            'message': record.getMessage(),
            'module': record.module,
            'function': record.funcName,
            'line': record.lineno
        }
        # Send to a daily index (python-logs-YYYY-MM-DD)
        self.es_client.index(
            index=f"{self.index_name}-{datetime.date.today()}",
            body=log_entry  # on elasticsearch-py 8.x, use document=log_entry
        )

# Set up logging
logger = logging.getLogger('my_app')
logger.setLevel(logging.INFO)

# Add the Elasticsearch handler
es_handler = ElasticsearchHandler(es)
logger.addHandler(es_handler)

# Log some messages!
logger.info("Application started!")
logger.warning("This is a warning")
Explanation: We create a custom handler that sends each log entry to Elasticsearch with a timestamp and useful metadata.
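Once logs are indexed, you can query them back from Python as well. Here is a minimal sketch, assuming the daily python-logs-* indices created above and a cluster on localhost:

# Query the logs we just shipped (assumes the python-logs-* indices from above)
from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])

response = es.search(
    index='python-logs-*',  # match all daily indices
    body={                  # on elasticsearch-py 8.x, pass query=... instead of body=
        'query': {
            'match': {'level': 'WARNING'}  # find warning-level entries
        },
        'sort': [{'timestamp': {'order': 'desc'}}],
        'size': 10
    }
)

for hit in response['hits']['hits']:
    print(hit['_source']['timestamp'], hit['_source']['message'])

Kibana runs the same Query DSL against Elasticsearch under the hood, so anything you explore there can also be automated from Python.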
Using python-logstash
Here's how to send logs to Logstash:
# Using the python-logstash library
import logging

import logstash

# Create a logger
logger = logging.getLogger('python-logstash-logger')
logger.setLevel(logging.INFO)

# Add a Logstash handler (TCP input listening on port 5959)
logstash_handler = logstash.TCPLogstashHandler(
    host='localhost',
    port=5959,
    version=1  # Logstash message format version
)
logger.addHandler(logstash_handler)

# Log with extra fields
extra = {
    'user_id': '12345',
    'action': 'purchase',
    'product': 'Python Book',
    'price': 29.99
}
logger.info('User made a purchase!', extra=extra)
Practical Examples
Example 1: E-commerce Application Logging
Let's build a comprehensive logging system for an online store:
# E-commerce logging system
import logging
from datetime import datetime

import logstash
from elasticsearch import Elasticsearch

class EcommerceLogger:
    def __init__(self):
        # Set up Elasticsearch
        self.es = Elasticsearch(['http://localhost:9200'])

        # Create the logger
        self.logger = logging.getLogger('ecommerce')
        self.logger.setLevel(logging.INFO)

        # Add a Logstash handler
        logstash_handler = logstash.TCPLogstashHandler(
            host='localhost',
            port=5959
        )
        self.logger.addHandler(logstash_handler)

    def log_user_action(self, user_id, action, details):
        # Log user activities
        log_data = {
            'timestamp': datetime.utcnow().isoformat(),
            'user_id': user_id,
            'action': action,
            'details': details,
            'session_id': self._get_session_id()
        }
        self.logger.info(f"User action: {action}", extra=log_data)

    def log_purchase(self, user_id, items, total):
        # Log purchase events
        purchase_data = {
            'user_id': user_id,
            'items': items,
            'total': total,
            'timestamp': datetime.utcnow().isoformat()
        }
        # Also send to Elasticsearch for analytics
        self.es.index(
            index='purchases',
            body=purchase_data
        )
        self.logger.info(f"Purchase completed! Total: ${total}",
                         extra=purchase_data)

    def log_error(self, error_type, message, stack_trace=None):
        # Log errors with context
        error_data = {
            'error_type': error_type,
            'error_message': message,  # 'message' itself would clash with the LogRecord field
            'stack_trace': stack_trace,
            'timestamp': datetime.utcnow().isoformat()
        }
        self.logger.error(f"Error occurred: {error_type}",
                          extra=error_data)

    def _get_session_id(self):
        # Simulate a session ID
        return f"session_{datetime.now().timestamp()}"

# Let's use it!
logger = EcommerceLogger()

# Log user browsing
logger.log_user_action(
    user_id="user_123",
    action="view_product",
    details={'product_id': 'py_book_001', 'category': 'books'}
)

# Log a purchase
items = [
    {'name': 'Python Cookbook', 'price': 45.99},
    {'name': 'ELK Stack Guide', 'price': 39.99}
]
logger.log_purchase(
    user_id="user_123",
    items=items,
    total=85.98
)
Try it yourself: Add cart abandonment tracking and performance metrics logging!
Example 2: Microservices Log Aggregation
Let's create a logging system for microservices:
# Microservices logging with correlation IDs
import logging
import uuid
from contextvars import ContextVar

import logstash
from pythonjsonlogger import jsonlogger

# Correlation ID for request tracking
correlation_id = ContextVar('correlation_id', default=None)

class MicroserviceLogger:
    def __init__(self, service_name):
        self.service_name = service_name
        self.logger = self._setup_logger()

    def _setup_logger(self):
        # Create a logger with a JSON formatter
        logger = logging.getLogger(self.service_name)
        logger.setLevel(logging.INFO)

        # JSON formatter for structured console output
        json_handler = logging.StreamHandler()
        formatter = jsonlogger.JsonFormatter()
        json_handler.setFormatter(formatter)

        # Logstash handler
        logstash_handler = logstash.TCPLogstashHandler(
            host='logstash.internal',
            port=5959
        )

        logger.addHandler(json_handler)
        logger.addHandler(logstash_handler)
        return logger

    def _get_base_fields(self):
        # Common fields added to every log entry
        return {
            'service': self.service_name,
            'correlation_id': correlation_id.get() or str(uuid.uuid4()),
            'environment': 'production',
            'version': '1.0.0'
        }

    def info(self, message, **kwargs):
        # Log info with context
        extra = {**self._get_base_fields(), **kwargs}
        self.logger.info(message, extra=extra)

    def error(self, message, exception=None, **kwargs):
        # Log errors with exception details
        extra = {**self._get_base_fields(), **kwargs}
        if exception:
            extra['exception_type'] = type(exception).__name__
            extra['exception_message'] = str(exception)
        self.logger.error(message, extra=extra, exc_info=exception)

    def log_api_request(self, method, path, duration_ms, status_code):
        # Log API metrics
        self.info(
            f"API Request: {method} {path}",
            method=method,
            path=path,
            duration_ms=duration_ms,
            status_code=status_code,
            request_type='api'
        )

    def log_database_query(self, query, duration_ms, rows_affected):
        # Log database operations
        self.info(
            "Database query executed",
            query=query[:100],  # Truncate long queries
            duration_ms=duration_ms,
            rows_affected=rows_affected,
            operation_type='database'
        )

# Example usage in a Flask microservice
import time

from flask import Flask, request, g

app = Flask(__name__)
logger = MicroserviceLogger('user-service')

@app.before_request
def before_request():
    # Set the correlation ID for this request
    request_id = request.headers.get('X-Correlation-ID', str(uuid.uuid4()))
    correlation_id.set(request_id)
    g.start_time = time.time()

    logger.info(
        "Request started",
        method=request.method,
        path=request.path,
        remote_addr=request.remote_addr
    )

@app.after_request
def after_request(response):
    # Log request completion
    duration = (time.time() - g.start_time) * 1000
    logger.log_api_request(
        method=request.method,
        path=request.path,
        duration_ms=duration,
        status_code=response.status_code
    )
    return response

@app.route('/users/<user_id>')
def get_user(user_id):
    try:
        # Log business logic
        logger.info("Fetching user", user_id=user_id)

        # Simulate a database query
        start = time.time()
        user = fetch_user_from_db(user_id)  # Your DB function
        duration = (time.time() - start) * 1000

        logger.log_database_query(
            query=f"SELECT * FROM users WHERE id = {user_id}",
            duration_ms=duration,
            rows_affected=1
        )

        return {'user': user, 'status': 'success'}
    except Exception as e:
        # Log errors with full context
        logger.error(
            "Failed to fetch user",
            exception=e,
            user_id=user_id
        )
        return {'error': 'User not found'}, 404
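One detail the example leaves implicit: the correlation ID only ties services together if every outgoing call forwards it. A minimal sketch of propagating the header to a downstream service (the order-service URL is hypothetical):

# Forward the correlation ID on outgoing HTTP calls (hypothetical downstream URL)
import requests

def call_order_service(user_id):
    headers = {'X-Correlation-ID': correlation_id.get() or ''}
    # The downstream service picks the header up in its own before_request hook
    return requests.get(
        f'http://order-service.internal/orders/{user_id}',
        headers=headers,
        timeout=2
    )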
Advanced Concepts
Advanced Topic 1: Custom Log Enrichment
When you're ready to level up, try this advanced pattern:
# Advanced log enrichment with context
import logging
import platform
import time
from functools import wraps

import psutil

class EnrichedLogger:
    def __init__(self, service_name):
        self.service_name = service_name
        self.logger = self._setup_logger()
        self.enrichers = []

    def _setup_logger(self):
        # Use whatever handler setup you prefer (a plain console handler here for brevity)
        logger = logging.getLogger(self.service_name)
        logger.setLevel(logging.INFO)
        logger.addHandler(logging.StreamHandler())
        return logger

    def add_enricher(self, enricher_func):
        # Register custom enrichment functions
        self.enrichers.append(enricher_func)

    def _enrich_log_data(self, data):
        # Apply all enrichers
        enriched = data.copy()

        # System metrics
        enriched.update({
            'cpu_percent': psutil.cpu_percent(),
            'memory_percent': psutil.virtual_memory().percent,
            'hostname': platform.node(),
            'python_version': platform.python_version()
        })

        # Apply custom enrichers
        for enricher in self.enrichers:
            enriched.update(enricher())
        return enriched

    def info(self, message, **kwargs):
        # Log info with enriched fields
        self.logger.info(message, extra=self._enrich_log_data(kwargs))

    def error(self, message, exception=None, **kwargs):
        # Log errors with enriched fields and exception info
        data = self._enrich_log_data(kwargs)
        if exception:
            data['exception_type'] = type(exception).__name__
        self.logger.error(message, extra=data, exc_info=exception)

    def log_with_timing(self, func):
        # Decorator that times a function and logs the outcome
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            error = None
            try:
                return func(*args, **kwargs)
            except Exception as e:
                error = e
                raise
            finally:
                duration = (time.time() - start) * 1000
                log_data = {
                    'function': func.__name__,
                    'duration_ms': duration,
                    'success': error is None
                }
                if error:
                    self.error(
                        f"Function {func.__name__} failed",
                        exception=error,
                        **log_data
                    )
                else:
                    self.info(
                        f"Function {func.__name__} completed",
                        **log_data
                    )
        return wrapper

# Using the enriched logger
logger = EnrichedLogger('analytics-service')

# Add custom enrichers
def user_context_enricher():
    # Add user context to logs (these two helpers are app-specific placeholders)
    return {
        'user_tier': get_current_user_tier(),
        'feature_flags': get_active_feature_flags()
    }

logger.add_enricher(user_context_enricher)

# Use the timing decorator
@logger.log_with_timing
def process_analytics_batch(batch_id):
    # Process analytics data
    logger.info("Processing batch", batch_id=batch_id)
    # ... processing logic ...
    return "processed"
Advanced Topic 2: Log Pipeline with Filters
For production-ready logging:
# Production log pipeline
import re
from typing import Any, Dict

class LogPipeline:
    def __init__(self):
        self.filters = []
        self.transformers = []
        self.destinations = []

    def add_filter(self, filter_func):
        # Register a log filter
        self.filters.append(filter_func)

    def add_transformer(self, transformer_func):
        # Register a log transformer
        self.transformers.append(transformer_func)

    def add_destination(self, destination):
        # Register a log destination
        self.destinations.append(destination)

    def process_log(self, log_data: Dict[str, Any]):
        # Run a log entry through the pipeline
        # 1. Apply filters
        for filter_func in self.filters:
            if not filter_func(log_data):
                return  # Log filtered out

        # 2. Apply transformations
        transformed = log_data
        for transformer in self.transformers:
            transformed = transformer(transformed)

        # 3. Send to destinations
        for destination in self.destinations:
            destination.send(transformed)

# Security filter
def security_filter(log_data):
    # Redact sensitive data before it leaves the application
    sensitive_patterns = [
        r'password=\S+',
        r'api_key=\S+',
        r'token=\S+',
        r'\b\d{16}\b'  # Credit card numbers
    ]
    message = log_data.get('message', '')
    for pattern in sensitive_patterns:
        message = re.sub(pattern, '[REDACTED]', message)
    log_data['message'] = message
    return True

# Metrics transformer
def metrics_transformer(log_data):
    # Categorize performance by duration
    if 'duration_ms' in log_data:
        log_data['performance_category'] = (
            'fast' if log_data['duration_ms'] < 100
            else 'normal' if log_data['duration_ms'] < 1000
            else 'slow'
        )
    return log_data

# Set up the pipeline
pipeline = LogPipeline()
pipeline.add_filter(security_filter)
pipeline.add_transformer(metrics_transformer)
pipeline.add_destination(ElasticsearchDestination())
pipeline.add_destination(S3BackupDestination())
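Neither destination class above comes from a library; the pipeline only expects objects with a send(log_data) method. Here is a minimal sketch of what an ElasticsearchDestination might look like, assuming the same local cluster as earlier (an S3 backup destination would follow the same interface, e.g. with boto3):

# Minimal destination sketch: anything exposing send(log_data) fits the pipeline
from elasticsearch import Elasticsearch

class ElasticsearchDestination:
    def __init__(self, hosts=None, index='pipeline-logs'):
        self.es = Elasticsearch(hosts or ['http://localhost:9200'])
        self.index = index

    def send(self, log_data):
        # Ship the processed entry to Elasticsearch
        self.es.index(index=self.index, body=log_data)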
Common Pitfalls and Solutions
Pitfall 1: Logging Sensitive Data
# Wrong way - logging passwords!
logger.info("User login attempt",
            username=username,
            password=password)  # Never log passwords!

# Correct way - log safely!
logger.info("User login attempt",
            username=username,
            success=True)  # Log the result, not the credentials!
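Beyond being careful at call sites, you can enforce redaction centrally with a standard-library logging.Filter attached to your logger. A minimal sketch:

# Redact sensitive key=value pairs before any handler sees the record
import logging
import re

SENSITIVE = re.compile(r'(password|api_key|token)=\S+')

class RedactFilter(logging.Filter):
    def filter(self, record):
        # Rewrite the message in place; returning True keeps the record
        record.msg = SENSITIVE.sub(r'\1=[REDACTED]', str(record.msg))
        return True

logger = logging.getLogger('my_app')
logger.addFilter(RedactFilter())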
Pitfall 2: Blocking on Log Writes
# Dangerous - blocking I/O!
import requests

class BlockingLogger:
    def log(self, message):
        # This blocks the caller until the HTTP request finishes!
        response = requests.post('http://logging-server',
                                 json={'message': message})

# Safe - async logging!
import asyncio
from concurrent.futures import ThreadPoolExecutor

class AsyncLogger:
    def __init__(self):
        self.executor = ThreadPoolExecutor(max_workers=5)
        self.queue = asyncio.Queue()

    async def log(self, message):
        # Non-blocking: just enqueue the message
        await self.queue.put(message)

    async def _process_logs(self):
        # Background task: drain the queue and ship logs off the event loop
        while True:
            message = await self.queue.get()
            await asyncio.get_event_loop().run_in_executor(
                self.executor,
                self._send_log,  # your blocking sender, run in a worker thread
                message
            )
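If you would rather not manage an event loop, the standard library already has a non-blocking pattern built in: a QueueHandler enqueues records and a QueueListener forwards them to slow handlers in a background thread. A minimal sketch (the StreamHandler stands in for something slow like the ElasticsearchHandler from earlier):

# Non-blocking logging with the stdlib queue-based handlers
import logging
import queue
from logging.handlers import QueueHandler, QueueListener

log_queue = queue.Queue(-1)  # unbounded queue

# The slow handler runs in the listener's background thread, not in the caller
slow_handler = logging.StreamHandler()

listener = QueueListener(log_queue, slow_handler, respect_handler_level=True)
listener.start()

logger = logging.getLogger('async_app')
logger.setLevel(logging.INFO)
logger.addHandler(QueueHandler(log_queue))  # callers only pay for a queue put

logger.info("This call returns immediately")

# On shutdown, flush pending records and stop the background thread
listener.stop()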
Best Practices
- Structure Your Logs: Use consistent field names across services
- Log at the Right Level: INFO for business events, ERROR for failures
- Never Log Sensitive Data: Passwords, tokens, and PII must be filtered out
- Use Correlation IDs: Track requests across microservices
- Keep Logs Actionable: Include the context needed for debugging
- Set Up Retention Policies: Don't keep logs forever
- Use Bulk Operations: Send logs in batches for performance (see the sketch below)
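For the bulk point, the Elasticsearch client ships a helpers.bulk function. Here is a minimal sketch of batching buffered log entries into a single request instead of indexing them one at a time (the index name and fields are illustrative):

# Batch log entries into a single bulk request (far fewer network round trips)
from datetime import datetime

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(['http://localhost:9200'])

buffered_logs = [
    {'level': 'INFO', 'message': f'event {i}', 'timestamp': datetime.utcnow().isoformat()}
    for i in range(500)
]

actions = (
    {'_index': 'python-logs-bulk', '_source': entry}  # one action per buffered entry
    for entry in buffered_logs
)

success, errors = helpers.bulk(es, actions)
print(f"Indexed {success} entries, {len(errors)} errors")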
Hands-On Exercise
Challenge: Build a Complete Logging System
Create a production-ready logging system with these features:
Requirements:
- Send logs to both Elasticsearch and a file backup
- Add request correlation across services
- Include user context in all logs
- Implement log rotation and retention
- Create custom Kibana dashboards
Bonus Points:
- Add anomaly detection for error spikes
- Implement log sampling for high-traffic endpoints
- Create alerts for critical errors
Solution
Click to see solution
# Complete ELK logging solution!
import logging
import os
import threading
from contextlib import contextmanager
from datetime import datetime, timedelta
from logging.handlers import RotatingFileHandler

import logstash
from elasticsearch import Elasticsearch

class ProductionLogger:
    def __init__(self, service_name, environment='production'):
        self.service_name = service_name
        self.environment = environment
        self.es = Elasticsearch(['http://localhost:9200'])
        self.logger = self._setup_logger()
        self._local = threading.local()

    def _setup_logger(self):
        # Create the main logger
        logger = logging.getLogger(self.service_name)
        logger.setLevel(logging.INFO)

        # 1. Elasticsearch handler (reuses the ElasticsearchHandler class from the first example)
        es_handler = ElasticsearchHandler(self.es, self.service_name)
        es_handler.setLevel(logging.INFO)

        # 2. File backup handler with rotation
        os.makedirs('logs', exist_ok=True)
        file_handler = RotatingFileHandler(
            f'logs/{self.service_name}.log',
            maxBytes=100 * 1024 * 1024,  # 100 MB
            backupCount=10
        )
        file_handler.setLevel(logging.WARNING)

        # 3. Logstash handler
        logstash_handler = logstash.TCPLogstashHandler(
            host='localhost',
            port=5959
        )

        # Attach all handlers
        logger.addHandler(es_handler)
        logger.addHandler(file_handler)
        logger.addHandler(logstash_handler)
        return logger

    @contextmanager
    def correlation_context(self, correlation_id):
        # Set the correlation ID for the current request
        old_id = getattr(self._local, 'correlation_id', None)
        self._local.correlation_id = correlation_id
        try:
            yield
        finally:
            self._local.correlation_id = old_id

    def _get_context(self):
        # Collect the current logging context
        return {
            'service': self.service_name,
            'environment': self.environment,
            'correlation_id': getattr(self._local, 'correlation_id', None),
            'timestamp': datetime.utcnow().isoformat(),
            'host': os.environ.get('HOSTNAME', 'unknown')
        }

    def info(self, message, **kwargs):
        # Log info with context
        context = {**self._get_context(), **kwargs}
        self.logger.info(message, extra={'context': context})

    def error(self, message, exception=None, **kwargs):
        # Log an error and alert if it is critical
        context = {**self._get_context(), **kwargs}
        if exception:
            context['exception'] = {
                'type': type(exception).__name__,
                'message': str(exception)
            }
        self.logger.error(message, extra={'context': context})

        # Send an alert for critical errors
        if context.get('critical', False):
            self._send_alert(message, context)

    def _send_alert(self, message, context):
        # Send alerts (plug in your own alerting service here)
        alert_data = {
            'service': self.service_name,
            'message': message,
            'context': context,
            'alert_time': datetime.utcnow().isoformat()
        }
        # Send alert_data to your alerting service

    def create_dashboard(self):
        # Build a Kibana dashboard config (import it via Kibana's saved objects API)
        dashboard_config = {
            "version": "7.10.0",
            "objects": [
                {
                    "attributes": {
                        "title": f"{self.service_name} Dashboard",
                        "type": "dashboard",
                        "description": f"Monitoring dashboard for {self.service_name}"
                    },
                    "references": []
                }
            ]
        }
        return dashboard_config

# Example usage
logger = ProductionLogger('payment-service')

# Use the correlation context
with logger.correlation_context('req-123-456'):
    logger.info("Processing payment",
                user_id='user_789',
                amount=99.99,
                currency='USD')

    try:
        # Process the payment...
        logger.info("Payment successful",
                    transaction_id='txn_abc123')
    except Exception as e:
        logger.error("Payment failed",
                     exception=e,
                     critical=True)

# Set up log retention
def cleanup_old_logs():
    # Delete logs older than 30 days
    cutoff_date = datetime.utcnow() - timedelta(days=30)
    logger.es.delete_by_query(
        index=f"{logger.service_name}-*",
        body={
            "query": {
                "range": {
                    "timestamp": {
                        "lt": cutoff_date.isoformat()
                    }
                }
            }
        }
    )
Key Takeaways
You've learned so much! Here's what you can now do:
- Set up the ELK Stack for Python applications
- Send structured logs to Elasticsearch and Logstash
- Create powerful dashboards in Kibana
- Implement correlation IDs for distributed tracing
- Build production-ready logging pipelines
Remember: Good logging is like having a time machine for debugging - it lets you see exactly what happened!
Next Steps
Congratulations! You've mastered ELK Stack logging with Python!
Here's what to do next:
- Set up a local ELK Stack using Docker Compose
- Implement logging in your current project
- Move on to our next tutorial: [Monitoring: Prometheus and Grafana]
- Create custom Kibana dashboards for your apps!
Remember: Every debugging session becomes easier with good logs. Keep logging, keep learning, and most importantly, have fun!
Happy coding!