📘 Database Sharding: Horizontal Scaling

🎯 Introduction

Welcome to this exciting tutorial on database sharding! 🎉 Ever wondered how massive platforms like Instagram, Twitter, or Netflix handle billions of records? One of the key techniques behind that kind of scale is database sharding - a powerful approach to horizontal scaling!

You’ll discover how sharding can transform your database architecture from a single overwhelmed server to a distributed powerhouse. Whether you’re building social networks 🌐, e-commerce platforms 🛒, or analytics systems 📊, understanding sharding is essential for scaling your applications to millions of users.

By the end of this tutorial, you’ll feel confident implementing sharding strategies in your own projects! Let’s dive in! 🏊‍♂️

📚 Understanding Database Sharding

🤔 What is Database Sharding?

Database sharding is like splitting a huge library into multiple smaller libraries 📚. Instead of having one massive building that gets crowded, you create several specialized branches - each handling a portion of the books!

In practical terms, sharding means distributing your data across multiple database servers (shards) based on a sharding key. This means you can:

✨ Scale horizontally by adding more servers
🚀 Improve query performance by reducing data per server
🛡️ Increase availability with distributed architecture

💡 Why Use Database Sharding?

Here’s why developers love sharding:

Horizontal Scalability 🚀: Add more shards as your data grows
Better Performance ⚡: Queries run faster on smaller datasets
Fault Isolation 🛡️: A failed shard only affects the data it holds, not the whole system
Cost Efficiency 💰: Use commodity hardware instead of one ever-larger server

Real-world example: Imagine building a social media platform 📱. With sharding, you can distribute users across multiple databases based on their location or user ID, which helps keep response times low for users around the world!

🔧 Basic Syntax and Usage

📝 Simple Example

Let’s start with a friendly example of a basic sharding implementation:

# 👋 Hello, Sharding!
import hashlib
from typing import Dict, List, Any

class DatabaseShard:
    """🎨 Represents a single database shard"""
    def __init__(self, shard_id: str):
        self.shard_id = shard_id
        self.data: Dict[str, Any] = {}  # 📦 Simple in-memory storage
    
    def insert(self, key: str, value: Any) -> None:
        """✨ Insert data into this shard"""
        self.data[key] = value
        print(f"💾 Inserted {key} into shard {self.shard_id}")
    
    def get(self, key: str) -> Any:
        """🔍 Retrieve data from this shard"""
        return self.data.get(key)

class ShardManager:
    """🎯 Manages multiple database shards"""
    def __init__(self, num_shards: int):
        self.shards = [
            DatabaseShard(f"shard_{i}") 
            for i in range(num_shards)
        ]
        print(f"🚀 Created {num_shards} shards!")
    
    def get_shard(self, key: str) -> DatabaseShard:
        """🎲 Determine which shard holds this key"""
        # Hash the key, then take it modulo the shard count 🔄
        # (simple hash-based sharding, not consistent hashing)
        hash_value = int(hashlib.md5(key.encode()).hexdigest(), 16)
        shard_index = hash_value % len(self.shards)
        return self.shards[shard_index]
    
    def insert(self, key: str, value: Any) -> None:
        """➕ Insert data into appropriate shard"""
        shard = self.get_shard(key)
        shard.insert(key, value)
    
    def get(self, key: str) -> Any:
        """🔍 Get data from appropriate shard"""
        shard = self.get_shard(key)
        return shard.get(key)

# 🎮 Let's use it!
manager = ShardManager(3)
manager.insert("user_123", {"name": "Alice", "emoji": "👩‍💻"})
manager.insert("user_456", {"name": "Bob", "emoji": "👨‍💼"})

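💡 Explanation: Notice how we hash each key and take the result modulo the number of shards to pick a shard. A good hash function spreads keys roughly evenly across shards! Keep in mind that this is simple hash-based sharding, not consistent hashing: changing the number of shards remaps most keys. We build a real consistent hash ring in the Advanced Concepts section.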
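One caveat about the hash-and-modulo routing in `get_shard` is worth seeing in numbers. Here's a minimal sketch that counts how many keys change shards when a cluster grows from 3 to 4 shards. The helper name `shard_for` and the sample keys are made up for illustration; the hashing mirrors the example above.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash and modulo."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Hypothetical sample keys
keys = [f"user_{i}" for i in range(10_000)]

# Count keys whose shard index changes when growing from 3 to 4 shards
moved = sum(1 for k in keys if shard_for(k, 3) != shard_for(k, 4))
moved_fraction = moved / len(keys)
print(f"{moved_fraction:.0%} of keys change shards")
```

With a uniform hash, a key stays put only when its hash gives the same remainder modulo 3 and modulo 4, which happens for about a quarter of keys. The rest would have to be migrated, which is the problem consistent hashing is designed to reduce.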

🎯 Common Patterns

Here are sharding patterns you’ll use in production:

# 🏗️ Pattern 1: Range-based sharding
class RangeShardManager:
    """📊 Shards data based on ranges"""
    def __init__(self):
        self.shards = {
            "A-H": DatabaseShard("shard_1"),  # 🅰️ Names A-H
            "I-P": DatabaseShard("shard_2"),  # 🅱️ Names I-P
            "Q-Z": DatabaseShard("shard_3")   # 🅾️ Names Q-Z
        }
    
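    def get_shard_by_name(self, name: str) -> DatabaseShard:
        first_letter = name[:1].upper()
        if "A" <= first_letter <= "H":
            return self.shards["A-H"]
        elif "I" <= first_letter <= "P":
            return self.shards["I-P"]
        elif "Q" <= first_letter <= "Z":
            return self.shards["Q-Z"]
        # Empty names, digits, and non-ASCII letters need an explicit policy
        raise ValueError(f"No range shard configured for name {name!r}")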

# 🎨 Pattern 2: Geographic sharding
class GeoShardManager:
    """🌍 Shards data by geographic location"""
    def __init__(self):
        self.region_shards = {
            "US": DatabaseShard("us_shard"),    # 🇺🇸
            "EU": DatabaseShard("eu_shard"),    # 🇪🇺
            "ASIA": DatabaseShard("asia_shard") # 🌏
        }
    
    def get_shard_by_region(self, region: str) -> DatabaseShard:
        return self.region_shards.get(region, self.region_shards["US"])

# 🔄 Pattern 3: Time-based sharding
from datetime import datetime

class TimeShardManager:
    """📅 Shards data by time periods"""
    def __init__(self):
        self.year_shards: Dict[int, DatabaseShard] = {}
    
    def get_shard_by_date(self, date: datetime) -> DatabaseShard:
        year = date.year
        if year not in self.year_shards:
            self.year_shards[year] = DatabaseShard(f"shard_{year}")
        return self.year_shards[year]
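The range pattern can also be written as a lookup table, so that adding or splitting a range means editing data instead of an if/elif chain. This is a minimal sketch, assuming the same three letter ranges as Pattern 1; `RANGE_UPPER_BOUNDS`, `RANGE_SHARDS`, and `shard_for_name` are hypothetical names, and shards are represented by plain ID strings to keep the snippet self-contained.

```python
import bisect

# Inclusive upper bound of each range, kept in sorted order (hypothetical config)
RANGE_UPPER_BOUNDS = ["H", "P", "Z"]
RANGE_SHARDS = ["shard_1", "shard_2", "shard_3"]

def shard_for_name(name: str) -> str:
    """Return the shard ID whose letter range contains the name's first letter."""
    first = name[:1].upper()
    if not ("A" <= first <= "Z"):
        # Empty names, digits, and non-ASCII letters fall outside every range
        raise ValueError(f"No range configured for name {name!r}")
    # bisect_left finds the first upper bound that is >= the letter
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, first)
    return RANGE_SHARDS[index]

print(shard_for_name("alice"), shard_for_name("ian"), shard_for_name("Zoe"))
```

Splitting a crowded range then only requires inserting a new bound and shard ID at the same position in both lists.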

💡 Practical Examples

🛒 Example 1: E-commerce Order System

Let’s build a sharded order management system:

# 🛍️ E-commerce order sharding system
import hashlib
from datetime import datetime
from typing import Any, Dict, List

class Order:
    """📦 Represents an order"""
    def __init__(self, order_id: str, user_id: str, items: List[Dict], total: float):
        self.order_id = order_id
        self.user_id = user_id
        self.items = items
        self.total = total
        self.created_at = datetime.now()
        self.status = "pending"  # 📋 Order status
        self.emoji = "🛒"

class OrderShardSystem:
    """🏪 Sharded order management system"""
    def __init__(self, num_shards: int = 4):
        self.shards = [DatabaseShard(f"order_shard_{i}") for i in range(num_shards)]
        self.user_shard_map = {}  # 🗺️ Cache user-to-shard mapping
        print(f"🚀 Order system initialized with {num_shards} shards!")
    
    def _get_user_shard(self, user_id: str) -> DatabaseShard:
        """🎯 Get shard for user (sticky sharding)"""
        if user_id not in self.user_shard_map:
            # Assign user to shard based on hash
            hash_value = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
            shard_index = hash_value % len(self.shards)
            self.user_shard_map[user_id] = shard_index
        
        return self.shards[self.user_shard_map[user_id]]
    
    def create_order(self, order: Order) -> None:
        """🛍️ Create new order in appropriate shard"""
        shard = self._get_user_shard(order.user_id)
        order_data = {
            "order_id": order.order_id,
            "user_id": order.user_id,
            "items": order.items,
            "total": order.total,
            "created_at": order.created_at.isoformat(),
            "status": order.status
        }
        shard.insert(order.order_id, order_data)
        print(f"✅ Order {order.order_id} created for user {order.user_id}!")
    
    def get_user_orders(self, user_id: str) -> List[Dict]:
        """📋 Get all orders for a user (efficient!)"""
        shard = self._get_user_shard(user_id)
        user_orders = []
        
        # All user orders are in the same shard! 🎯
        for key, order in shard.data.items():
            if order.get("user_id") == user_id:
                user_orders.append(order)
        
        return sorted(user_orders, key=lambda x: x["created_at"], reverse=True)
    
    def update_order_status(self, order_id: str, user_id: str, status: str) -> None:
        """🔄 Update order status"""
        shard = self._get_user_shard(user_id)
        order = shard.get(order_id)
        if order:
            order["status"] = status
            shard.insert(order_id, order)
            print(f"📦 Order {order_id} updated to {status}!")

# 🎮 Let's use it!
order_system = OrderShardSystem(num_shards=3)

# Create some orders
order1 = Order("ORD-001", "USER-123", 
              [{"item": "Python Book", "price": 29.99, "emoji": "📘"}], 
              29.99)
order2 = Order("ORD-002", "USER-123", 
              [{"item": "Coffee Mug", "price": 12.99, "emoji": "☕"}], 
              12.99)
order3 = Order("ORD-003", "USER-456", 
              [{"item": "Keyboard", "price": 89.99, "emoji": "⌨️"}], 
              89.99)

order_system.create_order(order1)
order_system.create_order(order2)
order_system.create_order(order3)

# Get user orders (fast because they're all in one shard!)
user_orders = order_system.get_user_orders("USER-123")
print(f"\n📋 Found {len(user_orders)} orders for USER-123")

🎯 Try it yourself: Add a method to calculate total revenue per shard and implement cross-shard analytics!
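If you'd like a starting point for the revenue exercise, here's one possible sketch. It assumes each shard can be read as a list of order dictionaries with a `total` field, as in the order records above; the function name `revenue_per_shard` and the sample data are hypothetical.

```python
from typing import Dict, List

def revenue_per_shard(shards: Dict[str, List[dict]]) -> Dict[str, float]:
    """Sum the 'total' field of every order, shard by shard."""
    return {
        shard_id: round(sum(order["total"] for order in orders), 2)
        for shard_id, orders in shards.items()
    }

# Hypothetical sample data: shard ID -> order records stored on that shard
sample_shards = {
    "order_shard_0": [
        {"order_id": "ORD-001", "total": 29.99},
        {"order_id": "ORD-002", "total": 12.99},
    ],
    "order_shard_1": [{"order_id": "ORD-003", "total": 89.99}],
    "order_shard_2": [],
}

per_shard = revenue_per_shard(sample_shards)
# Cross-shard total: combine the per-shard partial sums
grand_total = round(sum(per_shard.values()), 2)
print(per_shard, grand_total)
```

Each shard computes its own partial sum, and only those small results are combined. For real money amounts, integer cents or `decimal.Decimal` would avoid the float rounding handled here with `round`.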

🎮 Example 2: Gaming Leaderboard System

Let’s make a sharded leaderboard for a multiplayer game:

# 🏆 Sharded gaming leaderboard system
import heapq
from collections import defaultdict

class Player:
    """🎮 Represents a game player"""
    def __init__(self, player_id: str, username: str, region: str):
        self.player_id = player_id
        self.username = username
        self.region = region
        self.score = 0
        self.level = 1
        self.achievements = []
        self.emoji = "🎮"

class LeaderboardShardSystem:
    """🏅 Sharded leaderboard for global gaming"""
    def __init__(self):
        # Geographic sharding for low latency! 🌍
        self.region_shards = {
            "NA": DatabaseShard("north_america"),    # 🌎
            "EU": DatabaseShard("europe"),           # 🌍
            "ASIA": DatabaseShard("asia"),           # 🌏
            "SA": DatabaseShard("south_america")     # 🌎
        }
        
        # Score buckets for efficient ranking 📊
        self.score_buckets = defaultdict(list)
        print("🚀 Global leaderboard system initialized!")
    
    def add_player(self, player: Player) -> None:
        """➕ Add new player to regional shard"""
        shard = self.region_shards.get(player.region, self.region_shards["NA"])
        player_data = {
            "player_id": player.player_id,
            "username": player.username,
            "score": player.score,
            "level": player.level,
            "achievements": player.achievements,
            "region": player.region
        }
        shard.insert(player.player_id, player_data)
        print(f"🎯 Player {player.username} joined {player.region} region!")
    
    def update_score(self, player_id: str, region: str, points: int) -> None:
        """🎯 Update player score"""
        # Fall back to NA for unknown regions, matching add_player
        shard = self.region_shards.get(region, self.region_shards["NA"])
        player_data = shard.get(player_id)
        
        if player_data:
            old_score = player_data["score"]
            player_data["score"] += points
            
            # Level up every 1000 points! 🎊
            new_level = (player_data["score"] // 1000) + 1
            if new_level > player_data["level"]:
                player_data["level"] = new_level
                player_data["achievements"].append(f"🏆 Level {new_level} Master")
                print(f"🎉 {player_data['username']} leveled up to {new_level}!")
            
            shard.insert(player_id, player_data)
            self._update_score_bucket(player_id, old_score, player_data["score"])
            print(f"✨ {player_data['username']} earned {points} points!")
    
    def _update_score_bucket(self, player_id: str, old_score: int, new_score: int):
        """📊 Update score buckets for efficient ranking"""
        old_bucket = old_score // 1000
        new_bucket = new_score // 1000
        
        if old_bucket != new_bucket:
            if player_id in self.score_buckets[old_bucket]:
                self.score_buckets[old_bucket].remove(player_id)
            self.score_buckets[new_bucket].append(player_id)
    
    def get_regional_leaderboard(self, region: str, top_n: int = 10) -> List[Dict]:
        """🏅 Get top players in a region"""
        shard = self.region_shards.get(region)
        if not shard:
            return []
        
        # heapq.nlargest returns the top N without sorting the whole shard 🎯
        # The key function also keeps tied scores from comparing the dicts themselves
        return heapq.nlargest(
            top_n, shard.data.values(), key=lambda p: p["score"]
        )
    
    def get_global_leaderboard(self, top_n: int = 10) -> List[Dict]:
        """🌍 Get global top players (cross-shard query)"""
        all_players = []
        
        # Collect top players from each shard 🌐
        for region, shard in self.region_shards.items():
            regional_top = self.get_regional_leaderboard(region, top_n)
            all_players.extend(regional_top)
        
        # Sort globally and return top N
        all_players.sort(key=lambda x: x["score"], reverse=True)
        return all_players[:top_n]

# 🎮 Let's play!
leaderboard = LeaderboardShardSystem()

# Add players from different regions
players = [
    Player("P001", "DragonSlayer", "NA"),
    Player("P002", "NinjaWarrior", "ASIA"),
    Player("P003", "VikingKing", "EU"),
    Player("P004", "AztecEagle", "SA")
]

for player in players:
    leaderboard.add_player(player)

# Simulate gameplay
leaderboard.update_score("P001", "NA", 1500)
leaderboard.update_score("P002", "ASIA", 2000)
leaderboard.update_score("P003", "EU", 1800)
leaderboard.update_score("P004", "SA", 900)

# Get leaderboards
print("\n🏅 North America Leaderboard:")
na_leaders = leaderboard.get_regional_leaderboard("NA", 5)
for i, player in enumerate(na_leaders, 1):
    print(f"  {i}. {player['username']} - {player['score']} points")

print("\n🌍 Global Leaderboard:")
global_leaders = leaderboard.get_global_leaderboard(5)
for i, player in enumerate(global_leaders, 1):
    print(f"  {i}. {player['username']} - {player['score']} points")
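The global leaderboard works because any player in the global top N must also be in the top N of their own region, so each shard only needs to return N rows. Since each regional list is already sorted, the final step can also be done as a lazy merge. Here's a minimal sketch, assuming each region returns its players sorted by score in descending order; `global_top_n` and the sample players are hypothetical.

```python
import heapq
from itertools import islice
from typing import Dict, List

def global_top_n(regional_tops: List[List[Dict]], top_n: int) -> List[Dict]:
    """Merge per-region lists (each sorted by score, descending) into a global top N."""
    merged = heapq.merge(*regional_tops, key=lambda p: p["score"], reverse=True)
    # islice stops the merge as soon as top_n players have been produced
    return list(islice(merged, top_n))

# Hypothetical per-region results, each already sorted by score
na = [{"username": "DragonSlayer", "score": 1500}]
asia = [{"username": "NinjaWarrior", "score": 2000}, {"username": "Rookie", "score": 100}]
eu = [{"username": "VikingKing", "score": 1800}]

top_two = global_top_n([na, asia, eu], 2)
print([p["username"] for p in top_two])
```

The merge stops after N items, so it never touches the tail of any regional list.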

🚀 Advanced Concepts

🧙‍♂️ Advanced Topic 1: Consistent Hashing with Virtual Nodes

When you’re ready to level up, implement advanced consistent hashing:

# 🎯 Advanced consistent hashing with virtual nodes
import bisect
from collections import defaultdict
from hashlib import md5
from typing import List

class ConsistentHashRing:
    """💍 Consistent hash ring for better distribution"""
    def __init__(self, nodes: List[str], virtual_nodes: int = 150):
        self.nodes = nodes
        self.virtual_nodes = virtual_nodes
        self.ring = {}
        self.sorted_keys = []
        self._build_ring()
        print(f"✨ Built hash ring with {len(nodes)} nodes and {virtual_nodes} virtual nodes each!")
    
    def _hash(self, key: str) -> int:
        """🔐 Generate hash value"""
        return int(md5(key.encode()).hexdigest(), 16)
    
    def _build_ring(self):
        """🏗️ Build the hash ring with virtual nodes"""
        for node in self.nodes:
            for i in range(self.virtual_nodes):
                virtual_key = f"{node}:{i}"
                hash_value = self._hash(virtual_key)
                self.ring[hash_value] = node
                bisect.insort(self.sorted_keys, hash_value)
    
    def get_node(self, key: str) -> str:
        """🎯 Find node responsible for key"""
        if not self.ring:
            return None
        
        hash_value = self._hash(key)
        index = bisect.bisect_right(self.sorted_keys, hash_value)
        
        # Wrap around to first node if needed 🔄
        if index == len(self.sorted_keys):
            index = 0
        
        return self.ring[self.sorted_keys[index]]
    
    def add_node(self, node: str):
        """➕ Add new node to ring (for scaling!)"""
        self.nodes.append(node)
        for i in range(self.virtual_nodes):
            virtual_key = f"{node}:{i}"
            hash_value = self._hash(virtual_key)
            self.ring[hash_value] = node
            bisect.insort(self.sorted_keys, hash_value)
        print(f"🚀 Added node {node} to the ring!")
    
    def remove_node(self, node: str):
        """➖ Remove node from ring"""
        self.nodes.remove(node)
        for i in range(self.virtual_nodes):
            virtual_key = f"{node}:{i}"
            hash_value = self._hash(virtual_key)
            del self.ring[hash_value]
            self.sorted_keys.remove(hash_value)
        print(f"👋 Removed node {node} from the ring!")

# 🪄 Using the consistent hash ring
shard_nodes = ["shard_1", "shard_2", "shard_3"]
hash_ring = ConsistentHashRing(shard_nodes)

# Test distribution
test_keys = [f"user_{i}" for i in range(100)]
distribution = defaultdict(int)

for key in test_keys:
    node = hash_ring.get_node(key)
    distribution[node] += 1

print("\n📊 Key distribution:")
for node, count in distribution.items():
    print(f"  {node}: {count} keys ({count/len(test_keys)*100:.1f}%)")
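To see why the ring helps with scaling, it's useful to measure how many keys move when a shard is added. The sketch below is a condensed, standalone version of the ring idea above; `build_ring` and `lookup` are illustrative helper names, not part of the class.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def build_ring(nodes: List[str], vnodes: int = 150) -> Tuple[List[int], Dict[int, str]]:
    """Return (sorted ring positions, position -> node) using virtual nodes."""
    ring = {_hash(f"{node}:{i}"): node for node in nodes for i in range(vnodes)}
    return sorted(ring), ring

def lookup(key: str, positions: List[int], ring: Dict[int, str]) -> str:
    """Find the first ring position clockwise from the key's hash."""
    index = bisect.bisect_right(positions, _hash(key)) % len(positions)
    return ring[positions[index]]

keys = [f"user_{i}" for i in range(10_000)]
before = build_ring(["shard_1", "shard_2", "shard_3"])
after = build_ring(["shard_1", "shard_2", "shard_3", "shard_4"])

# Keys whose owner changes after shard_4 joins the ring
moved = [k for k in keys if lookup(k, *before) != lookup(k, *after)]
moved_fraction = len(moved) / len(keys)
# Adding a node only inserts new positions, so moved keys go to the new node
all_to_new = all(lookup(k, *after) == "shard_4" for k in moved)
print(f"{moved_fraction:.0%} of keys moved, all to shard_4: {all_to_new}")
```

Roughly a quarter of the keys should move, which is the share the new shard takes over, and none are shuffled between the existing shards.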

🏗️ Advanced Topic 2: Cross-Shard Queries and Aggregation

For complex queries across shards:

# 🚀 Cross-shard query engine
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List

class ShardQueryEngine:
    """🔍 Execute queries across multiple shards"""
    def __init__(self, shards: List[DatabaseShard]):
        self.shards = shards
        self.executor = ThreadPoolExecutor(max_workers=len(shards))
        print(f"🚀 Query engine initialized for {len(shards)} shards!")
    
    def map_reduce(self, 
                   map_func: Callable[[DatabaseShard], Any],
                   reduce_func: Callable[[List[Any]], Any]) -> Any:
        """🗺️ Map-reduce pattern for cross-shard queries"""
        # Map phase - parallel execution! ⚡
        futures = []
        for shard in self.shards:
            future = self.executor.submit(map_func, shard)
            futures.append(future)
        
        # Collect results
        results = []
        for future in futures:
            results.append(future.result())
        
        # Reduce phase 📊
        return reduce_func(results)
    
    def aggregate_sum(self, field: str) -> float:
        """➕ Sum a field across all shards"""
        def map_sum(shard: DatabaseShard) -> float:
            total = 0
            for key, value in shard.data.items():
                if isinstance(value, dict) and field in value:
                    total += value[field]
            return total
        
        def reduce_sum(totals: List[float]) -> float:
            return sum(totals)
        
        return self.map_reduce(map_sum, reduce_sum)
    
    def search_all_shards(self, condition: Callable[[Any], bool]) -> List[Any]:
        """🔍 Search across all shards with condition"""
        def map_search(shard: DatabaseShard) -> List[Any]:
            results = []
            for key, value in shard.data.items():
                if condition(value):
                    results.append(value)
            return results
        
        def reduce_search(shard_results: List[List[Any]]) -> List[Any]:
            all_results = []
            for results in shard_results:
                all_results.extend(results)
            return all_results
        
        return self.map_reduce(map_search, reduce_search)

# 🎮 Example usage
# Create sample shards with data
shards = [DatabaseShard(f"shard_{i}") for i in range(3)]

# Add sample data
for i in range(30):
    shard_idx = i % 3
    shards[shard_idx].insert(f"order_{i}", {
        "order_id": f"order_{i}",
        "amount": 10 + (i * 5),
        "status": "completed" if i % 2 == 0 else "pending"
    })

# Create query engine
query_engine = ShardQueryEngine(shards)

# Calculate total revenue across all shards! 💰
total_revenue = query_engine.aggregate_sum("amount")
print(f"\n💰 Total revenue across all shards: ${total_revenue}")

# Find all completed orders
completed_orders = query_engine.search_all_shards(
    lambda order: isinstance(order, dict) and order.get("status") == "completed"
)
print(f"✅ Found {len(completed_orders)} completed orders across all shards")
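Sums combine cleanly across shards, but averages need a little care: averaging the per-shard averages gives the wrong answer when shards hold different numbers of rows. Here's a minimal sketch of the usual fix, where each shard returns a (sum, count) pair. The function names and sample rows are hypothetical, and shards are modeled as plain lists of dictionaries.

```python
from typing import Dict, List, Tuple

def partial_stats(shard_rows: List[Dict], field: str) -> Tuple[float, int]:
    """Map step: return (sum, count) for one shard."""
    values = [row[field] for row in shard_rows if field in row]
    return sum(values), len(values)

def global_average(shards: List[List[Dict]], field: str) -> float:
    """Reduce step: combine the partial sums and counts, then divide once."""
    partials = [partial_stats(rows, field) for rows in shards]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else 0.0

# Hypothetical data: shard 0 holds two orders, shard 1 holds one
sample = [[{"amount": 10}, {"amount": 20}], [{"amount": 60}]]

correct = global_average(sample, "amount")   # 90 / 3
naive = (15 + 60) / 2                        # average of per-shard averages
print(correct, naive)
```

The same shape works for any aggregate that can be rebuilt from small per-shard summaries, such as min, max, or counts.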

⚠️ Common Pitfalls and Solutions

😱 Pitfall 1: Hot Shards

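# ❌ Wrong way - creating hot shards!
class BadSharding:
    def get_shard(self, user_id: str) -> int:
        # Celebrity users all end up on shard 0! 💥
        if user_id in ["celebrity1", "celebrity2", "celebrity3"]:
            return 0
        # Built-in hash() on str is salted per process, so routing
        # changes between runs unless PYTHONHASHSEED is fixed
        return hash(user_id) % 3

# ✅ Better way - stable hash plus virtual shards!
import hashlib

class GoodSharding:
    def __init__(self):
        self.virtual_shards = 100  # Many more virtual than physical shards
        self.physical_shards = 3
        # Virtual-to-physical table; rebalancing means editing entries here
        self.assignment = {
            v: v % self.physical_shards for v in range(self.virtual_shards)
        }

    def get_shard(self, user_id: str) -> int:
        # md5 gives the same value in every process, unlike hash() on str 🎯
        digest = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
        return self.assignment[digest % self.virtual_shards]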

🤯 Pitfall 2: Cross-Shard Joins

# ❌ Dangerous - inefficient cross-shard joins!
def get_user_with_orders_bad(user_id: str, user_shard: DatabaseShard, order_shard: DatabaseShard):
    user = user_shard.get(user_id)  # User in one shard
    orders = []
    # Scanning entire order shard! 💥
    for key, order in order_shard.data.items():
        if order.get("user_id") == user_id:
            orders.append(order)
    return {"user": user, "orders": orders}

# ✅ Better - keep related data together!
class UserOrderShard:
    """🎯 Keep user and their orders in same shard"""
    def __init__(self):
        self.data = {}
    
    def add_user_with_orders(self, user_id: str, user_data: dict, orders: list):
        self.data[user_id] = {
            "user": user_data,
            "orders": orders  # All user orders in same shard! ✨
        }
    
    def get_user_with_orders(self, user_id: str):
        return self.data.get(user_id, {})

🛠️ Best Practices

🎯 Choose the Right Sharding Key: Use keys that distribute data evenly
📊 Monitor Shard Balance: Track data distribution and rebalance when needed
🛡️ Plan for Resharding: Design your system to handle shard count changes
🔄 Use Consistent Hashing: Minimize data movement when adding/removing shards
💾 Keep Related Data Together: Avoid cross-shard joins by smart data placement
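For the "Monitor Shard Balance" practice, here's a minimal sketch of one possible skew metric: the busiest shard's key count divided by the ideal even share. The names `shard_for` and `imbalance_ratio` are hypothetical, and the routing assumes the same md5-and-modulo scheme used earlier in this tutorial.

```python
import hashlib
from collections import Counter
from typing import Iterable

def shard_for(key: str, num_shards: int) -> int:
    """Stable hash-and-modulo routing."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

def imbalance_ratio(keys: Iterable[str], num_shards: int) -> float:
    """Largest shard's key count divided by the ideal even share (1.0 = balanced)."""
    keys = list(keys)
    counts = Counter(shard_for(k, num_shards) for k in keys)
    ideal = len(keys) / num_shards
    return max(counts.values()) / ideal

# Distinct keys spread out; one repeated key piles onto a single shard
even = imbalance_ratio((f"user_{i}" for i in range(10_000)), 4)
skewed = imbalance_ratio(["celebrity"] * 100, 4)
print(f"even: {even:.2f}, skewed: {skewed:.2f}")
```

A ratio close to 1.0 means the keys are well spread, while the repeated-key case reaches the shard count itself. This counts keys only, so in practice you'd want to track request volume and data size per shard as well.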

🧪 Hands-On Exercise

Create a sharded social media platform:

📋 Requirements:

✅ User profiles distributed across shards
🏷️ Posts stored with their authors (same shard)
👥 Friend relationships with bidirectional lookups
📊 Analytics for post engagement
🎨 Each user needs a profile emoji!

🚀 Bonus Points:

Add timeline generation across shards
Implement hashtag trending analysis
Create a recommendation engine

💡 Solution


# 🎯 Sharded social media system!
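# Reuses ConsistentHashRing from the Advanced Concepts section
from collections import defaultdict
from datetime import datetime
from typing import Dict, List
import uuid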

class SocialMediaShard:
    """📱 Single shard for social media data"""
    def __init__(self, shard_id: str):
        self.shard_id = shard_id
        self.users = {}
        self.posts = {}
        self.friendships = defaultdict(set)

class ShardedSocialMedia:
    """🌐 Distributed social media platform"""
    def __init__(self, num_shards: int = 4):
        self.shards = [SocialMediaShard(f"social_shard_{i}") for i in range(num_shards)]
        self.hash_ring = ConsistentHashRing([s.shard_id for s in self.shards])
        print(f"🚀 Social media platform initialized with {num_shards} shards!")
    
    def _get_shard_for_user(self, user_id: str) -> SocialMediaShard:
        """🎯 Get shard for user using consistent hashing"""
        shard_id = self.hash_ring.get_node(user_id)
        for shard in self.shards:
            if shard.shard_id == shard_id:
                return shard
        return self.shards[0]
    
    def create_user(self, user_id: str, username: str, emoji: str) -> None:
        """👤 Create new user profile"""
        shard = self._get_shard_for_user(user_id)
        shard.users[user_id] = {
            "user_id": user_id,
            "username": username,
            "emoji": emoji,
            "created_at": datetime.now().isoformat(),
            "post_count": 0,
            "friend_count": 0
        }
        print(f"✅ User {emoji} {username} created!")
    
    def create_post(self, user_id: str, content: str) -> str:
        """📝 Create new post (stored with user)"""
        shard = self._get_shard_for_user(user_id)
        post_id = str(uuid.uuid4())
        
        shard.posts[post_id] = {
            "post_id": post_id,
            "user_id": user_id,
            "content": content,
            "created_at": datetime.now().isoformat(),
            "likes": 0,
            "comments": []
        }
        
        # Update user post count
        if user_id in shard.users:
            shard.users[user_id]["post_count"] += 1
        
        print(f"📮 Post created by user {user_id}!")
        return post_id
    
    def add_friend(self, user_id: str, friend_id: str) -> None:
        """👥 Add bidirectional friendship"""
        # Store in both users' shards for fast lookup
        user_shard = self._get_shard_for_user(user_id)
        friend_shard = self._get_shard_for_user(friend_id)
        
        if friend_id in user_shard.friendships[user_id]:
            return  # Already friends; avoid double-counting below
        
        user_shard.friendships[user_id].add(friend_id)
        friend_shard.friendships[friend_id].add(user_id)
        
        # Update friend counts
        if user_id in user_shard.users:
            user_shard.users[user_id]["friend_count"] += 1
        if friend_id in friend_shard.users:
            friend_shard.users[friend_id]["friend_count"] += 1
        
        print(f"🤝 {user_id} and {friend_id} are now friends!")
    
    def get_user_timeline(self, user_id: str, limit: int = 10) -> List[Dict]:
        """📋 Get user's timeline (own posts + friends' posts)"""
        user_shard = self._get_shard_for_user(user_id)
        timeline_posts = []
        
        # Get user's own posts
        for post_id, post in user_shard.posts.items():
            if post["user_id"] == user_id:
                timeline_posts.append(post)
        
        # Get friends' posts (may require cross-shard queries)
        friends = user_shard.friendships.get(user_id, set())
        for friend_id in friends:
            friend_shard = self._get_shard_for_user(friend_id)
            for post_id, post in friend_shard.posts.items():
                if post["user_id"] == friend_id:
                    timeline_posts.append(post)
        
        # Sort by timestamp and return latest
        timeline_posts.sort(key=lambda x: x["created_at"], reverse=True)
        return timeline_posts[:limit]
    
    def get_trending_stats(self) -> Dict:
        """📊 Get platform-wide statistics"""
        total_users = 0
        total_posts = 0
        total_friendships = 0
        
        for shard in self.shards:
            total_users += len(shard.users)
            total_posts += len(shard.posts)
            total_friendships += sum(len(friends) for friends in shard.friendships.values())
        
        return {
            "total_users": total_users,
            "total_posts": total_posts,
            "total_friendships": total_friendships // 2,  # Bidirectional
            "avg_posts_per_user": total_posts / max(total_users, 1)
        }

# 🎮 Test the system!
social_media = ShardedSocialMedia(num_shards=3)

# Create users
users = [
    ("user_001", "Alice", "👩‍💻"),
    ("user_002", "Bob", "👨‍💼"),
    ("user_003", "Charlie", "👨‍🎨"),
    ("user_004", "Diana", "👩‍🔬")
]

for user_id, username, emoji in users:
    social_media.create_user(user_id, username, emoji)

# Create friendships
social_media.add_friend("user_001", "user_002")
social_media.add_friend("user_001", "user_003")
social_media.add_friend("user_002", "user_004")

# Create posts
social_media.create_post("user_001", "Hello sharded world! 🌍")
social_media.create_post("user_002", "Database sharding is awesome! 🚀")
social_media.create_post("user_003", "Learning Python every day! 🐍")

# Get timeline
print("\n📋 Alice's Timeline:")
timeline = social_media.get_user_timeline("user_001")
for post in timeline:
    print(f"  - {post['content']} (by {post['user_id']})")

# Get stats
stats = social_media.get_trending_stats()
print(f"\n📊 Platform Statistics:")
print(f"  Users: {stats['total_users']}")
print(f"  Posts: {stats['total_posts']}")
print(f"  Friendships: {stats['total_friendships']}")

🎓 Key Takeaways

You’ve learned so much! Here’s what you can now do:

✅ Implement database sharding with confidence 💪
✅ Choose appropriate sharding strategies for your use case 🛡️
✅ Build scalable distributed systems in Python 🎯
✅ Handle cross-shard queries efficiently 🐛
✅ Design for horizontal scaling from the start! 🚀

Remember: Sharding is powerful but comes with complexity. Start simple and shard when you need to scale! 🤝

🤝 Next Steps

Congratulations! 🎉 You’ve mastered database sharding!

Here’s what to do next:

💻 Practice with the social media exercise above
🏗️ Build a sharded analytics system for your projects
📚 Learn about shard rebalancing and migration strategies
🌟 Explore real-world sharding in MongoDB or Cassandra!

Remember: Every scalable system started with understanding sharding. Keep building, keep scaling, and most importantly, have fun! 🚀

Happy sharding! 🎉🚀✨
