Prerequisites
- Basic understanding of programming concepts
- Python installation (3.8+)
- VS Code or preferred IDE
What you'll learn
- Understand the fundamentals of database sharding
- Apply sharding in real projects
- Debug common issues
- Write clean, Pythonic code
Introduction
Welcome to this tutorial on database sharding! Ever wondered how massive platforms like Instagram, Twitter, or Netflix handle billions of records? A key part of the answer is database sharding, a technique for horizontal scaling.
You'll discover how sharding can transform your database architecture from a single overwhelmed server into a distributed system. Whether you're building social networks, e-commerce platforms, or analytics systems, understanding sharding is essential for scaling your applications to millions of users.
By the end of this tutorial, you'll feel confident implementing sharding strategies in your own projects. Let's dive in!
Understanding Database Sharding
What is Database Sharding?
Database sharding is like splitting a huge library into multiple smaller libraries. Instead of having one massive building that gets crowded, you create several branches, each holding a portion of the books.
In practice, sharding means distributing your data across multiple database servers (shards) based on a sharding key. This means you can:
- Scale horizontally by adding more servers
- Improve query performance by reducing data per server
- Increase availability with a distributed architecture
Why Use Database Sharding?
Here's why developers reach for sharding:
- Horizontal Scalability: Add more shards as your data grows
- Better Performance: Queries that target a single shard run against a smaller dataset
- Fault Isolation: A shard failure affects only the data stored on that shard
- Cost Efficiency: Use commodity hardware instead of one very large server
Real-world example: Imagine building a social media platform. With sharding, you can distribute users across multiple databases based on their location or user ID, which keeps each database small and lets you place data close to the users who read it.
Basic Syntax and Usage
Simple Example
Let's start with a basic sharding implementation that keeps each shard in memory:
# Hello, Sharding!
import hashlib
from typing import Any, Dict, List


class DatabaseShard:
    """Represents a single database shard."""

    def __init__(self, shard_id: str):
        self.shard_id = shard_id
        self.data: Dict[str, Any] = {}  # Simple in-memory storage

    def insert(self, key: str, value: Any) -> None:
        """Insert data into this shard."""
        self.data[key] = value
        print(f"Inserted {key} into shard {self.shard_id}")

    def get(self, key: str) -> Any:
        """Retrieve data from this shard (None if the key is missing)."""
        return self.data.get(key)


class ShardManager:
    """Manages multiple database shards."""

    def __init__(self, num_shards: int):
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self.shards = [
            DatabaseShard(f"shard_{i}")
            for i in range(num_shards)
        ]
        print(f"Created {num_shards} shards!")

    def get_shard(self, key: str) -> DatabaseShard:
        """Determine which shard holds this key."""
        # Hash-modulo placement. MD5 is used only because it is stable and
        # well mixed, not for security.
        hash_value = int(hashlib.md5(key.encode()).hexdigest(), 16)
        shard_index = hash_value % len(self.shards)
        return self.shards[shard_index]

    def insert(self, key: str, value: Any) -> None:
        """Insert data into the appropriate shard."""
        shard = self.get_shard(key)
        shard.insert(key, value)

    def get(self, key: str) -> Any:
        """Get data from the appropriate shard."""
        shard = self.get_shard(key)
        return shard.get(key)


# Let's use it!
manager = ShardManager(3)
manager.insert("user_123", {"name": "Alice", "emoji": "👩‍💻"})
manager.insert("user_456", {"name": "Bob", "emoji": "👨‍💼"})
Explanation: The manager hashes each key and takes the result modulo the number of shards to pick a shard. The hash spreads keys roughly evenly, but note that this is plain hash-modulo placement rather than consistent hashing: changing the number of shards sends most keys to a different shard.
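One thing worth checking with hash-modulo placement (hash the key, then take it modulo the shard count, as get_shard does) is what happens when the shard count changes. The sketch below is a minimal experiment, not part of the tutorial's classes: stable_hash and shard_for are names made up for this illustration, and the 10,000 synthetic user keys are an assumption.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Hash a key to an integer that is identical in every process."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def shard_for(key: str, num_shards: int) -> int:
    """Hash-modulo placement: the same rule ShardManager.get_shard applies."""
    return stable_hash(key) % num_shards


# Synthetic keys, assumed for this experiment only.
keys = [f"user_{i}" for i in range(10_000)]

# Count how many keys land on a different shard after growing from 3 to 4 shards.
moved = sum(1 for key in keys if shard_for(key, 3) != shard_for(key, 4))
moved_fraction = moved / len(keys)

print(f"{moved_fraction:.0%} of keys changed shards")
```

A key stays put only when its hash gives the same remainder for 3 and for 4, which happens for about a quarter of all hash values, so roughly three quarters of the keys move. That cost is the motivation for the consistent hashing approach covered later in this tutorial.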
Common Patterns
Here are sharding patterns you'll use in production:
# Pattern 1: Range-based sharding
class RangeShardManager:
    """Shards data based on ranges."""

    def __init__(self):
        self.shards = {
            "A-H": DatabaseShard("shard_1"),  # Names A-H
            "I-P": DatabaseShard("shard_2"),  # Names I-P
            "Q-Z": DatabaseShard("shard_3"),  # Names Q-Z
        }

    def get_shard_by_name(self, name: str) -> DatabaseShard:
        if not name:
            raise ValueError("name must not be empty")
        first_letter = name[0].upper()
        if "A" <= first_letter <= "H":
            return self.shards["A-H"]
        elif "I" <= first_letter <= "P":
            return self.shards["I-P"]
        else:
            # Everything else, including digits and non-Latin letters, lands here
            return self.shards["Q-Z"]


# Pattern 2: Geographic sharding
class GeoShardManager:
    """Shards data by geographic location."""

    def __init__(self):
        self.region_shards = {
            "US": DatabaseShard("us_shard"),
            "EU": DatabaseShard("eu_shard"),
            "ASIA": DatabaseShard("asia_shard"),
        }

    def get_shard_by_region(self, region: str) -> DatabaseShard:
        # Unknown regions fall back to the US shard
        return self.region_shards.get(region, self.region_shards["US"])


# Pattern 3: Time-based sharding
from datetime import datetime


class TimeShardManager:
    """Shards data by time periods."""

    def __init__(self):
        self.year_shards: Dict[int, DatabaseShard] = {}

    def get_shard_by_date(self, date: datetime) -> DatabaseShard:
        year = date.year
        # Create the shard for a year the first time that year is seen
        if year not in self.year_shards:
            self.year_shards[year] = DatabaseShard(f"shard_{year}")
        return self.year_shards[year]
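The if/elif chain in Pattern 1 works for three ranges but gets awkward as ranges are added or split. A common alternative, sketched here under assumptions, is to keep the range boundaries in a sorted list and binary-search it. RANGE_STARTS and range_shard_index are hypothetical names, and unlike the pattern above this version rejects names that do not start with an ASCII letter instead of sending them to the last range.

```python
import bisect

# Hypothetical boundaries: each entry is the first letter of a range.
# Index 0 covers A-H, index 1 covers I-P, index 2 covers Q-Z.
RANGE_STARTS = ["A", "I", "Q"]


def range_shard_index(name: str) -> int:
    """Return the index of the range that contains the name's first letter."""
    if not name or not name[0].isascii() or not name[0].isalpha():
        raise ValueError(f"cannot place name {name!r} in a letter range")
    first_letter = name[0].upper()
    # bisect_right counts how many range starts are <= first_letter,
    # so subtracting 1 gives the range the letter falls into.
    return bisect.bisect_right(RANGE_STARTS, first_letter) - 1


for sample in ["Alice", "hank", "Ivy", "Quinn", "Zed"]:
    print(sample, "->", range_shard_index(sample))
```

Splitting a crowded range then only means inserting one more boundary into the list, with no change to the lookup code.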
Practical Examples
Example 1: E-commerce Order System
Let's build a sharded order management system. It reuses DatabaseShard and the hashlib and typing imports from the simple example:
# E-commerce order sharding system
from datetime import datetime


class Order:
    """Represents an order."""

    def __init__(self, order_id: str, user_id: str, items: List[Dict], total: float):
        self.order_id = order_id
        self.user_id = user_id
        self.items = items
        self.total = total
        self.created_at = datetime.now()
        self.status = "pending"  # Order status
        self.emoji = "🛒"


class OrderShardSystem:
    """Sharded order management system."""

    def __init__(self, num_shards: int = 4):
        self.shards = [DatabaseShard(f"order_shard_{i}") for i in range(num_shards)]
        self.user_shard_map: Dict[str, int] = {}  # Cache user-to-shard mapping
        print(f"Order system initialized with {num_shards} shards!")

    def _get_user_shard(self, user_id: str) -> DatabaseShard:
        """Get shard for user (sticky sharding)."""
        if user_id not in self.user_shard_map:
            # Assign user to shard based on hash
            hash_value = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
            shard_index = hash_value % len(self.shards)
            self.user_shard_map[user_id] = shard_index
        return self.shards[self.user_shard_map[user_id]]

    def create_order(self, order: Order) -> None:
        """Create new order in the shard that belongs to the order's user."""
        shard = self._get_user_shard(order.user_id)
        order_data = {
            "order_id": order.order_id,
            "user_id": order.user_id,
            "items": order.items,
            "total": order.total,
            "created_at": order.created_at.isoformat(),
            "status": order.status,
        }
        shard.insert(order.order_id, order_data)
        print(f"Order {order.order_id} created for user {order.user_id}!")

    def get_user_orders(self, user_id: str) -> List[Dict]:
        """Get all orders for a user by reading a single shard."""
        shard = self._get_user_shard(user_id)
        user_orders = []
        # All of a user's orders are in the same shard, so no other shard is read
        for key, order in shard.data.items():
            if order.get("user_id") == user_id:
                user_orders.append(order)
        return sorted(user_orders, key=lambda x: x["created_at"], reverse=True)

    def update_order_status(self, order_id: str, user_id: str, status: str) -> None:
        """Update order status."""
        shard = self._get_user_shard(user_id)
        order = shard.get(order_id)
        if order:
            order["status"] = status
            shard.insert(order_id, order)
            print(f"Order {order_id} updated to {status}!")


# Let's use it!
order_system = OrderShardSystem(num_shards=3)

# Create some orders
order1 = Order("ORD-001", "USER-123",
               [{"item": "Python Book", "price": 29.99, "emoji": "📚"}],
               29.99)
order2 = Order("ORD-002", "USER-123",
               [{"item": "Coffee Mug", "price": 12.99, "emoji": "☕"}],
               12.99)
order3 = Order("ORD-003", "USER-456",
               [{"item": "Keyboard", "price": 89.99, "emoji": "⌨️"}],
               89.99)
order_system.create_order(order1)
order_system.create_order(order2)
order_system.create_order(order3)

# Get user orders (only one shard is touched)
user_orders = order_system.get_user_orders("USER-123")
print(f"\nFound {len(user_orders)} orders for USER-123")
Try it yourself: Add a method to calculate total revenue per shard and implement cross-shard analytics!
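If you want a starting point for the revenue exercise, here is one possible sketch. To stay self-contained it does not use OrderShardSystem: each shard is modelled as a plain dict of order records, and revenue_per_shard, the sample orders, and the rounding to two decimals are all assumptions made for the illustration.

```python
from typing import Any, Dict, List

# Stand-in for the order shards: each shard maps order_id -> order record.
shards: List[Dict[str, Dict[str, Any]]] = [
    {
        "ORD-001": {"user_id": "USER-123", "total": 29.99},
        "ORD-002": {"user_id": "USER-123", "total": 12.99},
    },
    {"ORD-003": {"user_id": "USER-456", "total": 89.99}},
    {},  # an empty shard reports zero revenue
]


def revenue_per_shard(shards: List[Dict[str, Dict[str, Any]]]) -> List[float]:
    """Sum the 'total' field of every order, one result per shard."""
    return [
        round(sum((order["total"] for order in shard.values()), 0.0), 2)
        for shard in shards
    ]


per_shard = revenue_per_shard(shards)
total_revenue = round(sum(per_shard), 2)

for index, revenue in enumerate(per_shard):
    print(f"shard {index}: {revenue:.2f}")
print(f"all shards: {total_revenue:.2f}")
```

In a real system money is usually stored as integer cents or Decimal values rather than floats. The floats here simply mirror the order totals used in the example above.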
Example 2: Gaming Leaderboard System
Let's make a sharded leaderboard for a multiplayer game:
# Sharded gaming leaderboard system
import heapq
from collections import defaultdict


class Player:
    """Represents a game player."""

    def __init__(self, player_id: str, username: str, region: str):
        self.player_id = player_id
        self.username = username
        self.region = region
        self.score = 0
        self.level = 1
        self.achievements = []
        self.emoji = "🎮"


class LeaderboardShardSystem:
    """Sharded leaderboard for global gaming."""

    def __init__(self):
        # Geographic sharding for low latency!
        self.region_shards = {
            "NA": DatabaseShard("north_america"),
            "EU": DatabaseShard("europe"),
            "ASIA": DatabaseShard("asia"),
            "SA": DatabaseShard("south_america"),
        }
        # Score buckets for efficient ranking
        self.score_buckets = defaultdict(list)
        print("Global leaderboard system initialized!")

    def add_player(self, player: Player) -> None:
        """Add new player to regional shard (unknown regions fall back to NA)."""
        shard = self.region_shards.get(player.region, self.region_shards["NA"])
        player_data = {
            "player_id": player.player_id,
            "username": player.username,
            "score": player.score,
            "level": player.level,
            "achievements": player.achievements,
            "region": player.region,
        }
        shard.insert(player.player_id, player_data)
        print(f"Player {player.username} joined {player.region} region!")

    def update_score(self, player_id: str, region: str, points: int) -> None:
        """Update player score."""
        shard = self.region_shards.get(region)
        if shard is None:
            # Without this check an unknown region would raise AttributeError
            print(f"Unknown region {region}, score not updated")
            return
        player_data = shard.get(player_id)
        if player_data:
            old_score = player_data["score"]
            player_data["score"] += points
            # Level up every 1000 points!
            new_level = (player_data["score"] // 1000) + 1
            if new_level > player_data["level"]:
                player_data["level"] = new_level
                player_data["achievements"].append(f"Level {new_level} Master")
                print(f"{player_data['username']} leveled up to {new_level}!")
            shard.insert(player_id, player_data)
            self._update_score_bucket(player_id, old_score, player_data["score"])
            print(f"{player_data['username']} earned {points} points!")

    def _update_score_bucket(self, player_id: str, old_score: int, new_score: int):
        """Update score buckets for efficient ranking."""
        old_bucket = old_score // 1000
        new_bucket = new_score // 1000
        if old_bucket != new_bucket:
            if player_id in self.score_buckets[old_bucket]:
                self.score_buckets[old_bucket].remove(player_id)
            self.score_buckets[new_bucket].append(player_id)

    def get_regional_leaderboard(self, region: str, top_n: int = 10) -> List[Dict]:
        """Get top players in a region."""
        shard = self.region_shards.get(region)
        if shard is None:
            return []
        # nlargest keeps a heap of at most top_n entries. The key function
        # also avoids comparing player dicts when two scores tie, which
        # would raise TypeError with (score, dict) tuples.
        return heapq.nlargest(top_n, shard.data.values(), key=lambda p: p["score"])

    def get_global_leaderboard(self, top_n: int = 10) -> List[Dict]:
        """Get global top players (cross-shard query)."""
        all_players = []
        # Collect top players from each shard
        for region, shard in self.region_shards.items():
            regional_top = self.get_regional_leaderboard(region, top_n)
            all_players.extend(regional_top)
        # Sort globally and return top N
        all_players.sort(key=lambda x: x["score"], reverse=True)
        return all_players[:top_n]


# Let's play!
leaderboard = LeaderboardShardSystem()

# Add players from different regions
players = [
    Player("P001", "DragonSlayer", "NA"),
    Player("P002", "NinjaWarrior", "ASIA"),
    Player("P003", "VikingKing", "EU"),
    Player("P004", "AztecEagle", "SA"),
]
for player in players:
    leaderboard.add_player(player)

# Simulate gameplay
leaderboard.update_score("P001", "NA", 1500)
leaderboard.update_score("P002", "ASIA", 2000)
leaderboard.update_score("P003", "EU", 1800)
leaderboard.update_score("P004", "SA", 900)

# Get leaderboards
print("\nNorth America Leaderboard:")
na_leaders = leaderboard.get_regional_leaderboard("NA", 5)
for i, player in enumerate(na_leaders, 1):
    print(f"  {i}. {player['username']} - {player['score']} points")
print("\nGlobal Leaderboard:")
global_leaders = leaderboard.get_global_leaderboard(5)
for i, player in enumerate(global_leaders, 1):
    print(f"  {i}. {player['username']} - {player['score']} points")
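It may not be obvious why the global leaderboard asks every region for its full top N rather than just its single best player. The reason is that all of the global top N could live in the same region. The sketch below shows this with made-up data: regional_scores and global_top are hypothetical names, and the (username, score) tuples stand in for the player records used above.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical per-region scores standing in for the regional shards.
regional_scores: Dict[str, List[Tuple[str, int]]] = {
    "NA": [("DragonSlayer", 1500), ("Rookie", 200)],
    "ASIA": [("NinjaWarrior", 2000), ("Sensei", 1900)],
    "EU": [("VikingKing", 1800)],
}


def global_top(
    scores_by_region: Dict[str, List[Tuple[str, int]]], top_n: int
) -> List[Tuple[str, int]]:
    """Merge each region's top_n candidates into one global top_n."""
    candidates: List[Tuple[str, int]] = []
    for players in scores_by_region.values():
        # A player in the global top_n is necessarily in the top_n of
        # their own region, so top_n candidates per region is enough.
        candidates.extend(heapq.nlargest(top_n, players, key=lambda p: p[1]))
    return heapq.nlargest(top_n, candidates, key=lambda p: p[1])


top_two = global_top(regional_scores, 2)
print(top_two)  # both entries come from the ASIA region
```

Each shard therefore returns at most top_n rows, and the coordinator merges at most top_n times the number of shards, no matter how many players each region holds.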
Advanced Concepts
Advanced Topic 1: Consistent Hashing with Virtual Nodes
When you're ready to level up, implement consistent hashing with virtual nodes:
# Consistent hashing with virtual nodes
import bisect
from collections import defaultdict
from hashlib import md5
from typing import Dict, List, Optional


class ConsistentHashRing:
    """Consistent hash ring for better distribution."""

    def __init__(self, nodes: List[str], virtual_nodes: int = 150):
        self.nodes = list(nodes)  # Copy so the caller's list is never mutated
        self.virtual_nodes = virtual_nodes
        self.ring: Dict[int, str] = {}
        self.sorted_keys: List[int] = []
        self._build_ring()
        print(f"Built hash ring with {len(nodes)} nodes and {virtual_nodes} virtual nodes each!")

    def _hash(self, key: str) -> int:
        """Generate hash value."""
        return int(md5(key.encode()).hexdigest(), 16)

    def _add_virtual_nodes(self, node: str) -> None:
        """Place all virtual nodes of one physical node on the ring."""
        for i in range(self.virtual_nodes):
            virtual_key = f"{node}:{i}"
            hash_value = self._hash(virtual_key)
            self.ring[hash_value] = node
            bisect.insort(self.sorted_keys, hash_value)

    def _build_ring(self) -> None:
        """Build the hash ring with virtual nodes."""
        for node in self.nodes:
            self._add_virtual_nodes(node)

    def get_node(self, key: str) -> Optional[str]:
        """Find node responsible for key (None if the ring is empty)."""
        if not self.ring:
            return None
        hash_value = self._hash(key)
        index = bisect.bisect_right(self.sorted_keys, hash_value)
        # Wrap around to first node if needed
        if index == len(self.sorted_keys):
            index = 0
        return self.ring[self.sorted_keys[index]]

    def add_node(self, node: str) -> None:
        """Add new node to ring (for scaling!)."""
        self.nodes.append(node)
        self._add_virtual_nodes(node)
        print(f"Added node {node} to the ring!")

    def remove_node(self, node: str) -> None:
        """Remove node from ring."""
        self.nodes.remove(node)
        for i in range(self.virtual_nodes):
            virtual_key = f"{node}:{i}"
            hash_value = self._hash(virtual_key)
            # Only delete positions that still belong to this node
            if self.ring.get(hash_value) == node:
                del self.ring[hash_value]
                self.sorted_keys.remove(hash_value)
        print(f"Removed node {node} from the ring!")


# Using the consistent hash ring
shard_nodes = ["shard_1", "shard_2", "shard_3"]
hash_ring = ConsistentHashRing(shard_nodes)

# Test distribution
test_keys = [f"user_{i}" for i in range(100)]
distribution = defaultdict(int)
for key in test_keys:
    node = hash_ring.get_node(key)
    distribution[node] += 1
print("\nKey distribution:")
for node, count in distribution.items():
    print(f"  {node}: {count} keys ({count/len(test_keys)*100:.1f}%)")
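The payoff of a hash ring shows up when the set of nodes changes. The sketch below measures it with a deliberately minimal ring written as two functions instead of the class above. build_ring and lookup are hypothetical helper names, and the 10,000 synthetic keys and 150 virtual nodes per shard are assumptions chosen for the experiment.

```python
import bisect
import hashlib
from typing import List, Tuple

Ring = Tuple[List[int], List[str]]  # (sorted positions, owner of each position)


def stable_hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


def build_ring(nodes: List[str], virtual_nodes: int = 150) -> Ring:
    """Place virtual_nodes points per node on the ring, sorted by position."""
    points = sorted(
        (stable_hash(f"{node}:{i}"), node)
        for node in nodes
        for i in range(virtual_nodes)
    )
    return [pos for pos, _ in points], [node for _, node in points]


def lookup(ring: Ring, key: str) -> str:
    """Return the owner of the first ring position after the key's hash."""
    positions, owners = ring
    index = bisect.bisect_right(positions, stable_hash(key)) % len(positions)
    return owners[index]


keys = [f"user_{i}" for i in range(10_000)]
before = build_ring(["shard_1", "shard_2", "shard_3"])
after = build_ring(["shard_1", "shard_2", "shard_3", "shard_4"])

# Keys whose owner changed when shard_4 joined the ring.
moved = [key for key in keys if lookup(before, key) != lookup(after, key)]
moved_fraction = len(moved) / len(keys)

print(f"{moved_fraction:.0%} of keys moved, all of them to shard_4")
```

Adding a fourth node moves roughly a quarter of the keys, and every key that moves goes to the new node. With hash-modulo placement the same change would remap most of the key space.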
Advanced Topic 2: Cross-Shard Queries and Aggregation
For complex queries across shards:
# Cross-shard query engine
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


class ShardQueryEngine:
    """Execute queries across multiple shards."""

    def __init__(self, shards: List[DatabaseShard]):
        self.shards = shards
        self.executor = ThreadPoolExecutor(max_workers=len(shards))
        print(f"Query engine initialized for {len(shards)} shards!")

    def map_reduce(self,
                   map_func: Callable[[DatabaseShard], Any],
                   reduce_func: Callable[[List[Any]], Any]) -> Any:
        """Map-reduce pattern for cross-shard queries."""
        # Map phase: one task per shard. Threads pay off when each shard is a
        # real database call that waits on I/O; for these in-memory dicts the
        # work is CPU-bound and runs no faster than a plain loop.
        futures = []
        for shard in self.shards:
            future = self.executor.submit(map_func, shard)
            futures.append(future)
        # Collect results
        results = []
        for future in futures:
            results.append(future.result())
        # Reduce phase
        return reduce_func(results)

    def aggregate_sum(self, field: str) -> float:
        """Sum a field across all shards."""
        def map_sum(shard: DatabaseShard) -> float:
            total = 0
            for key, value in shard.data.items():
                if isinstance(value, dict) and field in value:
                    total += value[field]
            return total

        def reduce_sum(totals: List[float]) -> float:
            return sum(totals)

        return self.map_reduce(map_sum, reduce_sum)

    def search_all_shards(self, condition: Callable[[Any], bool]) -> List[Any]:
        """Search across all shards with condition."""
        def map_search(shard: DatabaseShard) -> List[Any]:
            results = []
            for key, value in shard.data.items():
                if condition(value):
                    results.append(value)
            return results

        def reduce_search(shard_results: List[List[Any]]) -> List[Any]:
            all_results = []
            for results in shard_results:
                all_results.extend(results)
            return all_results

        return self.map_reduce(map_search, reduce_search)


# Example usage
# Create sample shards with data
shards = [DatabaseShard(f"shard_{i}") for i in range(3)]

# Add sample data
for i in range(30):
    shard_idx = i % 3
    shards[shard_idx].insert(f"order_{i}", {
        "order_id": f"order_{i}",
        "amount": 10 + (i * 5),
        "status": "completed" if i % 2 == 0 else "pending",
    })

# Create query engine
query_engine = ShardQueryEngine(shards)

# Calculate total revenue across all shards!
total_revenue = query_engine.aggregate_sum("amount")
print(f"\nTotal revenue across all shards: ${total_revenue}")

# Find all completed orders
completed_orders = query_engine.search_all_shards(
    lambda order: isinstance(order, dict) and order.get("status") == "completed"
)
print(f"Found {len(completed_orders)} completed orders across all shards")

# Release the worker threads when the engine is no longer needed
query_engine.executor.shutdown()
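Sums combine cleanly across shards, but not every aggregate does. An average is a good example: averaging the per-shard averages gives the wrong answer whenever shards hold different numbers of rows. The usual fix is for each shard to return a (sum, count) pair and to divide only at the end. The sketch below illustrates that with plain dicts as shards. map_sum_count, reduce_average, and the sample amounts are assumptions for the illustration.

```python
from typing import Dict, List, Optional, Tuple

# Stand-in shards with deliberately uneven sizes.
shards: List[Dict[str, dict]] = [
    {"order_0": {"amount": 10}, "order_3": {"amount": 25}, "order_6": {"amount": 40}},
    {"order_1": {"amount": 15}},
    {},
]


def map_sum_count(shard: Dict[str, dict], field: str) -> Tuple[float, int]:
    """Map phase: each shard reports a partial sum and a row count."""
    values = [row[field] for row in shard.values() if field in row]
    return sum(values), len(values)


def reduce_average(partials: List[Tuple[float, int]]) -> Optional[float]:
    """Reduce phase: combine partials, then divide once."""
    total = sum(partial_sum for partial_sum, _ in partials)
    count = sum(partial_count for _, partial_count in partials)
    return total / count if count else None


partials = [map_sum_count(shard, "amount") for shard in shards]
average = reduce_average(partials)

# The tempting shortcut: average the non-empty shards' own averages.
naive = sum(s / c for s, c in partials if c) / sum(1 for _, c in partials if c)

print(f"correct average: {average}, average of averages: {naive}")
```

Here the correct average is 22.5, while the average of averages comes out as 20.0 because the single-row shard is weighted as heavily as the three-row shard.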
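Common Pitfalls and Solutions
Pitfall 1: Hot Shards
# Wrong way - creating hot shards!
class BadSharding:
    def get_shard(self, user_id: str) -> int:
        # Celebrity users all end up on shard 0!
        if user_id in ["celebrity1", "celebrity2", "celebrity3"]:
            return 0
        # Built-in hash() of a str is salted per process, so the same user
        # can map to a different shard after a restart
        return hash(user_id) % 3


# Better - stable hash, no special cases!
import hashlib


class GoodSharding:
    def __init__(self):
        # Many virtual shards, a multiple of physical_shards so each
        # physical shard owns the same number of them
        self.virtual_shards = 120
        self.physical_shards = 3

    def get_shard(self, user_id: str) -> int:
        # A stable hash gives every process the same answer
        digest = hashlib.md5(user_id.encode()).hexdigest()
        virtual_shard = int(digest, 16) % self.virtual_shards
        # A production system would look the virtual shard up in a mapping
        # table so that individual virtual shards can be moved
        return virtual_shard % self.physical_shards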
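Virtual shards only help with hot spots if the virtual-to-physical assignment is stored somewhere it can be changed. The sketch below is one way that could look, under stated assumptions: the assignment dict, the counts of 120 and 3, and the function names are all hypothetical, and SHA-256 is used simply as a stable hash (Python's built-in hash() is salted per process for strings, so it is not suitable for routing).

```python
import hashlib
from typing import Dict

NUM_VIRTUAL = 120   # assumed: many more virtual shards than servers
NUM_PHYSICAL = 3

# Explicit, editable assignment table: virtual shard -> physical shard.
assignment: Dict[int, int] = {v: v % NUM_PHYSICAL for v in range(NUM_VIRTUAL)}


def virtual_shard(user_id: str) -> int:
    """Map a user to a virtual shard with a hash that is stable across runs."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_VIRTUAL


def physical_shard(user_id: str) -> int:
    """Resolve the virtual shard through the assignment table."""
    return assignment[virtual_shard(user_id)]


# Relieve a hot server by moving one virtual shard elsewhere.
# Only the users in that virtual shard are affected.
hot_virtual = virtual_shard("celebrity1")
old_home = assignment[hot_virtual]
assignment[hot_virtual] = (old_home + 1) % NUM_PHYSICAL

print(f"virtual shard {hot_virtual} moved from {old_home} to {assignment[hot_virtual]}")
```

Because routing goes through the table, rebalancing becomes a data change (edit one entry and copy one virtual shard's rows) rather than a change to the hash function.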
Pitfall 2: Cross-Shard Joins
# Dangerous - inefficient cross-shard joins!
def get_user_with_orders_bad(user_id: str, user_shard: DatabaseShard, order_shard: DatabaseShard):
    user = user_shard.get(user_id)  # User in one shard
    orders = []
    # Scanning entire order shard!
    for key, order in order_shard.data.items():
        if order.get("user_id") == user_id:
            orders.append(order)
    return {"user": user, "orders": orders}


# Better - keep related data together!
class UserOrderShard:
    """Keep user and their orders in same shard."""

    def __init__(self):
        self.data = {}

    def add_user_with_orders(self, user_id: str, user_data: dict, orders: list):
        self.data[user_id] = {
            "user": user_data,
            "orders": orders,  # All user orders in same shard!
        }

    def get_user_with_orders(self, user_id: str):
        return self.data.get(user_id, {})
Best Practices
- Choose the Right Sharding Key: Use keys that distribute data evenly
- Monitor Shard Balance: Track data distribution and rebalance when needed
- Plan for Resharding: Design your system to handle shard count changes
- Use Consistent Hashing: Minimize data movement when adding/removing shards
- Keep Related Data Together: Avoid cross-shard joins by smart data placement
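For the "Monitor Shard Balance" practice, a simple number to track is the size of the largest shard relative to the average shard. The sketch below shows one way to compute it. The function name, the sample row counts, and the 1.5 alert threshold are all assumptions, and a real deployment would read the counts from its databases' own statistics.

```python
from typing import Dict


def imbalance_ratio(rows_per_shard: Dict[str, int]) -> float:
    """Largest shard size divided by the mean shard size (1.0 is perfectly even)."""
    if not rows_per_shard:
        raise ValueError("no shards to measure")
    mean = sum(rows_per_shard.values()) / len(rows_per_shard)
    if mean == 0:
        return 1.0  # all shards empty, nothing is out of balance
    return max(rows_per_shard.values()) / mean


# Hypothetical row counts gathered from each shard.
counts = {"shard_0": 1000, "shard_1": 1100, "shard_2": 2400}

ratio = imbalance_ratio(counts)
needs_rebalance = ratio > 1.5  # assumed threshold, tune for your workload

print(f"imbalance ratio: {ratio:.2f}, rebalance: {needs_rebalance}")
```

With these counts the mean is 1500 rows and the largest shard holds 2400, so the ratio is 1.6 and the check flags the cluster for rebalancing. Row counts are only one signal: a shard can also be hot because of request rate, so the same ratio is worth computing over queries per second.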
Hands-On Exercise
Challenge: Build a Sharded Social Media System
Create a sharded social media platform:
Requirements:
- User profiles distributed across shards
- Posts stored with their authors (same shard)
- Friend relationships with bidirectional lookups
- Analytics for post engagement
- Each user needs a profile emoji!
Bonus Points:
- Add timeline generation across shards
- Implement hashtag trending analysis
- Create a recommendation engine
Solution
# Sharded social media system!
# Reuses ConsistentHashRing from Advanced Topic 1
import uuid
from collections import defaultdict
from datetime import datetime
from typing import Dict, List


class SocialMediaShard:
    """Single shard for social media data."""

    def __init__(self, shard_id: str):
        self.shard_id = shard_id
        self.users = {}
        self.posts = {}
        self.friendships = defaultdict(set)


class ShardedSocialMedia:
    """Distributed social media platform."""

    def __init__(self, num_shards: int = 4):
        self.shards = [SocialMediaShard(f"social_shard_{i}") for i in range(num_shards)]
        self.hash_ring = ConsistentHashRing([s.shard_id for s in self.shards])
        print(f"Social media platform initialized with {num_shards} shards!")

    def _get_shard_for_user(self, user_id: str) -> SocialMediaShard:
        """Get shard for user using consistent hashing."""
        shard_id = self.hash_ring.get_node(user_id)
        for shard in self.shards:
            if shard.shard_id == shard_id:
                return shard
        return self.shards[0]

    def create_user(self, user_id: str, username: str, emoji: str) -> None:
        """Create new user profile."""
        shard = self._get_shard_for_user(user_id)
        shard.users[user_id] = {
            "user_id": user_id,
            "username": username,
            "emoji": emoji,
            "created_at": datetime.now().isoformat(),
            "post_count": 0,
            "friend_count": 0,
        }
        print(f"User {emoji} {username} created!")

    def create_post(self, user_id: str, content: str) -> str:
        """Create new post (stored with user)."""
        shard = self._get_shard_for_user(user_id)
        post_id = str(uuid.uuid4())
        shard.posts[post_id] = {
            "post_id": post_id,
            "user_id": user_id,
            "content": content,
            "created_at": datetime.now().isoformat(),
            "likes": 0,
            "comments": [],
        }
        # Update user post count
        if user_id in shard.users:
            shard.users[user_id]["post_count"] += 1
        print(f"Post created by user {user_id}!")
        return post_id

    def add_friend(self, user_id: str, friend_id: str) -> None:
        """Add bidirectional friendship."""
        # Store in both users' shards for fast lookup
        user_shard = self._get_shard_for_user(user_id)
        friend_shard = self._get_shard_for_user(friend_id)
        if friend_id in user_shard.friendships[user_id]:
            return  # Already friends, so the counts must not be bumped again
        user_shard.friendships[user_id].add(friend_id)
        friend_shard.friendships[friend_id].add(user_id)
        # Update friend counts
        if user_id in user_shard.users:
            user_shard.users[user_id]["friend_count"] += 1
        if friend_id in friend_shard.users:
            friend_shard.users[friend_id]["friend_count"] += 1
        print(f"{user_id} and {friend_id} are now friends!")

    def get_user_timeline(self, user_id: str, limit: int = 10) -> List[Dict]:
        """Get user's timeline (own posts + friends' posts)."""
        user_shard = self._get_shard_for_user(user_id)
        timeline_posts = []
        # Get user's own posts
        for post_id, post in user_shard.posts.items():
            if post["user_id"] == user_id:
                timeline_posts.append(post)
        # Get friends' posts (may require cross-shard queries)
        friends = user_shard.friendships.get(user_id, set())
        for friend_id in friends:
            friend_shard = self._get_shard_for_user(friend_id)
            for post_id, post in friend_shard.posts.items():
                if post["user_id"] == friend_id:
                    timeline_posts.append(post)
        # Sort by timestamp and return latest
        timeline_posts.sort(key=lambda x: x["created_at"], reverse=True)
        return timeline_posts[:limit]

    def get_trending_stats(self) -> Dict:
        """Get platform-wide statistics."""
        total_users = 0
        total_posts = 0
        total_friendships = 0
        for shard in self.shards:
            total_users += len(shard.users)
            total_posts += len(shard.posts)
            total_friendships += sum(len(friends) for friends in shard.friendships.values())
        return {
            "total_users": total_users,
            "total_posts": total_posts,
            "total_friendships": total_friendships // 2,  # Each friendship is stored twice
            "avg_posts_per_user": total_posts / max(total_users, 1),
        }


# Test the system!
social_media = ShardedSocialMedia(num_shards=3)

# Create users
users = [
    ("user_001", "Alice", "👩‍💻"),
    ("user_002", "Bob", "👨‍💼"),
    ("user_003", "Charlie", "👨‍🎨"),
    ("user_004", "Diana", "👩‍🔬"),
]
for user_id, username, emoji in users:
    social_media.create_user(user_id, username, emoji)

# Create friendships
social_media.add_friend("user_001", "user_002")
social_media.add_friend("user_001", "user_003")
social_media.add_friend("user_002", "user_004")

# Create posts
social_media.create_post("user_001", "Hello sharded world!")
social_media.create_post("user_002", "Database sharding is awesome!")
social_media.create_post("user_003", "Learning Python every day!")

# Get timeline
print("\nAlice's Timeline:")
timeline = social_media.get_user_timeline("user_001")
for post in timeline:
    print(f"  - {post['content']} (by {post['user_id']})")

# Get stats
stats = social_media.get_trending_stats()
print("\nPlatform Statistics:")
print(f"  Users: {stats['total_users']}")
print(f"  Posts: {stats['total_posts']}")
print(f"  Friendships: {stats['total_friendships']}")
Key Takeaways
You've covered a lot! Here's what you can now do:
- Implement database sharding with confidence
- Choose appropriate sharding strategies for your use case
- Build scalable distributed systems in Python
- Handle cross-shard queries efficiently
- Design for horizontal scaling from the start
Remember: Sharding is powerful but comes with complexity. Start simple and shard when you need to scale!
Next Steps
Congratulations! You've worked through the core ideas of database sharding.
Here's what to do next:
- Practice with the social media exercise above
- Build a sharded analytics system for your projects
- Learn about shard rebalancing and migration strategies
- Explore real-world sharding in MongoDB or Cassandra
Remember: Sharding is one of several tools for scaling a system, and it pays to understand it before you need it. Keep building, keep scaling, and most importantly, have fun!
Happy sharding!