+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Part 499 of 541

📘 Database Testing: Test Data Management

Master database testing: test data management in Python with practical examples, best practices, and real-world applications 🚀

🚀Intermediate
30 min read

Prerequisites

  • Basic understanding of programming concepts 📝
  • Python installation (3.8+) 🐍
  • VS Code or preferred IDE 💻

What you'll learn

  • Understand the concept fundamentals 🎯
  • Apply the concept in real projects 🏗️
  • Debug common issues 🐛
  • Write clean, Pythonic code ✨

🎯 Introduction

Welcome to this exciting tutorial on database testing and test data management! 🎉 Testing databases is crucial for ensuring your applications work correctly, but managing test data can be tricky.

Have you ever broken production because your tests used real data? Or struggled with flaky tests that only work sometimes? 😅 Today, we’ll solve these problems once and for all!

You’ll discover how proper test data management can transform your database testing experience. Whether you’re building web applications 🌐, APIs 🖥️, or data pipelines 📊, mastering test data management is essential for writing reliable, maintainable tests.

By the end of this tutorial, you’ll feel confident managing test data like a pro! Let’s dive in! 🏊‍♂️

📚 Understanding Database Testing & Test Data Management

🤔 What is Test Data Management?

Test data management is like having a perfectly organized kitchen for cooking 🍳. Just as a chef needs fresh ingredients ready before cooking, your tests need clean, predictable data before running.

Think of it as creating a parallel universe 🌌 for your database where you can:

  • ✨ Create data exactly how you need it
  • 🚀 Reset everything instantly between tests
  • 🛡️ Never touch production data
  • 🎯 Test edge cases safely

In Python terms, test data management means controlling the lifecycle of data used in your tests. This includes:

  • 📦 Fixtures: Pre-defined test data sets
  • 🏭 Factories: Dynamic data generators
  • 🔄 Transactions: Isolated test runs
  • 🧹 Cleanup: Automatic data removal

💡 Why Use Proper Test Data Management?

Here’s why developers love good test data management:

  1. Isolated Tests 🔒: Each test runs in its own bubble
  2. Predictable Results 💻: Same input = same output, always
  3. Fast Execution ⚡: No waiting for complex data setup
  4. Edge Case Testing 🎯: Easily test unusual scenarios

Real-world example: Imagine testing an e-commerce system 🛒. With proper test data management, you can instantly create orders with specific states, test refunds without real money, and verify inventory without affecting real stock!

🔧 Basic Syntax and Usage

📝 Setting Up Test Database

Let’s start with a simple SQLite test database:

# 👋 Hello, Test Database!
import sqlite3
import pytest
from contextlib import contextmanager

# 🎨 Create a test database connection
@contextmanager
def test_database():
    # 📊 In-memory database for speed!
    conn = sqlite3.connect(':memory:')
    conn.row_factory = sqlite3.Row  # 🎯 Dict-like access
    
    # 🏗️ Create test schema
    cursor = conn.cursor()
    cursor.execute('''
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            email TEXT UNIQUE,
            emoji TEXT DEFAULT '😊'
        )
    ''')
    conn.commit()
    
    yield conn
    
    # 🧹 Automatic cleanup!
    conn.close()

# 🧪 Using the test database
with test_database() as db:
    cursor = db.cursor()
    cursor.execute(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        ("Test User", "[email protected]")
    )
    print("Test data inserted! 🎉")

💡 Explanation: We use an in-memory database that’s super fast and automatically cleaned up. The context manager ensures proper cleanup even if tests fail!

🎯 Creating Test Fixtures

Here’s how to create reusable test data:

# 🏭 Test data fixtures
class TestFixtures:
    @staticmethod
    def create_user(conn, name="Alice", email=None, emoji="🧑‍💻"):
        # 🎨 Generate unique email if not provided
        if email is None:
            import time
            email = f"{name.lower()}.{int(time.time())}@test.com"
        
        cursor = conn.cursor()
        cursor.execute(
            "INSERT INTO users (name, email, emoji) VALUES (?, ?, ?)",
            (name, email, emoji)
        )
        conn.commit()
        
        # 🎯 Return the created user ID
        return cursor.lastrowid

    @staticmethod
    def create_product(conn, name="Test Product", price=9.99, emoji="📦"):
        cursor = conn.cursor()
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS products (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL,
                price REAL NOT NULL,
                emoji TEXT
            )
        ''')
        
        cursor.execute(
            "INSERT INTO products (name, price, emoji) VALUES (?, ?, ?)",
            (name, price, emoji)
        )
        conn.commit()
        return cursor.lastrowid

# 🧪 Using fixtures in tests
def test_user_creation():
    with test_database() as db:
        # ✨ Create test users easily!
        user1_id = TestFixtures.create_user(db, "Alice", emoji="👩‍💻")
        user2_id = TestFixtures.create_user(db, "Bob", emoji="👨‍💻")
        
        # 🔍 Verify they exist
        cursor = db.cursor()
        cursor.execute("SELECT COUNT(*) as count FROM users")
        assert cursor.fetchone()['count'] == 2
        print("Users created successfully! ✅")

💡 Practical Examples

🛒 Example 1: E-Commerce Order Testing

Let’s build a complete test data management system for an e-commerce platform:

# 🛍️ E-commerce test data manager
import datetime
from decimal import Decimal
import random

class EcommerceTestData:
    def __init__(self, db_connection):
        self.db = db_connection
        self._create_schema()
    
    def _create_schema(self):
        # 🏗️ Create all necessary tables
        cursor = self.db.cursor()
        
        # 👤 Users table
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS users (
                id INTEGER PRIMARY KEY,
                username TEXT UNIQUE,
                email TEXT UNIQUE,
                created_at TIMESTAMP,
                is_premium BOOLEAN DEFAULT 0
            )
        ''')
        
        # 📦 Products table  
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS products (
                id INTEGER PRIMARY KEY,
                name TEXT NOT NULL,
                price DECIMAL(10,2),
                stock INTEGER DEFAULT 0,
                category TEXT
            )
        ''')
        
        # 🛒 Orders table
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS orders (
                id INTEGER PRIMARY KEY,
                user_id INTEGER,
                total DECIMAL(10,2),
                status TEXT DEFAULT 'pending',
                created_at TIMESTAMP,
                FOREIGN KEY (user_id) REFERENCES users(id)
            )
        ''')
        
        self.db.commit()
    
    # 🎨 Factory methods for test data
    def create_test_user(self, username=None, is_premium=False):
        if username is None:
            username = f"testuser_{random.randint(1000, 9999)}"
        
        cursor = self.db.cursor()
        cursor.execute('''
            INSERT INTO users (username, email, created_at, is_premium)
            VALUES (?, ?, ?, ?)
        ''', (
            username,
            f"{username}@test.com",
            datetime.datetime.now(),
            is_premium
        ))
        self.db.commit()
        
        print(f"✨ Created {'premium' if is_premium else 'regular'} user: {username}")
        return cursor.lastrowid
    
    def create_test_product(self, name=None, price=None, stock=10):
        if name is None:
            # 🎲 Random product names with emojis!
            products = [
                ("Gaming Mouse", "🖱️"), 
                ("Mechanical Keyboard", "⌨️"),
                ("Monitor", "🖥️"),
                ("Headphones", "🎧")
            ]
            name, emoji = random.choice(products)
            name = f"{emoji} {name}"
        
        if price is None:
            price = Decimal(str(random.uniform(10, 200)))
        
        cursor = self.db.cursor()
        cursor.execute('''
            INSERT INTO products (name, price, stock, category)
            VALUES (?, ?, ?, ?)
        ''', (name, price, stock, "electronics"))
        self.db.commit()
        
        print(f"📦 Created product: {name} (${price})")
        return cursor.lastrowid
    
    def create_test_order(self, user_id, product_ids, status="pending"):
        # 💰 Calculate total from products
        cursor = self.db.cursor()
        placeholders = ','.join('?' * len(product_ids))
        cursor.execute(
            f"SELECT SUM(price) as total FROM products WHERE id IN ({placeholders})",
            product_ids
        )
        total = cursor.fetchone()[0] or 0
        
        # 🛒 Create order
        cursor.execute('''
            INSERT INTO orders (user_id, total, status, created_at)
            VALUES (?, ?, ?, ?)
        ''', (user_id, total, status, datetime.datetime.now()))
        
        order_id = cursor.lastrowid
        self.db.commit()
        
        print(f"🎉 Created {status} order #{order_id} for ${total}")
        return order_id
    
    # 🧹 Cleanup methods
    def cleanup_all(self):
        cursor = self.db.cursor()
        cursor.execute("DELETE FROM orders")
        cursor.execute("DELETE FROM products")  
        cursor.execute("DELETE FROM users")
        self.db.commit()
        print("🧹 All test data cleaned up!")

# 🧪 Using the test data manager
def test_order_flow():
    with test_database() as db:
        # 🎯 Initialize test data manager
        test_data = EcommerceTestData(db)
        
        # 🏗️ Create test scenario
        user_id = test_data.create_test_user("alice", is_premium=True)
        product1 = test_data.create_test_product("🎮 PS5", Decimal("499.99"))
        product2 = test_data.create_test_product("🎮 Extra Controller", Decimal("69.99"))
        
        # 🛒 Create an order
        order_id = test_data.create_test_order(
            user_id, 
            [product1, product2],
            status="completed"
        )
        
        # ✅ Verify the order
        cursor = db.cursor()
        cursor.execute("SELECT * FROM orders WHERE id = ?", (order_id,))
        order = cursor.fetchone()
        
        assert order['total'] == 569.98
        assert order['status'] == 'completed'
        print("Order test passed! ✅")
        
        # 🧹 Cleanup
        test_data.cleanup_all()

🎯 Try it yourself: Add a method to create test reviews and link them to products and users!

🎮 Example 2: Game Leaderboard Testing

Let’s create a test data system for a gaming leaderboard:

# 🏆 Game leaderboard test data
import json
from datetime import datetime, timedelta

class GameTestDataBuilder:
    def __init__(self, db):
        self.db = db
        self._setup_tables()
        
    def _setup_tables(self):
        cursor = self.db.cursor()
        
        # 🎮 Players table
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS players (
                id INTEGER PRIMARY KEY,
                username TEXT UNIQUE,
                level INTEGER DEFAULT 1,
                xp INTEGER DEFAULT 0,
                joined_at TIMESTAMP
            )
        ''')
        
        # 🏆 Scores table
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS scores (
                id INTEGER PRIMARY KEY,
                player_id INTEGER,
                game_mode TEXT,
                score INTEGER,
                combo INTEGER DEFAULT 0,
                played_at TIMESTAMP,
                metadata TEXT,
                FOREIGN KEY (player_id) REFERENCES players(id)
            )
        ''')
        
        self.db.commit()
    
    # 🎯 Fluent builder pattern for test data
    class PlayerBuilder:
        def __init__(self, db, username):
            self.db = db
            self.username = username
            self.level = 1
            self.xp = 0
            self.scores = []
            
        def at_level(self, level):
            # 📈 Set player level
            self.level = level
            self.xp = level * 1000  # 🎯 1000 XP per level
            return self
            
        def with_score(self, score, mode="classic", combo=0):
            # 🎮 Add a score
            self.scores.append({
                'score': score,
                'mode': mode,
                'combo': combo,
                'metadata': json.dumps({
                    'perfect_rounds': combo > 10,
                    'power_ups_used': ['🚀', '⚡'] if score > 10000 else []
                })
            })
            return self
            
        def build(self):
            # 🏗️ Create the player and scores
            cursor = self.db.cursor()
            
            # Create player
            cursor.execute('''
                INSERT INTO players (username, level, xp, joined_at)
                VALUES (?, ?, ?, ?)
            ''', (
                self.username,
                self.level,
                self.xp,
                datetime.now() - timedelta(days=30)  # 📅 Joined 30 days ago
            ))
            player_id = cursor.lastrowid
            
            # Add scores
            for score_data in self.scores:
                cursor.execute('''
                    INSERT INTO scores (player_id, game_mode, score, combo, played_at, metadata)
                    VALUES (?, ?, ?, ?, ?, ?)
                ''', (
                    player_id,
                    score_data['mode'],
                    score_data['score'],
                    score_data['combo'],
                    datetime.now() - timedelta(hours=len(self.scores)),
                    score_data['metadata']
                ))
            
            self.db.commit()
            print(f"🎮 Created player '{self.username}' at level {self.level} with {len(self.scores)} scores!")
            return player_id
    
    def player(self, username):
        # 🎨 Start building a player
        return self.PlayerBuilder(self.db, username)
    
    def create_tournament_scenario(self):
        # 🏆 Create a complete tournament test scenario
        print("🎊 Creating tournament test data...")
        
        # Top players
        self.player("ProGamer123") \
            .at_level(50) \
            .with_score(99999, "tournament", combo=50) \
            .with_score(85000, "tournament", combo=45) \
            .build()
            
        self.player("SpeedRunner") \
            .at_level(45) \
            .with_score(95000, "tournament", combo=48) \
            .with_score(82000, "tournament", combo=40) \
            .build()
            
        # Mid-tier players  
        self.player("CasualPlayer") \
            .at_level(20) \
            .with_score(50000, "tournament", combo=20) \
            .build()
            
        # New player
        self.player("Noob2024") \
            .at_level(5) \
            .with_score(10000, "tournament", combo=5) \
            .build()
            
        print("🏁 Tournament scenario ready!")
    
    def get_leaderboard(self, mode="tournament", limit=10):
        # 📊 Get leaderboard data
        cursor = self.db.cursor()
        cursor.execute('''
            SELECT 
                p.username,
                p.level,
                s.score,
                s.combo,
                s.metadata
            FROM scores s
            JOIN players p ON s.player_id = p.id
            WHERE s.game_mode = ?
            ORDER BY s.score DESC
            LIMIT ?
        ''', (mode, limit))
        
        return cursor.fetchall()

# 🧪 Test the game leaderboard
def test_leaderboard_rankings():
    with test_database() as db:
        # 🎮 Create test data
        builder = GameTestDataBuilder(db)
        builder.create_tournament_scenario()
        
        # 📊 Get leaderboard
        leaderboard = builder.get_leaderboard()
        
        # ✅ Verify rankings
        assert len(leaderboard) >= 4
        assert leaderboard[0]['username'] == "ProGamer123"
        assert leaderboard[0]['score'] == 99999
        
        # 🎯 Check score order
        for i in range(len(leaderboard) - 1):
            assert leaderboard[i]['score'] >= leaderboard[i + 1]['score']
        
        print("\n🏆 Tournament Leaderboard:")
        for i, entry in enumerate(leaderboard[:5]):
            medals = ["🥇", "🥈", "🥉", "4️⃣", "5️⃣"]
            print(f"{medals[i]} {entry['username']} - {entry['score']:,} points (combo: {entry['combo']})")

🚀 Advanced Concepts

🧙‍♂️ Advanced Topic 1: Factory Pattern with Faker

When you need realistic test data at scale:

# 🎯 Advanced test data factory
from faker import Faker
import random

class AdvancedTestFactory:
    def __init__(self, db):
        self.db = db
        self.fake = Faker()
        
    def create_realistic_user_batch(self, count=10):
        # 🎨 Generate realistic user data
        users = []
        cursor = self.db.cursor()
        
        for _ in range(count):
            user_data = {
                'name': self.fake.name(),
                'email': self.fake.email(),
                'phone': self.fake.phone_number(),
                'address': self.fake.address().replace('\n', ', '),
                'joined': self.fake.date_between('-1y', 'today'),
                'avatar_emoji': random.choice(['😊', '😎', '🤓', '🥳', '🤠'])
            }
            
            cursor.execute('''
                INSERT INTO users (name, email, phone, address, joined_at, avatar)
                VALUES (?, ?, ?, ?, ?, ?)
            ''', tuple(user_data.values()))
            
            user_data['id'] = cursor.lastrowid
            users.append(user_data)
        
        self.db.commit()
        print(f"✨ Created {count} realistic users with Faker!")
        return users
    
    def create_time_series_data(self, metric_name, days=30):
        # 📊 Generate time series test data
        cursor = self.db.cursor()
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS metrics (
                id INTEGER PRIMARY KEY,
                metric_name TEXT,
                value REAL,
                recorded_at TIMESTAMP
            )
        ''')
        
        base_value = 100
        for day in range(days):
            # 📈 Simulate realistic fluctuations
            daily_values = []
            for hour in range(24):
                value = base_value + random.gauss(0, 10)  # 🎲 Normal distribution
                value = max(0, value)  # No negative values
                
                timestamp = datetime.now() - timedelta(days=days-day, hours=23-hour)
                
                cursor.execute('''
                    INSERT INTO metrics (metric_name, value, recorded_at)
                    VALUES (?, ?, ?)
                ''', (metric_name, value, timestamp))
                
                daily_values.append(value)
            
            # 📊 Trend simulation
            base_value += random.uniform(-2, 3)
        
        self.db.commit()
        print(f"📊 Generated {days} days of {metric_name} data!")

🏗️ Advanced Topic 2: Test Data Snapshots

For complex scenarios, use snapshots:

# 🚀 Snapshot-based test data management
import pickle
import hashlib

class TestDataSnapshots:
    def __init__(self, db):
        self.db = db
        self.snapshots = {}
        
    def create_snapshot(self, name):
        # 📸 Capture current database state
        cursor = self.db.cursor()
        
        # Get all tables
        cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
        tables = [row[0] for row in cursor.fetchall()]
        
        snapshot_data = {}
        for table in tables:
            cursor.execute(f"SELECT * FROM {table}")
            snapshot_data[table] = cursor.fetchall()
        
        # 🔐 Create snapshot with hash
        snapshot_hash = hashlib.md5(
            pickle.dumps(snapshot_data)
        ).hexdigest()[:8]
        
        self.snapshots[name] = {
            'data': snapshot_data,
            'hash': snapshot_hash,
            'created_at': datetime.now()
        }
        
        print(f"📸 Created snapshot '{name}' (hash: {snapshot_hash})")
        return snapshot_hash
    
    def restore_snapshot(self, name):
        # 🔄 Restore database to snapshot state
        if name not in self.snapshots:
            raise ValueError(f"Snapshot '{name}' not found!")
        
        snapshot = self.snapshots[name]
        cursor = self.db.cursor()
        
        # Clear all tables
        cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
        tables = [row[0] for row in cursor.fetchall()]
        
        for table in tables:
            cursor.execute(f"DELETE FROM {table}")
        
        # Restore data
        for table, rows in snapshot['data'].items():
            if rows:
                # Build INSERT statement
                placeholders = ','.join(['?' for _ in rows[0]])
                cursor.executemany(
                    f"INSERT INTO {table} VALUES ({placeholders})",
                    rows
                )
        
        self.db.commit()
        print(f"✨ Restored snapshot '{name}' (hash: {snapshot['hash']})")
    
    def compare_with_snapshot(self, name):
        # 🔍 Check if current state matches snapshot
        current_snapshot = self.create_snapshot("_temp_compare")
        
        if name not in self.snapshots:
            return False
            
        original_hash = self.snapshots[name]['hash']
        current_hash = self.snapshots["_temp_compare"]['hash']
        
        # Clean up temp snapshot
        del self.snapshots["_temp_compare"]
        
        matches = original_hash == current_hash
        print(f"🔍 Database {'matches' if matches else 'differs from'} snapshot '{name}'")
        return matches

⚠️ Common Pitfalls and Solutions

😱 Pitfall 1: Test Data Pollution

# ❌ Wrong way - tests affect each other!
def bad_test_1(db):
    cursor = db.cursor()
    cursor.execute("INSERT INTO users (name) VALUES ('Alice')")
    db.commit()
    # 💥 Data remains for next test!

def bad_test_2(db):
    cursor = db.cursor() 
    cursor.execute("SELECT COUNT(*) FROM users")
    count = cursor.fetchone()[0]
    assert count == 0  # 💥 Fails! Alice is still there!

# ✅ Correct way - use transactions or cleanup!
def good_test_1(db):
    try:
        cursor = db.cursor()
        cursor.execute("BEGIN")
        cursor.execute("INSERT INTO users (name) VALUES ('Alice')")
        # Do test assertions here
        cursor.execute("ROLLBACK")  # 🧹 Clean up!
    except Exception as e:
        cursor.execute("ROLLBACK")
        raise

# ✅ Or use pytest fixtures
@pytest.fixture
def clean_db():
    with test_database() as db:
        yield db
        # 🧹 Automatic cleanup after each test!

🤯 Pitfall 2: Hardcoded Test Data

# ❌ Dangerous - hardcoded IDs break easily!
def bad_test_with_ids():
    cursor.execute("INSERT INTO users (id, name) VALUES (1, 'Alice')")
    cursor.execute("INSERT INTO orders (user_id) VALUES (1)")
    # 💥 Breaks if ID 1 already exists!

# ✅ Safe - use returned IDs!
def good_test_with_ids():
    cursor.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))
    user_id = cursor.lastrowid  # 🎯 Get actual ID
    
    cursor.execute("INSERT INTO orders (user_id) VALUES (?)", (user_id,))
    print(f"✅ Created order for user {user_id}")

🛠️ Best Practices

  1. 🎯 Use Transactions: Wrap tests in transactions for easy rollback
  2. 📝 Factory Pattern: Create reusable data builders
  3. 🛡️ Isolated Databases: Each test gets fresh data
  4. 🎨 Realistic Data: Use Faker for believable test data
  5. ✨ Clean As You Go: Always clean up after tests

🧪 Hands-On Exercise

🎯 Challenge: Build a Social Media Test Data System

Create a comprehensive test data management system for a social media app:

📋 Requirements:

  • ✅ Users with profiles and friend connections
  • 🏷️ Posts with likes, comments, and shares
  • 👤 Private messages between users
  • 📅 Activity feeds with proper timestamps
  • 🎨 Each user needs a profile emoji!

🚀 Bonus Points:

  • Add hashtag tracking
  • Implement follower/following relationships
  • Create trending topic test data

💡 Solution

🔍 Click to see solution
# 🎯 Social media test data system!
import random
from datetime import datetime, timedelta
from faker import Faker

class SocialMediaTestData:
    def __init__(self, db):
        self.db = db
        self.fake = Faker()
        self._create_schema()
        
    def _create_schema(self):
        cursor = self.db.cursor()
        
        # 👤 Users
        cursor.execute('''
            CREATE TABLE users (
                id INTEGER PRIMARY KEY,
                username TEXT UNIQUE,
                bio TEXT,
                emoji TEXT,
                created_at TIMESTAMP
            )
        ''')
        
        # 👥 Friendships
        cursor.execute('''
            CREATE TABLE friendships (
                user_id INTEGER,
                friend_id INTEGER,
                created_at TIMESTAMP,
                PRIMARY KEY (user_id, friend_id)
            )
        ''')
        
        # 📝 Posts
        cursor.execute('''
            CREATE TABLE posts (
                id INTEGER PRIMARY KEY,
                user_id INTEGER,
                content TEXT,
                likes INTEGER DEFAULT 0,
                created_at TIMESTAMP
            )
        ''')
        
        # 💬 Comments
        cursor.execute('''
            CREATE TABLE comments (
                id INTEGER PRIMARY KEY,
                post_id INTEGER,
                user_id INTEGER,
                content TEXT,
                created_at TIMESTAMP
            )
        ''')
        
        self.db.commit()
    
    def create_social_network(self, user_count=10):
        # 🌐 Create a realistic social network
        users = []
        
        # Create users
        emojis = ['😊', '😎', '🤓', '🥳', '🤠', '🦄', '🐱', '🎨', '🚀', '💫']
        for i in range(user_count):
            username = self.fake.user_name() + str(random.randint(10, 99))
            bio = self.fake.sentence()
            emoji = emojis[i % len(emojis)]
            
            cursor = self.db.cursor()
            cursor.execute('''
                INSERT INTO users (username, bio, emoji, created_at)
                VALUES (?, ?, ?, ?)
            ''', (username, bio, emoji, datetime.now() - timedelta(days=random.randint(1, 365))))
            
            users.append({
                'id': cursor.lastrowid,
                'username': username,
                'emoji': emoji
            })
        
        # Create friendships (each user has 2-5 friends)
        for user in users:
            friend_count = random.randint(2, min(5, len(users) - 1))
            potential_friends = [u for u in users if u['id'] != user['id']]
            friends = random.sample(potential_friends, friend_count)
            
            for friend in friends:
                cursor.execute('''
                    INSERT OR IGNORE INTO friendships (user_id, friend_id, created_at)
                    VALUES (?, ?, ?)
                ''', (user['id'], friend['id'], datetime.now() - timedelta(days=random.randint(1, 30))))
        
        # Create posts
        post_templates = [
            "Just finished a great workout! 💪",
            "Coffee time! ☕",
            "Working on something exciting... 🚀",
            "Beautiful sunset today 🌅",
            "Learning Python is fun! 🐍"
        ]
        
        for user in users:
            post_count = random.randint(1, 5)
            for _ in range(post_count):
                content = random.choice(post_templates) + f" {user['emoji']}"
                likes = random.randint(0, 50)
                
                cursor.execute('''
                    INSERT INTO posts (user_id, content, likes, created_at)
                    VALUES (?, ?, ?, ?)
                ''', (
                    user['id'], 
                    content, 
                    likes,
                    datetime.now() - timedelta(hours=random.randint(1, 168))
                ))
                
                post_id = cursor.lastrowid
                
                # Add some comments
                if random.random() > 0.5:
                    commenter = random.choice(users)
                    cursor.execute('''
                        INSERT INTO comments (post_id, user_id, content, created_at)
                        VALUES (?, ?, ?, ?)
                    ''', (
                        post_id,
                        commenter['id'],
                        f"Nice post! {commenter['emoji']}",
                        datetime.now() - timedelta(minutes=random.randint(1, 60))
                    ))
        
        self.db.commit()
        print(f"🌐 Created social network with {len(users)} users!")
        return users
    
    def get_user_feed(self, user_id):
        # 📊 Get activity feed for a user
        cursor = self.db.cursor()
        cursor.execute('''
            SELECT 
                p.*,
                u.username,
                u.emoji,
                (SELECT COUNT(*) FROM comments WHERE post_id = p.id) as comment_count
            FROM posts p
            JOIN users u ON p.user_id = u.id
            WHERE p.user_id = ? 
               OR p.user_id IN (SELECT friend_id FROM friendships WHERE user_id = ?)
            ORDER BY p.created_at DESC
            LIMIT 10
        ''', (user_id, user_id))
        
        return cursor.fetchall()

# 🧪 Test the social network
def test_social_media():
    with test_database() as db:
        social = SocialMediaTestData(db)
        users = social.create_social_network(5)
        
        # Get feed for first user
        feed = social.get_user_feed(users[0]['id'])
        
        print(f"\n📱 Feed for {users[0]['username']} {users[0]['emoji']}:")
        for post in feed:
            print(f"  {post['emoji']} {post['username']}: {post['content']}")
            print(f"     ❤️ {post['likes']} likes, 💬 {post['comment_count']} comments")

🎓 Key Takeaways

You’ve learned so much! Here’s what you can now do:

  • Create isolated test databases with confidence 💪
  • Manage test data lifecycle from creation to cleanup 🛡️
  • Build realistic test scenarios using factories and fixtures 🎯
  • Avoid common testing pitfalls like data pollution 🐛
  • Implement advanced patterns like snapshots and builders! 🚀

Remember: Good test data management is the foundation of reliable tests. It’s your safety net for confident refactoring! 🤝

🤝 Next Steps

Congratulations! 🎉 You’ve mastered database testing and test data management!

Here’s what to do next:

  1. 💻 Practice with the social media exercise above
  2. 🏗️ Add test data management to your existing projects
  3. 📚 Move on to our next tutorial: Database Testing: Writing Effective Test Cases
  4. 🌟 Share your test data patterns with your team!

Remember: Every database expert was once a beginner. Keep testing, keep learning, and most importantly, have fun! 🚀


Happy testing! 🎉🚀✨