Prerequisites
- Basic understanding of programming concepts ๐
- Python installation (3.8+) ๐
- VS Code or preferred IDE ๐ป
What you'll learn
- Understand the concept fundamentals ๐ฏ
- Apply the concept in real projects ๐๏ธ
- Debug common issues ๐
- Write clean, Pythonic code โจ
๐ฏ Introduction
Welcome to the wonderful world of SQL and databases! ๐ If youโve been writing Python code and wondering how to store and manage data permanently, youโre in the right place!
Today, weโll explore SQLโs two fundamental command categories: DDL (Data Definition Language) and DML (Data Manipulation Language). Think of DDL as the architect who designs the building ๐๏ธ, while DML is the team that moves furniture in and out ๐ฆ. Together, they give you superpowers to create, manage, and query databases like a pro!
By the end of this tutorial, youโll be creating tables, inserting data, and querying databases with confidence. Letโs dive into the data ocean! ๐โโ๏ธ
๐ Understanding DDL and DML
๐ค What are DDL and DML?
Think of a database as a filing cabinet ๐๏ธ:
- DDL is like building the cabinet, adding drawers, and labeling them
- DML is like putting files in, taking them out, and organizing them
In technical terms:
-
DDL (Data Definition Language) ๐๏ธ: Commands that define database structure
- โจ CREATE: Build new tables
- ๐ง ALTER: Modify existing tables
- ๐ฅ DROP: Delete tables (careful!)
- ๐ท๏ธ TRUNCATE: Empty tables quickly
-
DML (Data Manipulation Language) ๐ฆ: Commands that work with data
- โ INSERT: Add new data
- ๐ SELECT: Query and retrieve data
- โ๏ธ UPDATE: Modify existing data
- ๐๏ธ DELETE: Remove specific data
๐ก Why Learn SQL with Python?
Hereโs why Python developers love SQL:
- Persistent Storage ๐พ: Save data beyond program execution
- Efficient Queries โก: Find specific data in milliseconds
- Data Relationships ๐: Connect related information elegantly
- Industry Standard ๐: Used everywhere from startups to giants
Real-world example: Imagine building a library system ๐. You need to store books, members, and borrowing records. SQL makes this a breeze!
๐ง Basic Syntax and Usage
๐ Setting Up SQLite with Python
Letโs start with a friendly example using SQLite (Pythonโs built-in database):
import sqlite3
# ๐ Connect to database (creates if doesn't exist)
conn = sqlite3.connect('my_first_db.db')
cursor = conn.cursor()
# ๐จ Always remember to close when done!
# conn.close()
๐๏ธ DDL Commands in Action
# ๐ฏ CREATE: Building our first table
cursor.execute('''
CREATE TABLE IF NOT EXISTS books (
id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
author TEXT NOT NULL,
year INTEGER,
rating REAL
)
''')
# ๐ง ALTER: Adding a new column
cursor.execute('''
ALTER TABLE books
ADD COLUMN genre TEXT DEFAULT 'Unknown'
''')
# ๐ก Creating an index for faster searches
cursor.execute('''
CREATE INDEX idx_author
ON books(author)
''')
# โ
Commit changes
conn.commit()
print("๐ Table created successfully!")
๐ฆ DML Commands in Action
# โ INSERT: Adding books to our library
books_data = [
('The Python Cookbook', 'David Beazley', 2013, 4.8, 'Programming'),
('Automate the Boring Stuff', 'Al Sweigart', 2019, 4.7, 'Programming'),
('Pride and Prejudice', 'Jane Austen', 1813, 4.5, 'Fiction')
]
cursor.executemany('''
INSERT INTO books (title, author, year, rating, genre)
VALUES (?, ?, ?, ?, ?)
''', books_data)
# ๐ SELECT: Finding our books
cursor.execute('''
SELECT title, author, rating
FROM books
WHERE rating > 4.5
ORDER BY rating DESC
''')
print("โญ Top-rated books:")
for book in cursor.fetchall():
print(f" ๐ {book[0]} by {book[1]} - Rating: {book[2]}")
# โ๏ธ UPDATE: Fixing a mistake
cursor.execute('''
UPDATE books
SET year = 2020
WHERE title = 'Automate the Boring Stuff'
''')
# ๐๏ธ DELETE: Removing old books
cursor.execute('''
DELETE FROM books
WHERE year < 1900
''')
conn.commit()
๐ก Practical Examples
๐ Example 1: E-Commerce Product Catalog
Letโs build a product management system:
import sqlite3
from datetime import datetime
# ๐๏ธ Create our e-commerce database
conn = sqlite3.connect('shop.db')
cursor = conn.cursor()
# ๐๏ธ DDL: Design our product table
cursor.execute('''
CREATE TABLE IF NOT EXISTS products (
product_id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
price DECIMAL(10, 2) NOT NULL,
stock INTEGER DEFAULT 0,
category TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
''')
# ๐จ DDL: Create categories table
cursor.execute('''
CREATE TABLE IF NOT EXISTS categories (
category_id INTEGER PRIMARY KEY,
category_name TEXT UNIQUE NOT NULL,
emoji TEXT
)
''')
# ๐ฆ DML: Stock our store!
categories = [
(1, 'Electronics', '๐ฑ'),
(2, 'Books', '๐'),
(3, 'Clothing', '๐'),
(4, 'Food', '๐')
]
cursor.executemany('''
INSERT INTO categories VALUES (?, ?, ?)
''', categories)
# โ Add some products
products = [
('iPhone 13', 799.99, 50, 'Electronics'),
('Python Crash Course', 39.99, 100, 'Books'),
('Comfy T-Shirt', 19.99, 200, 'Clothing'),
('Organic Pizza', 12.99, 30, 'Food')
]
cursor.executemany('''
INSERT INTO products (name, price, stock, category)
VALUES (?, ?, ?, ?)
''', products)
# ๐ Smart queries with JOIN
cursor.execute('''
SELECT p.name, p.price, p.stock, c.emoji, c.category_name
FROM products p
JOIN categories c ON p.category = c.category_name
WHERE p.stock > 0
ORDER BY p.price DESC
''')
print("๐ Products in stock:")
for item in cursor.fetchall():
print(f" {item[3]} {item[0]} - ${item[1]:.2f} ({item[2]} available)")
# ๐ Analytics query
cursor.execute('''
SELECT category, COUNT(*) as count, AVG(price) as avg_price
FROM products
GROUP BY category
HAVING COUNT(*) > 0
''')
print("\n๐ Category Analytics:")
for stat in cursor.fetchall():
print(f" ๐ฆ {stat[0]}: {stat[1]} products, avg price: ${stat[2]:.2f}")
conn.commit()
conn.close()
๐ฎ Example 2: Game Score Leaderboard
Letโs create a gaming leaderboard system:
import sqlite3
from datetime import datetime
# ๐ฎ Game database
conn = sqlite3.connect('game_scores.db')
cursor = conn.cursor()
# ๐๏ธ DDL: Create player and scores tables
cursor.execute('''
CREATE TABLE IF NOT EXISTS players (
player_id INTEGER PRIMARY KEY AUTOINCREMENT,
username TEXT UNIQUE NOT NULL,
email TEXT UNIQUE NOT NULL,
joined_date DATE DEFAULT CURRENT_DATE
)
''')
cursor.execute('''
CREATE TABLE IF NOT EXISTS scores (
score_id INTEGER PRIMARY KEY AUTOINCREMENT,
player_id INTEGER,
game_mode TEXT,
score INTEGER NOT NULL,
achieved_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (player_id) REFERENCES players (player_id)
)
''')
# ๐ฏ Helper functions for our game
def add_player(username, email):
try:
cursor.execute('''
INSERT INTO players (username, email)
VALUES (?, ?)
''', (username, email))
conn.commit()
print(f"๐ Welcome {username}!")
return cursor.lastrowid
except sqlite3.IntegrityError:
print(f"โ ๏ธ Player {username} already exists!")
cursor.execute('SELECT player_id FROM players WHERE username = ?', (username,))
return cursor.fetchone()[0]
def add_score(player_id, game_mode, score):
cursor.execute('''
INSERT INTO scores (player_id, game_mode, score)
VALUES (?, ?, ?)
''', (player_id, game_mode, score))
conn.commit()
print(f"โจ Score recorded: {score} points!")
def get_leaderboard(game_mode, limit=10):
cursor.execute('''
SELECT p.username, s.score, s.achieved_at
FROM scores s
JOIN players p ON s.player_id = p.player_id
WHERE s.game_mode = ?
ORDER BY s.score DESC
LIMIT ?
''', (game_mode, limit))
return cursor.fetchall()
# ๐ฎ Let's play!
players = [
('DragonSlayer', '[email protected]'),
('SpeedRunner', '[email protected]'),
('ProGamer2023', '[email protected]'),
('NinjaWarrior', '[email protected]')
]
# Add players
player_ids = {}
for username, email in players:
player_ids[username] = add_player(username, email)
# ๐ฏ Record some scores
scores_data = [
('DragonSlayer', 'Classic', 9500),
('SpeedRunner', 'Classic', 8200),
('ProGamer2023', 'Classic', 9800),
('NinjaWarrior', 'Classic', 7500),
('DragonSlayer', 'Speed Run', 4200),
('SpeedRunner', 'Speed Run', 5100),
]
for username, mode, score in scores_data:
add_score(player_ids[username], mode, score)
# ๐ Display leaderboards
print("\n๐ Classic Mode Leaderboard:")
for rank, (username, score, date) in enumerate(get_leaderboard('Classic'), 1):
trophy = '๐ฅ' if rank == 1 else '๐ฅ' if rank == 2 else '๐ฅ' if rank == 3 else '๐ฏ'
print(f" {trophy} #{rank} {username}: {score} points")
# ๐ Player statistics
cursor.execute('''
SELECT p.username,
COUNT(s.score_id) as games_played,
MAX(s.score) as best_score,
AVG(s.score) as avg_score
FROM players p
LEFT JOIN scores s ON p.player_id = s.player_id
GROUP BY p.player_id
''')
print("\n๐ Player Statistics:")
for stats in cursor.fetchall():
if stats[1] > 0: # Has played games
print(f" ๐ค {stats[0]}: {stats[1]} games, best: {stats[2]}, avg: {stats[3]:.0f}")
conn.close()
๐ Advanced Concepts
๐งโโ๏ธ Transactions: All or Nothing!
When multiple operations must succeed together, use transactions:
import sqlite3
conn = sqlite3.connect('bank.db')
cursor = conn.cursor()
# ๐๏ธ Create accounts table
cursor.execute('''
CREATE TABLE IF NOT EXISTS accounts (
account_id INTEGER PRIMARY KEY,
holder TEXT NOT NULL,
balance DECIMAL(10, 2) DEFAULT 0.00
)
''')
# ๐ฐ Add some accounts
cursor.executemany('''
INSERT INTO accounts (holder, balance) VALUES (?, ?)
''', [('Alice', 1000.00), ('Bob', 500.00)])
def transfer_money(from_account, to_account, amount):
try:
# ๐ฏ Start transaction
conn.execute('BEGIN TRANSACTION')
# ๐ Check balance
cursor.execute('''
SELECT balance FROM accounts WHERE holder = ?
''', (from_account,))
balance = cursor.fetchone()[0]
if balance < amount:
raise ValueError("Insufficient funds! ๐ธ")
# ๐ธ Deduct from sender
cursor.execute('''
UPDATE accounts
SET balance = balance - ?
WHERE holder = ?
''', (amount, from_account))
# ๐ฐ Add to receiver
cursor.execute('''
UPDATE accounts
SET balance = balance + ?
WHERE holder = ?
''', (amount, to_account))
# โ
Commit if all successful
conn.commit()
print(f"โจ Transferred ${amount} from {from_account} to {to_account}")
except Exception as e:
# โ Rollback on any error
conn.rollback()
print(f"โ Transfer failed: {e}")
# ๐ฎ Test transfers
transfer_money('Alice', 'Bob', 250.00) # โ
Success
transfer_money('Bob', 'Alice', 1000.00) # โ Insufficient funds
conn.close()
๐๏ธ Views: Virtual Tables
Create reusable queries with views:
# ๐จ Create a view for high-value customers
cursor.execute('''
CREATE VIEW IF NOT EXISTS vip_customers AS
SELECT customer_id, name, total_spent
FROM (
SELECT c.customer_id, c.name, SUM(o.amount) as total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id
HAVING SUM(o.amount) > 1000
)
''')
# ๐ Query the view like a table
cursor.execute('SELECT * FROM vip_customers')
โ ๏ธ Common Pitfalls and Solutions
๐ฑ Pitfall 1: SQL Injection Attacks
# โ NEVER do this - vulnerable to SQL injection!
user_input = "'; DROP TABLE users; --"
query = f"SELECT * FROM users WHERE name = '{user_input}'"
# cursor.execute(query) # ๐ฅ Disaster!
# โ
Always use parameterized queries!
user_input = "Alice"
cursor.execute(
"SELECT * FROM users WHERE name = ?",
(user_input,)
)
๐คฏ Pitfall 2: Forgetting to Commit
# โ Changes won't persist!
cursor.execute("INSERT INTO items VALUES ('New Item')")
# Oops, forgot conn.commit()!
# โ
Always commit your changes!
cursor.execute("INSERT INTO items VALUES ('New Item')")
conn.commit() # โ
Now it's saved!
๐พ Pitfall 3: Not Closing Connections
# โ Resource leak!
conn = sqlite3.connect('mydb.db')
# ... do stuff ...
# Forgot to close!
# โ
Use context managers!
with sqlite3.connect('mydb.db') as conn:
cursor = conn.cursor()
# ... do stuff ...
# Automatically closed! ๐ฏ
๐ ๏ธ Best Practices
- ๐ฏ Use Parameterized Queries: Never concatenate user input
- ๐ Be Explicit: Specify column names in INSERT statements
- ๐ก๏ธ Add Constraints: Use NOT NULL, UNIQUE, CHECK constraints
- ๐ Index Wisely: Add indexes for frequently searched columns
- โจ Keep It Simple: Start with simple queries, then optimize
๐งช Hands-On Exercise
๐ฏ Challenge: Build a Task Management System
Create a complete task management database:
๐ Requirements:
- โ Users table with username and email
- ๐ Tasks table with title, description, status, and due date
- ๐ท๏ธ Categories for organizing tasks
- ๐ค Assign tasks to users
- ๐ Track completion statistics
๐ Bonus Points:
- Add priority levels (Low, Medium, High)
- Implement task dependencies
- Create a view for overdue tasks
- Add full-text search capability
๐ก Solution
๐ Click to see solution
import sqlite3
from datetime import datetime, timedelta
# ๐ฏ Create our task management system!
conn = sqlite3.connect('tasks.db')
cursor = conn.cursor()
# ๐๏ธ DDL: Create all tables
cursor.executescript('''
-- Users table
CREATE TABLE IF NOT EXISTS users (
user_id INTEGER PRIMARY KEY AUTOINCREMENT,
username TEXT UNIQUE NOT NULL,
email TEXT UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Categories table
CREATE TABLE IF NOT EXISTS categories (
category_id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT UNIQUE NOT NULL,
color TEXT,
emoji TEXT
);
-- Tasks table
CREATE TABLE IF NOT EXISTS tasks (
task_id INTEGER PRIMARY KEY AUTOINCREMENT,
title TEXT NOT NULL,
description TEXT,
status TEXT DEFAULT 'pending',
priority TEXT DEFAULT 'medium',
due_date DATE,
category_id INTEGER,
assigned_to INTEGER,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
completed_at TIMESTAMP,
FOREIGN KEY (category_id) REFERENCES categories (category_id),
FOREIGN KEY (assigned_to) REFERENCES users (user_id),
CHECK (status IN ('pending', 'in_progress', 'completed')),
CHECK (priority IN ('low', 'medium', 'high'))
);
-- Task dependencies
CREATE TABLE IF NOT EXISTS task_dependencies (
dependent_task_id INTEGER,
prerequisite_task_id INTEGER,
PRIMARY KEY (dependent_task_id, prerequisite_task_id),
FOREIGN KEY (dependent_task_id) REFERENCES tasks (task_id),
FOREIGN KEY (prerequisite_task_id) REFERENCES tasks (task_id)
);
-- Create indexes for performance
CREATE INDEX idx_tasks_status ON tasks(status);
CREATE INDEX idx_tasks_due_date ON tasks(due_date);
CREATE INDEX idx_tasks_assigned ON tasks(assigned_to);
-- View for overdue tasks
CREATE VIEW IF NOT EXISTS overdue_tasks AS
SELECT t.*, u.username, c.name as category_name, c.emoji
FROM tasks t
LEFT JOIN users u ON t.assigned_to = u.user_id
LEFT JOIN categories c ON t.category_id = c.category_id
WHERE t.status != 'completed'
AND t.due_date < date('now')
ORDER BY t.due_date;
''')
# ๐ฆ DML: Add sample data
# Add users
users = [
('alice_dev', '[email protected]'),
('bob_manager', '[email protected]'),
('charlie_designer', '[email protected]')
]
cursor.executemany('INSERT INTO users (username, email) VALUES (?, ?)', users)
# Add categories
categories = [
('Development', '#4CAF50', '๐ป'),
('Design', '#2196F3', '๐จ'),
('Testing', '#FF9800', '๐งช'),
('Documentation', '#9C27B0', '๐')
]
cursor.executemany('''
INSERT INTO categories (name, color, emoji) VALUES (?, ?, ?)
''', categories)
# Helper functions
def add_task(title, description, category_name, assigned_username,
due_days=7, priority='medium'):
# Get IDs
cursor.execute('SELECT category_id FROM categories WHERE name = ?', (category_name,))
category_id = cursor.fetchone()[0]
cursor.execute('SELECT user_id FROM users WHERE username = ?', (assigned_username,))
user_id = cursor.fetchone()[0]
due_date = (datetime.now() + timedelta(days=due_days)).date()
cursor.execute('''
INSERT INTO tasks (title, description, category_id, assigned_to, due_date, priority)
VALUES (?, ?, ?, ?, ?, ?)
''', (title, description, category_id, user_id, due_date, priority))
return cursor.lastrowid
def complete_task(task_id):
cursor.execute('''
UPDATE tasks
SET status = 'completed', completed_at = CURRENT_TIMESTAMP
WHERE task_id = ?
''', (task_id,))
def get_user_stats(username):
cursor.execute('''
SELECT
COUNT(*) as total_tasks,
SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) as completed,
SUM(CASE WHEN status = 'in_progress' THEN 1 ELSE 0 END) as in_progress,
SUM(CASE WHEN status = 'pending' THEN 1 ELSE 0 END) as pending
FROM tasks t
JOIN users u ON t.assigned_to = u.user_id
WHERE u.username = ?
''', (username,))
return cursor.fetchone()
# ๐ฎ Add sample tasks
tasks = [
('Fix login bug', 'Users cannot login with special characters',
'Development', 'alice_dev', 3, 'high'),
('Design new dashboard', 'Create mockups for analytics dashboard',
'Design', 'charlie_designer', 7, 'medium'),
('Write API tests', 'Add unit tests for user endpoints',
'Testing', 'alice_dev', 5, 'medium'),
('Update user guide', 'Add new features to documentation',
'Documentation', 'bob_manager', 10, 'low'),
]
task_ids = []
for task in tasks:
task_ids.append(add_task(*task))
# Complete some tasks
complete_task(task_ids[2]) # Complete the testing task
# ๐ Display statistics
print("๐ Task Management Dashboard\n")
# User statistics
for user in ['alice_dev', 'bob_manager', 'charlie_designer']:
stats = get_user_stats(user)
total, completed, in_progress, pending = stats
completion_rate = (completed / total * 100) if total > 0 else 0
print(f"๐ค {user}:")
print(f" ๐ Total: {total} | โ
Done: {completed} | ๐ Active: {in_progress}")
print(f" ๐ Completion rate: {completion_rate:.0f}%\n")
# Overdue tasks
cursor.execute('SELECT title, username, emoji FROM overdue_tasks')
overdue = cursor.fetchall()
if overdue:
print("โ ๏ธ Overdue Tasks:")
for title, user, emoji in overdue:
print(f" {emoji} {title} (assigned to: {user})")
else:
print("โ
No overdue tasks! Great job! ๐")
conn.commit()
conn.close()
๐ Key Takeaways
Youโve learned so much! Hereโs what you can now do:
- โ Create tables with proper structure using DDL ๐๏ธ
- โ Manipulate data efficiently with DML commands ๐ฆ
- โ Write safe queries that prevent SQL injection ๐ก๏ธ
- โ Use transactions for data integrity ๐
- โ Build real applications with SQLite and Python! ๐
Remember: SQL is like a superpower for data management. Master these basics, and youโll be building amazing data-driven applications! ๐ฆธโโ๏ธ
๐ค Next Steps
Congratulations! ๐ Youโve mastered SQL basics with Python!
Hereโs what to do next:
- ๐ป Practice with the task management exercise
- ๐๏ธ Build a small project using SQLite
- ๐ Learn about database normalization
- ๐ Explore PostgreSQL or MySQL for production apps
Your journey into database programming has just begun. Keep practicing, keep building, and most importantly, have fun with data! ๐
Happy coding with SQL and Python! ๐๐๐