Unit Testing SQL Queries Across Multiple Database Platforms
A practical guide with SQL Testing Library and type-safe contracts using Pydantic models that validate types, constraints.
Join the DZone community and get the full member experience.
Join For FreeTesting SQL queries in production environments presents unique challenges that every data engineering team faces. When working with BigQuery, Snowflake, Redshift, Athena, or Trino, traditional testing approaches often fall short:
- Fragile integration tests that break when production data changes
- Slow feedback loops from running tests against full datasets
- Silent failures during database engine upgrades that change SQL semantics
- No type safety between SQL queries and Python code
- Database migration challenges where SQL syntax differs across platforms
- Complex setup requirements with different mocking strategies for each database
These challenges led to the development of SQL Testing Library - an open-source Python framework that enables fast, reliable unit testing of SQL queries with type-safe data contracts and mock data injection across BigQuery, Snowflake, Redshift, Athena, Trino, and DuckDB.
The Problem: Why SQL Testing Is Hard
Consider a typical data engineering scenario: You have ETL pipelines processing millions of records daily, complex analytical queries powering dashboards, and critical business logic implemented in SQL. How do you ensure these queries work correctly without:
- Waiting minutes for tests to run against production-sized datasets
- Dealing with flaky tests when upstream data changes
- Having no contract between your SQL and Python code
- Manually validating that SQL returns expected data types
- Rewriting tests when migrating between database engines
Traditional approaches like creating dedicated test databases or using database-specific mocking tools solve some problems but introduce new complexity and maintenance overhead.
The Solution: Type-Safe Mock Data Injection
SQL Testing Library takes a different approach: it automatically injects mock data directly into your SQL queries using Common Table Expressions (CTEs) or temporary tables, then validates results against Pydantic data contracts, allowing you to test query logic with controlled datasets in seconds, not minutes.
Here's a simple example:
from dataclasses import dataclass
from typing import Optional
from pydantic import BaseModel, Field
from sql_testing_library import sql_test, TestCase, BaseMockTable
@dataclass
class User:
user_id: int
name: str
email: str
is_active: bool
class UserResult(BaseModel):
"""
DATA CONTRACT: SQL must return exactly these types.
Fields are REQUIRED by default - use Optional for nullable columns.
"""
user_id: int # Required
name: str # Required
class UsersMockTable(BaseMockTable):
def get_database_name(self) -> str:
return "test_db"
def get_table_name(self) -> str:
return "users"
@sql_test(
adapter_type="duckdb",
mock_tables=[
UsersMockTable([
User(1, "Alice", "[email protected]", True),
User(2, "Bob", "[email protected]", False),
User(3, "Charlie", "[email protected]", True)
])
],
result_class=UserResult # Type-safe validation
)
def test_active_users_query():
return TestCase(
query="""
SELECT user_id, name
FROM users
WHERE is_active = TRUE
ORDER BY user_id
""",
default_namespace="test_db"
)
# Run the test - Pydantic validates results match UserResult contract
results = test_active_users_query()
# If SQL returned wrong types or missing fields, test would fail here ✅
assert len(results) == 2
assert results[0].name == "Alice"
assert results[1].name == "Charlie"egwg
What Makes This Powerful
- Mock data injected via CTE - no test database needed
- Pydantic validates results - if SQL returns wrong types, test fails immediately
- Type-safe contract -
UserResultdefines exactly what SQL must return - Executes in <1 second - fast feedback loop
Key Features That Make SQL Testing Practical
Type-Safe Result Validation with Pydantic
The library uses Pydantic models to validate SQL results, creating a data contract between your queries and code:
from decimal import Decimal
from pydantic import BaseModel, Field
from typing import Optional, Literal
class OrderSummaryResult(BaseModel):
"""
Data contract for order summary query.
If SQL returns data that doesn't match this schema,
validation fails immediately with a clear error message.
"""
customer_id: int # Required - must be present and int type
total_orders: int # Required
total_amount: Decimal = Field(ge=0) # Required, must be >= 0
avg_order_value: Decimal # Required
customer_tier: Literal["Diamond", "Platinum", "Gold", "Silver"] # Must be one of these
last_order_date: Optional[date] = None # Optional - can be NULL
# If SQL returns:
# - Wrong type (string instead of int)
# - Missing required field
# - Value violating constraint (negative total_amount)
# - Invalid enum value (tier = "Bronze")
# Test fails immediately with ValidationError ✅
Key Benefits
- Required by default: All fields must be present unless marked
Optional - Constraint validation:
Field(ge=0, le=100)enforces business rules - Type checking: Catches int/str/Decimal mismatches
- Enum validation:
Literalensures only valid values
2. Multi-Database Support with Consistent API
Write tests once, run them across different database platforms:
import pytest
@pytest.mark.parametrize("adapter_type", ["bigquery", "snowflake", "redshift", "duckdb"])
def test_customer_aggregation(adapter_type):
@sql_test(
adapter_type=adapter_type,
mock_tables=[CustomersMockTable(test_data)],
result_class=CustomerResult
)
def run_query():
return TestCase(
query="""
SELECT
customer_id,
SUM(order_total) as total_spent,
COUNT(*) as order_count
FROM orders
GROUP BY customer_id
HAVING total_spent > 1000
""",
default_namespace="analytics_db"
)
results = run_query()
# Verify consistent behavior across all databases ✅
Database-Compatible Types with Mocksmith
Mocksmith provides database-specific types that map correctly to SQL schemas; a better approach than generic Python types:
from mocksmith import mockable, Varchar, Integer, Date, Boolean
@mockable
@dataclass
class Product:
"""
Using database types instead of Python types.
Think of these as dbt types for Python test data.
"""
sku: Varchar(20) # Maps to VARCHAR(20)
name: Varchar(100) # Maps to VARCHAR(100)
quantity: Integer(ge=0, le=9999) # Maps to SMALLINT with constraints
price: Decimal # Maps to DECIMAL/NUMERIC
launch_date: Date() # Maps to DATE
is_active: Boolean() # Maps to BOOLEAN
# Generate test data that matches your SQL schema
products = [Product.mock() for _ in range(100)]
# Mocksmith respects constraints: quantity always 0-9999 ✅
Complex Data Type Support
Test queries with comprehensive data type coverage:
from typing import List, Dict, Optional
from datetime import date, datetime
@dataclass
class ProductMetrics:
product_id: int
name: str
price: Decimal # DECIMAL(10,2)
quantity: int
# Date/Time types
created_date: date
last_updated: datetime
# Complex types
tags: List[str] # ARRAY<STRING>
attributes: Dict[str, str] # MAP<STRING,STRING>
# Optional types
discount_percentage: Optional[float] # Can be NULL
Automatic Query Size Handling
The library automatically switches between CTE injection (fast) and physical temporary tables when queries exceed database-specific size limits:
@sql_test(
adapter_type="duckdb",
mock_tables=[
OrdersMockTable(large_dataset), # 10,000 rows
ProductsMockTable(product_catalog) # 5,000 rows
],
result_class=OrderAnalysis,
use_physical_tables=True, # Auto-detects or explicit
max_workers=4 # Parallel table creation
)
def test_large_order_analysis():
return TestCase(query=complex_analytical_query)
Example: Business Logic Validation
Let's look at a practical example: testing a customer segmentation query with business rules.
from mocksmith import mockable, Integer, Varchar, Date
from typing import Literal, Optional
import random
@mockable
@dataclass
class Customer:
"""Customer model with database types."""
customer_id: Integer(ge=1000, le=9999)
name: Varchar(100)
email: Varchar(255)
account_status: Literal["active", "suspended", "closed"]
registration_date: Date()
credit_score: Integer(ge=300, le=850) # FICO score range
@mockable
@dataclass
class Transaction:
"""Transaction model."""
transaction_id: Integer()
customer_id: Integer(ge=1000, le=9999)
amount: Decimal
transaction_date: Date()
class CustomerSegmentResult(BaseModel):
"""
Data contract for customer segmentation.
Defines exact schema that SQL must return.
"""
customer_id: int
customer_name: str
total_spent: Decimal = Field(ge=0) # Cannot be negative
transaction_count: int = Field(ge=0)
customer_tier: Literal["Diamond", "Platinum", "Gold", "Silver"]
days_active: int
# Generate controlled test data
customers = []
for i in range(50):
customers.append(Customer.mock(customer_id=1000 + i))
transactions = []
for customer in customers:
num_txns = random.randint(0, 15)
for j in range(num_txns):
transactions.append(Transaction.mock(
customer_id=customer.customer_id,
amount=Decimal(str(round(random.uniform(10, 1000), 2)))
))
@sql_test(
adapter_type="duckdb",
mock_tables=[CustomersMockTable(customers), TransactionsMockTable(transactions)],
result_class=CustomerSegmentResult
)
def test_customer_segmentation():
"""
Business Rules:
- Diamond: $10k+ spent
- Platinum: $5k-$10k
- Gold: $1k-$5k
- Silver: <$1k
"""
return TestCase(
query="""
SELECT
c.customer_id,
c.name as customer_name,
COALESCE(SUM(t.amount), 0) as total_spent,
COUNT(t.transaction_id) as transaction_count,
CASE
WHEN COALESCE(SUM(t.amount), 0) > 10000 THEN 'Diamond'
WHEN COALESCE(SUM(t.amount), 0) > 5000 THEN 'Platinum'
WHEN COALESCE(SUM(t.amount), 0) > 1000 THEN 'Gold'
ELSE 'Silver'
END as customer_tier,
DATEDIFF('day', CAST(c.registration_date AS DATE), CURRENT_DATE) as days_active
FROM customers c
LEFT JOIN transactions t ON c.customer_id = t.customer_id
WHERE c.account_status = 'active'
GROUP BY c.customer_id, c.name, c.registration_date
ORDER BY total_spent DESC
LIMIT 20
""",
default_namespace="test_db"
)
# Execute test
results = test_customer_segmentation()
# Pydantic automatically validated:
# ✅ All required fields present
# ✅ total_spent >= 0 (Field constraint)
# ✅ customer_tier is valid Literal value
# ✅ All types correct
# Validate business logic relationships
for result in results:
# Verify tiering logic (Pydantic can't check cross-field logic)
if result.total_spent > 10000:
assert result.customer_tier == "Diamond"
elif result.total_spent > 5000:
assert result.customer_tier == "Platinum"
elif result.total_spent > 1000:
assert result.customer_tier == "Gold"
else:
assert result.customer_tier == "Silver"
print("✅ All business rules validated!")
What Happened Here
- Mock data injected 50 customers + transactions injected as CTEs
- SQL executed against DuckDB with mock data
- Pydantic validated each result checked against
CustomerSegmentResultcontract - Business logic validated explicit assertions for tiering rules
- Fast execution runs entire test in <2 seconds
If SQL returned wrong data, Pydantic would catch it:
- Missing field:
ValidationError: Field required - Wrong type:
ValidationError: Input should be a valid integer - Negative amount:
ValidationError: Input should be greater than or equal to 0 - Invalid tier:
ValidationError: Input should be 'Diamond', 'Platinum', 'Gold' or 'Silver'
Real-World Use Cases
ETL Pipeline Testing
Test data transformations with controlled inputs and type-safe outputs.
class TransformedEvent(BaseModel):
"""Contract for transformed event data."""
event_id: int
user_id: int
user_tier: Literal["high_value", "medium_value", "low_value"]
lifetime_value: Decimal = Field(ge=0)
@sql_test(
adapter_type="duckdb",
mock_tables=[
RawEventsMockTable(raw_events),
UserDimensionMockTable(users)
],
result_class=TransformedEvent # Validates transformation output
)
def test_event_enrichment_pipeline():
return TestCase(
query="""
SELECT
e.event_id,
e.user_id,
CASE
WHEN u.lifetime_value > 1000 THEN 'high_value'
WHEN u.lifetime_value > 100 THEN 'medium_value'
ELSE 'low_value'
END as user_tier,
u.lifetime_value
FROM raw_events e
JOIN user_dimension u ON e.user_id = u.user_id
WHERE e.event_type = 'purchase'
""",
default_namespace="analytics"
)
# Pydantic validates:
# ✅ user_tier is valid Literal value
# ✅ lifetime_value >= 0
# ✅ All required fields present
Database Migration Testing
Verify that queries produce equivalent results across platforms.
@pytest.mark.parametrize("adapter", ["bigquery", "snowflake"])
def test_migration_equivalence(adapter):
"""Ensure query works identically on both platforms."""
@sql_test(
adapter_type=adapter,
mock_tables=[OrdersMockTable(test_data)],
result_class=AggregationResult
)
def run_query():
return TestCase(
query="SELECT customer_id, ARRAY_AGG(order_id) as orders FROM orders GROUP BY customer_id"
)
results = run_query()
# Same data contract validates both platforms ✅
The Power of Dynamic Mock Data with Mocksmith
While SQL Testing Library focuses on SQL execution and validation, Mocksmith provides database-compatible test data that changes each run, uncovering edge cases static fixtures miss.
Static vs Dynamic Test Data
Traditional static approach:
# Static data - same values every test run
orders = [
Order(1, Decimal("100.00"), "completed", date(2024, 1, 1), 101),
Order(2, Decimal("200.00"), "completed", date(2024, 1, 2), 102),
Order(3, Decimal("50.00"), "pending", date(2024, 1, 3), 103),
]
Problems with This Approach
- Tedious to create
- Always tests same scenarios
- Misses edge cases
- Not realistic
Mocksmith dynamic approach:
from mocksmith import mockable, Integer, Date
from typing import Literal
@mockable
@dataclass
class Order:
order_id: Integer()
amount: Decimal # Set manually to include refunds
status: Literal["pending", "completed", "cancelled", "refunded"]
order_date: Date()
customer_id: Integer(ge=100, le=9999)
# Generate 100 orders with varying amounts
orders = []
for i in range(100):
# 10% chance of refund (negative)
amount = Decimal(str(round(random.uniform(-100, 1000), 2)))
orders.append(Order.mock(
order_id=i + 1,
amount=amount
# Mocksmith auto-generates: status, order_date, customer_id with constraints
))
# Each test run gets different data:
# Run 1: Order(1, Decimal("457.23"), "completed", date(2023, 5, 12), 5678)
# Run 2: Order(2, Decimal("-23.91"), "refunded", date(2024, 2, 28), 1234)
# Run 3: Order(3, Decimal("0.01"), "pending", date(2023, 12, 31), 9999)
Benefits
- Different data each run finds edge cases automatically
- Realistic dates, values, and distributions
- Constraints enforced (
customer_idalways 100-9999) - Minimal code (Mocksmith handles most fields)
Real Bug Found by Random Data
# Bug in SQL query (looks fine with static positive data)
query = """
SELECT
category,
SUM(amount) / COUNT(*) as avg_transaction
FROM transactions
GROUP BY category
"""
# Static test data (all positive) - query seems fine ✅
# Mocksmith generates refunds (negative amounts) - reveals bug! ❌
# Fix: Use AVG(amount) or handle refunds separately
What Mocksmith's Random Data Uncovered
- Division by zero with certain distributions
- Negative amount handling bugs
- Date boundary issues
- Null handling problems
Getting Started
Installation
# Install with DuckDB for local testing (no setup required!)
pip install sql-testing-library[duckdb]
# Install Mocksmith for database-compatible test data
pip install mocksmith
# Or install with your production database
pip install sql-testing-library[bigquery]
pip install sql-testing-library[snowflake]
pip install sql-testing-library[all] # All adapters
Configuration
Create pytest.ini:
[sql_testing]
adapter = duckdb # Fast local testing
# DuckDB requires no configuration - uses in-memory database
# Optional: Configure production database for integration tests
[sql_testing.bigquery]
project_id = your-project
dataset_id = test_dataset
credentials_path = /path/to/credentials.json
Your First Test
from pydantic import BaseModel, Field
from decimal import Decimal
from typing import Optional
class OrderTotal(BaseModel):
"""Data contract - SQL must return this exact schema."""
customer_id: int # Required
total: Decimal = Field(ge=0) # Required, non-negative
order_count: int # Required
last_order_date: Optional[date] = None # Optional - can be NULL
@sql_test(
adapter_type="duckdb",
mock_tables=[OrdersMockTable(test_orders)],
result_class=OrderTotal # Contract enforcement
)
def test_order_totals():
return TestCase(
query="""
SELECT
customer_id,
SUM(amount) as total,
COUNT(*) as order_count,
MAX(order_date) as last_order_date
FROM orders
WHERE status = 'completed'
GROUP BY customer_id
""",
default_namespace="test_db"
)
# Run test
results = test_order_totals()
# Pydantic already validated types and constraints
# Just validate business logic relationships:
for result in results:
assert result.total == result.order_count * result.avg # If you had avg field
Now, run with:
pytest test_orders.py -v
What Value Does SQL Testing Library Provide?
Data Contracts for SQL Queries
Pydantic models define exact schemas that SQL must return — types, constraints, required vs optional fields. No more guessing.
Bugs Caught at Deserialization Time
Type mismatches, constraint violations, and missing fields are caught immediately when SQL results → Python objects. Not in production.
Database-Compatible Mock Data with Constraints
Mocksmith provides SQL-compatible types (Varchar, Integer, Date) that map to database schemas. Constraints like Integer(ge=0, le=255) are automatically respected during data generation, ensuring test data matches your database schema rules.
@mockable
@dataclass
class Product:
sku: Varchar(20) # Maps to VARCHAR(20)
quantity: Integer(ge=0, le=9999) # Auto-generates 0-9999, validates constraints
credit_score: Integer(ge=300, le=850) # Auto-generates realistic FICO scores
# Mocksmith respects constraints automatically ✅
products = [Product.mock() for _ in range(100)]
# All quantity values guaranteed to be 0-9999
# All credit_score values guaranteed to be 300-850
Self-Documenting Code
Result models serve as documentation.
class Report(BaseModel):
"""Clear documentation of what SQL returns."""
revenue: Decimal = Field(ge=0, description="Total revenue in USD")
growth_rate: Decimal = Field(description="YoY growth percentage")
region: Literal["NA", "EU", "APAC"] # Valid regions
last_updated: Optional[datetime] = None # Can be NULL
Refactoring Confidence
Change SQL queries safely. So if you break the contract — rename field, change type, remove required field — tests fail immediately.
Fast Local Testing
Test with DuckDB locally in seconds; no cloud setup, no waiting, no costs. Deploy to production databases when ready.
Multi-Database Portability
Write tests once with data contracts, run on any database. Same Pydantic validation works across BigQuery, Snowflake, Redshift, Athena, Trino, DuckDB.
Conclusion
SQL Testing Library transforms SQL testing from hoping queries work to enforcing type-safe contracts. By combining mock data injection with Pydantic validation and Mocksmith's database types, you get:
- Data contracts that SQL results must match exactly
- Automatic validation catches bugs at deserialization time
- Fast feedback with local testing in seconds
- Database portability across 6+ platforms
- Realistic test data from Mocksmith with constraint enforcement
Stop hoping your SQL is correct. Enforce it with type-safe contracts.
Start with DuckDB (no setup required), test locally, ship confidently.
Opinions expressed by DZone contributors are their own.
Comments