Unit Testing SQL Queries Across Multiple Database Platforms

A practical guide with SQL Testing Library and type-safe contracts using Pydantic models that validate types, constraints.

Gurmeet Saran

Updated by

Kushal Thakkar

Jan. 14, 26 · Tutorial

Likes (3)

Comment

Save

8.8K Views

Testing SQL queries in production environments presents unique challenges that every data engineering team faces. When working with BigQuery, Snowflake, Redshift, Athena, or Trino, traditional testing approaches often fall short:

Fragile integration tests that break when production data changes
Slow feedback loops from running tests against full datasets
Silent failures during database engine upgrades that change SQL semantics
No type safety between SQL queries and Python code
Database migration challenges where SQL syntax differs across platforms
Complex setup requirements with different mocking strategies for each database

These challenges led to the development of SQL Testing Library - an open-source Python framework that enables fast, reliable unit testing of SQL queries with type-safe data contracts and mock data injection across BigQuery, Snowflake, Redshift, Athena, Trino, and DuckDB.

The Problem: Why SQL Testing Is Hard

Consider a typical data engineering scenario: You have ETL pipelines processing millions of records daily, complex analytical queries powering dashboards, and critical business logic implemented in SQL. How do you ensure these queries work correctly without:

Waiting minutes for tests to run against production-sized datasets
Dealing with flaky tests when upstream data changes
Having no contract between your SQL and Python code
Manually validating that SQL returns expected data types
Rewriting tests when migrating between database engines

Traditional approaches like creating dedicated test databases or using database-specific mocking tools solve some problems but introduce new complexity and maintenance overhead.

The Solution: Type-Safe Mock Data Injection

SQL Testing Library takes a different approach: it automatically injects mock data directly into your SQL queries using Common Table Expressions (CTEs) or temporary tables, then validates results against Pydantic data contracts, allowing you to test query logic with controlled datasets in seconds, not minutes.

Here's a simple example:

SQL

from dataclasses import dataclass
from typing import Optional
from pydantic import BaseModel, Field
from sql_testing_library import sql_test, TestCase, BaseMockTable

@dataclass
class User:
    user_id: int
    name: str
    email: str
    is_active: bool

class UserResult(BaseModel):
    """
    DATA CONTRACT: SQL must return exactly these types.

    Fields are REQUIRED by default - use Optional for nullable columns.
    """
    user_id: int  # Required
    name: str  # Required

class UsersMockTable(BaseMockTable):
    def get_database_name(self) -> str:
        return "test_db"

    def get_table_name(self) -> str:
        return "users"

@sql_test(
    adapter_type="duckdb",
    mock_tables=[
        UsersMockTable([
            User(1, "Alice", "[email protected]", True),
            User(2, "Bob", "[email protected]", False),
            User(3, "Charlie", "[email protected]", True)
        ])
    ],
    result_class=UserResult  # Type-safe validation
)
def test_active_users_query():
    return TestCase(
        query="""
            SELECT user_id, name
            FROM users
            WHERE is_active = TRUE
            ORDER BY user_id
        """,
        default_namespace="test_db"
    )

# Run the test - Pydantic validates results match UserResult contract
results = test_active_users_query()

# If SQL returned wrong types or missing fields, test would fail here ✅
assert len(results) == 2
assert results[0].name == "Alice"
assert results[1].name == "Charlie"egwg

What Makes This Powerful

Mock data injected via CTE - no test database needed
Pydantic validates results - if SQL returns wrong types, test fails immediately
Type-safe contract - UserResult defines exactly what SQL must return
Executes in <1 second - fast feedback loop

Key Features That Make SQL Testing Practical

Type-Safe Result Validation with Pydantic

The library uses Pydantic models to validate SQL results, creating a data contract between your queries and code:

    SQL
   
 

   from decimal import Decimal
from pydantic import BaseModel, Field
from typing import Optional, Literal

class OrderSummaryResult(BaseModel):
    """
    Data contract for order summary query.

    If SQL returns data that doesn't match this schema,
    validation fails immediately with a clear error message.
    """
    customer_id: int  # Required - must be present and int type
    total_orders: int  # Required
    total_amount: Decimal = Field(ge=0)  # Required, must be >= 0
    avg_order_value: Decimal  # Required
    customer_tier: Literal["Diamond", "Platinum", "Gold", "Silver"]  # Must be one of these
    last_order_date: Optional[date] = None  # Optional - can be NULL

# If SQL returns:
# - Wrong type (string instead of int)
# - Missing required field
# - Value violating constraint (negative total_amount)
# - Invalid enum value (tier = "Bronze")
# Test fails immediately with ValidationError ✅
  

Key Benefits

Required by default: All fields must be present unless marked Optional
Constraint validation: Field(ge=0, le=100) enforces business rules
Type checking: Catches int/str/Decimal mismatches
Enum validation:Literal ensures only valid values

2. Multi-Database Support with Consistent API

Write tests once, run them across different database platforms:

    SQL
   
 

   import pytest

@pytest.mark.parametrize("adapter_type", ["bigquery", "snowflake", "redshift", "duckdb"])
def test_customer_aggregation(adapter_type):
    @sql_test(
        adapter_type=adapter_type,
        mock_tables=[CustomersMockTable(test_data)],
        result_class=CustomerResult
    )
    def run_query():
        return TestCase(
            query="""
                SELECT
                    customer_id,
                    SUM(order_total) as total_spent,
                    COUNT(*) as order_count
                FROM orders
                GROUP BY customer_id
                HAVING total_spent > 1000
            """,
            default_namespace="analytics_db"
        )

    results = run_query()
    # Verify consistent behavior across all databases ✅
  

Database-Compatible Types with Mocksmith

Mocksmith provides database-specific types that map correctly to SQL schemas; a better approach than generic Python types:

    SQL
   
 

   from mocksmith import mockable, Varchar, Integer, Date, Boolean

@mockable
@dataclass
class Product:
    """
    Using database types instead of Python types.
    Think of these as dbt types for Python test data.
    """
    sku: Varchar(20)  # Maps to VARCHAR(20)
    name: Varchar(100)  # Maps to VARCHAR(100)
    quantity: Integer(ge=0, le=9999)  # Maps to SMALLINT with constraints
    price: Decimal  # Maps to DECIMAL/NUMERIC
    launch_date: Date()  # Maps to DATE
    is_active: Boolean()  # Maps to BOOLEAN

# Generate test data that matches your SQL schema
products = [Product.mock() for _ in range(100)]
# Mocksmith respects constraints: quantity always 0-9999 ✅
  

Complex Data Type Support

Test queries with comprehensive data type coverage:

    SQL
   
 

   from typing import List, Dict, Optional
from datetime import date, datetime

@dataclass
class ProductMetrics:
    product_id: int
    name: str
    price: Decimal  # DECIMAL(10,2)
    quantity: int

    # Date/Time types
    created_date: date
    last_updated: datetime

    # Complex types
    tags: List[str]  # ARRAY<STRING>
    attributes: Dict[str, str]  # MAP<STRING,STRING>

    # Optional types
    discount_percentage: Optional[float]  # Can be NULL
  

Automatic Query Size Handling

The library automatically switches between CTE injection (fast) and physical temporary tables when queries exceed database-specific size limits:

    SQL
   
 

   @sql_test(
    adapter_type="duckdb",
    mock_tables=[
        OrdersMockTable(large_dataset),  # 10,000 rows
        ProductsMockTable(product_catalog)  # 5,000 rows
    ],
    result_class=OrderAnalysis,
    use_physical_tables=True,  # Auto-detects or explicit
    max_workers=4  # Parallel table creation
)
def test_large_order_analysis():
    return TestCase(query=complex_analytical_query)
  

Example: Business Logic Validation

Let's look at a practical example: testing a customer segmentation query with business rules.

    SQL
   
 

   from mocksmith import mockable, Integer, Varchar, Date
from typing import Literal, Optional
import random

@mockable
@dataclass
class Customer:
    """Customer model with database types."""
    customer_id: Integer(ge=1000, le=9999)
    name: Varchar(100)
    email: Varchar(255)
    account_status: Literal["active", "suspended", "closed"]
    registration_date: Date()
    credit_score: Integer(ge=300, le=850)  # FICO score range

@mockable
@dataclass
class Transaction:
    """Transaction model."""
    transaction_id: Integer()
    customer_id: Integer(ge=1000, le=9999)
    amount: Decimal
    transaction_date: Date()

class CustomerSegmentResult(BaseModel):
    """
    Data contract for customer segmentation.

    Defines exact schema that SQL must return.
    """
    customer_id: int
    customer_name: str
    total_spent: Decimal = Field(ge=0)  # Cannot be negative
    transaction_count: int = Field(ge=0)
    customer_tier: Literal["Diamond", "Platinum", "Gold", "Silver"]
    days_active: int

# Generate controlled test data
customers = []
for i in range(50):
    customers.append(Customer.mock(customer_id=1000 + i))

transactions = []
for customer in customers:
    num_txns = random.randint(0, 15)
    for j in range(num_txns):
        transactions.append(Transaction.mock(
            customer_id=customer.customer_id,
            amount=Decimal(str(round(random.uniform(10, 1000), 2)))
        ))

@sql_test(
    adapter_type="duckdb",
    mock_tables=[CustomersMockTable(customers), TransactionsMockTable(transactions)],
    result_class=CustomerSegmentResult
)
def test_customer_segmentation():
    """
    Business Rules:
    - Diamond: $10k+ spent
    - Platinum: $5k-$10k
    - Gold: $1k-$5k
    - Silver: <$1k
    """
    return TestCase(
        query="""
            SELECT
                c.customer_id,
                c.name as customer_name,
                COALESCE(SUM(t.amount), 0) as total_spent,
                COUNT(t.transaction_id) as transaction_count,
                CASE
                    WHEN COALESCE(SUM(t.amount), 0) > 10000 THEN 'Diamond'
                    WHEN COALESCE(SUM(t.amount), 0) > 5000 THEN 'Platinum'
                    WHEN COALESCE(SUM(t.amount), 0) > 1000 THEN 'Gold'
                    ELSE 'Silver'
                END as customer_tier,
                DATEDIFF('day', CAST(c.registration_date AS DATE), CURRENT_DATE) as days_active
            FROM customers c
            LEFT JOIN transactions t ON c.customer_id = t.customer_id
            WHERE c.account_status = 'active'
            GROUP BY c.customer_id, c.name, c.registration_date
            ORDER BY total_spent DESC
            LIMIT 20
        """,
        default_namespace="test_db"
    )

# Execute test
results = test_customer_segmentation()

# Pydantic automatically validated:
# ✅ All required fields present
# ✅ total_spent >= 0 (Field constraint)
# ✅ customer_tier is valid Literal value
# ✅ All types correct

# Validate business logic relationships
for result in results:
    # Verify tiering logic (Pydantic can't check cross-field logic)
    if result.total_spent > 10000:
        assert result.customer_tier == "Diamond"
    elif result.total_spent > 5000:
        assert result.customer_tier == "Platinum"
    elif result.total_spent > 1000:
        assert result.customer_tier == "Gold"
    else:
        assert result.customer_tier == "Silver"

print("✅ All business rules validated!")
  

What Happened Here

Mock data injected 50 customers + transactions injected as CTEs
SQL executed against DuckDB with mock data
Pydantic validated each result checked against CustomerSegmentResult contract
Business logic validated explicit assertions for tiering rules
Fast execution runs entire test in <2 seconds

If SQL returned wrong data, Pydantic would catch it:

Missing field:ValidationError: Field required
Wrong type:ValidationError: Input should be a valid integer
Negative amount:ValidationError: Input should be greater than or equal to 0
Invalid tier:ValidationError: Input should be 'Diamond', 'Platinum', 'Gold' or 'Silver'

Real-World Use Cases

ETL Pipeline Testing

Test data transformations with controlled inputs and type-safe outputs.

    SQL
   
 

   class TransformedEvent(BaseModel):
    """Contract for transformed event data."""
    event_id: int
    user_id: int
    user_tier: Literal["high_value", "medium_value", "low_value"]
    lifetime_value: Decimal = Field(ge=0)

@sql_test(
    adapter_type="duckdb",
    mock_tables=[
        RawEventsMockTable(raw_events),
        UserDimensionMockTable(users)
    ],
    result_class=TransformedEvent  # Validates transformation output
)
def test_event_enrichment_pipeline():
    return TestCase(
        query="""
            SELECT
                e.event_id,
                e.user_id,
                CASE
                    WHEN u.lifetime_value > 1000 THEN 'high_value'
                    WHEN u.lifetime_value > 100 THEN 'medium_value'
                    ELSE 'low_value'
                END as user_tier,
                u.lifetime_value
            FROM raw_events e
            JOIN user_dimension u ON e.user_id = u.user_id
            WHERE e.event_type = 'purchase'
        """,
        default_namespace="analytics"
    )

# Pydantic validates:
# ✅ user_tier is valid Literal value
# ✅ lifetime_value >= 0
# ✅ All required fields present
  

Database Migration Testing

Verify that queries produce equivalent results across platforms.

    SQL
   
 

   @pytest.mark.parametrize("adapter", ["bigquery", "snowflake"])
def test_migration_equivalence(adapter):
    """Ensure query works identically on both platforms."""

    @sql_test(
        adapter_type=adapter,
        mock_tables=[OrdersMockTable(test_data)],
        result_class=AggregationResult
    )
    def run_query():
        return TestCase(
            query="SELECT customer_id, ARRAY_AGG(order_id) as orders FROM orders GROUP BY customer_id"
        )

    results = run_query()
    # Same data contract validates both platforms ✅
  

The Power of Dynamic Mock Data with Mocksmith

While SQL Testing Library focuses on SQL execution and validation, Mocksmith provides database-compatible test data that changes each run, uncovering edge cases static fixtures miss.

Static vs Dynamic Test Data

Traditional static approach:

    SQL
   
 

   # Static data - same values every test run
orders = [
    Order(1, Decimal("100.00"), "completed", date(2024, 1, 1), 101),
    Order(2, Decimal("200.00"), "completed", date(2024, 1, 2), 102),
    Order(3, Decimal("50.00"), "pending", date(2024, 1, 3), 103),
]
  

Problems with This Approach

Tedious to create
Always tests same scenarios
Misses edge cases
Not realistic

Mocksmith dynamic approach:

    SQL
   
 

   from mocksmith import mockable, Integer, Date
from typing import Literal

@mockable
@dataclass
class Order:
    order_id: Integer()
    amount: Decimal  # Set manually to include refunds
    status: Literal["pending", "completed", "cancelled", "refunded"]
    order_date: Date()
    customer_id: Integer(ge=100, le=9999)

# Generate 100 orders with varying amounts
orders = []
for i in range(100):
    # 10% chance of refund (negative)
    amount = Decimal(str(round(random.uniform(-100, 1000), 2)))
    orders.append(Order.mock(
        order_id=i + 1,
        amount=amount
        # Mocksmith auto-generates: status, order_date, customer_id with constraints
    ))

# Each test run gets different data:
# Run 1: Order(1, Decimal("457.23"), "completed", date(2023, 5, 12), 5678)
# Run 2: Order(2, Decimal("-23.91"), "refunded", date(2024, 2, 28), 1234)
# Run 3: Order(3, Decimal("0.01"), "pending", date(2023, 12, 31), 9999)
  

Benefits

Different data each run finds edge cases automatically
Realistic dates, values, and distributions
Constraints enforced (customer_id always 100-9999)
Minimal code (Mocksmith handles most fields)

Real Bug Found by Random Data

    SQL
   
 

   # Bug in SQL query (looks fine with static positive data)
query = """
    SELECT
        category,
        SUM(amount) / COUNT(*) as avg_transaction
    FROM transactions
    GROUP BY category
"""

# Static test data (all positive) - query seems fine ✅
# Mocksmith generates refunds (negative amounts) - reveals bug! ❌
# Fix: Use AVG(amount) or handle refunds separately
  

What Mocksmith's Random Data Uncovered

Division by zero with certain distributions
Negative amount handling bugs
Date boundary issues
Null handling problems

Getting Started

Installation

    Plain Text
   
   # Install with DuckDB for local testing (no setup required!)
pip install sql-testing-library[duckdb]

# Install Mocksmith for database-compatible test data
pip install mocksmith

# Or install with your production database
pip install sql-testing-library[bigquery]
pip install sql-testing-library[snowflake]
pip install sql-testing-library[all]  # All adapters

Configuration

Create pytest.ini:

    Plain Text
   
 

   [sql_testing]
adapter = duckdb  # Fast local testing

# DuckDB requires no configuration - uses in-memory database

# Optional: Configure production database for integration tests
[sql_testing.bigquery]
project_id = your-project
dataset_id = test_dataset
credentials_path = /path/to/credentials.json
  

Your First Test

    SQL
   
 

   from pydantic import BaseModel, Field
from decimal import Decimal
from typing import Optional

class OrderTotal(BaseModel):
    """Data contract - SQL must return this exact schema."""
    customer_id: int  # Required
    total: Decimal = Field(ge=0)  # Required, non-negative
    order_count: int  # Required
    last_order_date: Optional[date] = None  # Optional - can be NULL

@sql_test(
    adapter_type="duckdb",
    mock_tables=[OrdersMockTable(test_orders)],
    result_class=OrderTotal  # Contract enforcement
)
def test_order_totals():
    return TestCase(
        query="""
            SELECT
                customer_id,
                SUM(amount) as total,
                COUNT(*) as order_count,
                MAX(order_date) as last_order_date
            FROM orders
            WHERE status = 'completed'
            GROUP BY customer_id
        """,
        default_namespace="test_db"
    )

# Run test
results = test_order_totals()

# Pydantic already validated types and constraints
# Just validate business logic relationships:
for result in results:
    assert result.total == result.order_count * result.avg  # If you had avg field
  

Now, run with:

pytest test_orders.py -v

What Value Does SQL Testing Library Provide?

Data Contracts for SQL Queries

Pydantic models define exact schemas that SQL must return — types, constraints, required vs optional fields. No more guessing.

Bugs Caught at Deserialization Time

Type mismatches, constraint violations, and missing fields are caught immediately when SQL results → Python objects. Not in production.

Database-Compatible Mock Data with Constraints

Mocksmith provides SQL-compatible types (Varchar, Integer, Date) that map to database schemas. Constraints like Integer(ge=0, le=255) are automatically respected during data generation, ensuring test data matches your database schema rules.

    Plain Text
   
 

   @mockable
@dataclass
class Product:
    sku: Varchar(20)  # Maps to VARCHAR(20)
    quantity: Integer(ge=0, le=9999)  # Auto-generates 0-9999, validates constraints
    credit_score: Integer(ge=300, le=850)  # Auto-generates realistic FICO scores

# Mocksmith respects constraints automatically ✅
products = [Product.mock() for _ in range(100)]
# All quantity values guaranteed to be 0-9999
# All credit_score values guaranteed to be 300-850
  

Self-Documenting Code

Result models serve as documentation.

    Plain Text
   
 

   class Report(BaseModel):
    """Clear documentation of what SQL returns."""
    revenue: Decimal = Field(ge=0, description="Total revenue in USD")
    growth_rate: Decimal = Field(description="YoY growth percentage")
    region: Literal["NA", "EU", "APAC"]  # Valid regions
    last_updated: Optional[datetime] = None  # Can be NULL
  

Refactoring Confidence

Change SQL queries safely. So if you break the contract — rename field, change type, remove required field — tests fail immediately.

Fast Local Testing

Test with DuckDB locally in seconds; no cloud setup, no waiting, no costs. Deploy to production databases when ready.

Multi-Database Portability

Write tests once with data contracts, run on any database. Same Pydantic validation works across BigQuery, Snowflake, Redshift, Athena, Trino, DuckDB.

Conclusion

SQL Testing Library transforms SQL testing from hoping queries work to enforcing type-safe contracts. By combining mock data injection with Pydantic validation and Mocksmith's database types, you get:

Data contracts that SQL results must match exactly
Automatic validation catches bugs at deserialization time
Fast feedback with local testing in seconds
Database portability across 6+ platforms
Realistic test data from Mocksmith with constraint enforcement

Stop hoping your SQL is correct. Enforce it with type-safe contracts.

Start with DuckDB (no setup required), test locally, ship confidently.

Database Test data sql

Opinions expressed by DZone contributors are their own.

Related

Trending