The Serverless Ceiling: Designing Write-Heavy Backends With Aurora Limitless

Break the single-writer bottleneck by aligning AWS Lambda, RDS Proxy, and the Aurora Limitless router into a cohesive architecture.

Nabin Debnath

Jan. 28, 26 · Analysis

For years, serverless architectures have solved one half of the scalability problem.

Compute is no longer the bottleneck. Platforms like AWS Lambda can absorb sudden traffic spikes without advance provisioning. But the moment the compute layer needs to persist data in a relational database, the model starts to strain. Thousands of concurrent functions quickly converge on a single write endpoint, and what looked like elastic scale turns into contention.

This gap has traditionally forced difficult trade-offs. Teams either moved to key-value stores and redesigned their access patterns, or they implemented application-level sharding — pushing database routing into business code and accepting operational complexity.

Amazon Aurora Limitless Database introduces a third option: managed horizontal sharding for PostgreSQL. It removes the need for application-managed shards while preserving SQL semantics. But it does not remove the need for architectural discipline.

Aurora Limitless behaves very differently from a single-node database. Treating it as “just PostgreSQL, but bigger” leads to higher latency, higher cost, and harder debugging. This article explains how to design for those differences using a pattern that works well with serverless compute:

Lambda → RDS Proxy → Aurora Limitless

Understanding the New Topology

In a standard Aurora cluster, applications connect to a primary writer instance. Aurora Limitless removes that concept. Instead, applications connect to a Transaction Router. The router is a stateless fleet that parses incoming SQL and determines where it should execute. Every query takes one of two paths.

Fast Path: Single-Shard Execution

If the query includes a shard key predicate, for example:

    SQL

    WHERE account_id = '123'

the router can deterministically route the request to a single shard. Execution is local and predictable, and capacity scales linearly as shards are added.

Slow Path: Scatter-Gather

If the query does not include the shard key, the router must broadcast it to all shards, wait for responses, merge results, and return them to the client. The architectural objective with Aurora Limitless is straightforward: design schemas and queries so that most requests take the fast path.
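The two paths can be illustrated with a toy model. This is a minimal sketch, not the router's actual algorithm: the shard count, the SHA-256 placement function, and the names `shardFor` and `planQuery` are all assumptions made for illustration.

```javascript
const crypto = require('crypto');

const SHARD_COUNT = 4; // hypothetical number of shards

// Hypothetical placement function: hash the shard key value to a shard index.
function shardFor(shardKeyValue) {
  const digest = crypto.createHash('sha256').update(String(shardKeyValue)).digest();
  return digest.readUInt32BE(0) % SHARD_COUNT;
}

// Returns which shards a query must touch, given its equality predicates.
function planQuery(predicates, shardKey = 'account_id') {
  if (Object.prototype.hasOwnProperty.call(predicates, shardKey)) {
    // Fast path: the shard key pins the query to exactly one shard.
    return { path: 'single-shard', shards: [shardFor(predicates[shardKey])] };
  }
  // Slow path: without the shard key, every shard has to be asked.
  const allShards = Array.from({ length: SHARD_COUNT }, (_, i) => i);
  return { path: 'scatter-gather', shards: allShards };
}

console.log(planQuery({ account_id: '123' }).path);  // single-shard
console.log(planQuery({ status: 'FAILED' }).shards); // [ 0, 1, 2, 3 ]
```

The point of the model is the size of the `shards` array: one entry on the fast path, one entry per shard on the slow path.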

Why RDS Proxy Is Not Optional

Serverless compute introduces bursty connection behavior. A sudden traffic surge can create thousands of concurrent Lambda invocations in seconds. Without a connection governor, those invocations attempt to establish thousands of TLS connections directly to the Transaction Router.

This is where systems fail — not because queries are slow, but because connection management overwhelms the router.

RDS Proxy addresses this by multiplexing many logical client connections onto a smaller pool of persistent backend connections. Twenty thousand Lambda invocations can be reduced to dozens of active database connections.

Without RDS Proxy, Aurora Limitless becomes vulnerable to connection storms. With it, the router can focus on query routing rather than socket management.
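The effect of multiplexing can be sketched with a small simulation. It is a deliberately simplified model, not a description of RDS Proxy internals: it assumes every client runs one short transaction that occupies a backend connection for one time step, and the function name `simulateMultiplexing` is hypothetical.

```javascript
// Toy model: client transactions queue for a fixed pool of backend connections.
// Each transaction holds a backend connection for exactly one tick.
function simulateMultiplexing(clientTransactions, backendPoolSize) {
  if (backendPoolSize <= 0) {
    throw new RangeError('backendPoolSize must be positive');
  }
  let waiting = clientTransactions;
  let ticks = 0;
  let peakBackend = 0;
  while (waiting > 0) {
    const running = Math.min(waiting, backendPoolSize);
    peakBackend = Math.max(peakBackend, running);
    waiting -= running;
    ticks += 1;
  }
  return { ticks, peakBackend };
}

console.log(simulateMultiplexing(20000, 50)); // { ticks: 400, peakBackend: 50 }
```

In this model the database never sees more than 50 connections, and the burst is absorbed as queueing time instead. That trade only works when transactions are short.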

The Pinning Trap

RDS Proxy relies on connection reuse. That reuse breaks if the application modifies session-level state.

For example:

    JavaScript

    // Avoid this inside request handlers
    await client.query("SET search_path TO my_schema");
    await client.query("SET timezone TO 'UTC'");

When session state changes, RDS Proxy must pin that client to a dedicated backend connection. Pin enough clients, and multiplexing disappears. At scale, this results in connection exhaustion and instability.

Rule: All session configuration must be defined in RDS Proxy initialization queries or database parameter groups. Never issue SET commands inside Lambda handlers.
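One way to enforce this rule in code is a thin guard around the database client. The sketch below is a heuristic, assuming queries are passed as plain SQL strings. The helpers `assertNoSessionState` and `guardClient` are hypothetical names, and the check only catches statements that begin with SET, not every cause of pinning.

```javascript
// Matches statements that start with SET (session configuration).
// "UPDATE t SET ..." is not matched because the pattern is anchored at the start.
const SESSION_SET = /^\s*SET\s+/i;

function assertNoSessionState(sql) {
  if (SESSION_SET.test(sql)) {
    throw new Error(`Session-level SET is not allowed in handlers: ${sql.trim()}`);
  }
  return sql;
}

// Wraps any client exposing query(sql, params), such as a pg client,
// and rejects session-level SET statements before they reach the proxy.
function guardClient(client) {
  return {
    query: (sql, params) => client.query(assertNoSessionState(sql), params),
  };
}
```

In a handler, `guardClient(client).query(...)` would take the place of `client.query(...)`. Failing fast in tests is cheaper than discovering pinned connections under production load.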

Schema Design for Shard Locality

Aurora Limitless introduces explicit table modes. Choosing the right one determines whether queries stay local or fan out.

Sharded Tables

High-volume tables, such as transactions, events, or logs, should be sharded. The shard key must be part of the primary key.

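    SQL

    -- Session settings that control how the next CREATE TABLE is distributed
    SET rds_aurora.limitless_create_table_mode = 'sharded';
    SET rds_aurora.limitless_create_table_shard_key = '{"account_id"}';

    CREATE TABLE transactions (
        transaction_id BIGSERIAL,
        account_id UUID NOT NULL,            -- shard key
        amount DECIMAL(19,4),
        currency_code CHAR(3) NOT NULL,      -- used by the Lambda example; type is an assumption
        created_at TIMESTAMP DEFAULT NOW(),
        PRIMARY KEY (account_id, transaction_id)  -- shard key is part of the primary key
    );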

This guarantees that all rows for a given account reside on the same shard.

Reference Tables

Small, relatively static datasets such as currency codes or country lists should be defined as reference tables. These are replicated to every shard, allowing joins to remain local. Without reference tables, even simple joins introduce cross-shard network traffic.
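As an illustration, a migration script could generate the statements for a reference table as shown below. This is a sketch under assumptions: the helper `referenceTableStatements` is hypothetical, and the `currencies` column types are guesses based on the columns the Lambda example reads. Because it issues SET, it belongs in a migration run over a direct connection, not in a Lambda handler.

```javascript
// Builds the statements needed to create a reference table.
// The table name is validated because it is interpolated into SQL.
function referenceTableStatements(tableName, columnDefs) {
  if (!/^[a-z_][a-z0-9_]*$/.test(tableName)) {
    throw new Error(`Unsafe table name: ${tableName}`);
  }
  return [
    "SET rds_aurora.limitless_create_table_mode = 'reference'",
    `CREATE TABLE ${tableName} (\n  ${columnDefs.join(',\n  ')}\n)`,
  ];
}

// Assumed shape of the currencies table used in the join example.
const currencyDdl = referenceTableStatements('currencies', [
  'currency_code CHAR(3) PRIMARY KEY',
  'exchange_rate NUMERIC(19, 6) NOT NULL',
]);

console.log(currencyDdl.join(';\n') + ';');
```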

Enforcing Local Queries in Application Code

Application code must respect shard boundaries. Every query should include the shard key whenever possible. Below is a Node.js Lambda example aligned with that constraint:

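    JavaScript

    const { Pool } = require('pg');

    // Created once per execution environment and reused across invocations.
    // DB_ENDPOINT should point at the RDS Proxy endpoint; pg reads credentials
    // from the standard PGUSER, PGPASSWORD, and PGDATABASE environment variables.
    const pool = new Pool({
      host: process.env.DB_ENDPOINT,
      max: 1, // an execution environment handles one invocation at a time
      ssl: { rejectUnauthorized: true }
    });

    exports.handler = async (event) => {
      const { accountId, amount, currency } = JSON.parse(event.body);
      const client = await pool.connect();

      try {
        // The insert carries the shard key, so it is routed to a single shard
        await client.query(
          `INSERT INTO transactions (account_id, amount, currency_code)
           VALUES ($1, $2, $3)`,
          [accountId, amount, currency]
        );

        // Join remains local because currencies is a reference table
        const result = await client.query(
          `SELECT t.amount, c.exchange_rate
           FROM transactions t
           JOIN currencies c ON t.currency_code = c.currency_code
           WHERE t.account_id = $1
           ORDER BY t.created_at DESC
           LIMIT 5`,
          [accountId]
        );

        return { statusCode: 200, body: JSON.stringify(result.rows) };
      } finally {
        // Always return the connection to the pool, even when a query fails
        client.release();
      }
    };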

Observability: Stop Trusting Averages

Once the code is deployed, the challenge shifts to operations. In a single-node database, CPUUtilization is often a reliable signal. In Aurora Limitless, it is not. The system introduces two independent compute layers, and each fails for different reasons. Both must be observed separately.

Transaction Router Metrics: High router CPU with low shard CPU usually indicates:

- Connection storms (missing or misconfigured RDS Proxy)
- TLS handshake pressure
- Session pinning preventing connection reuse

In this case, scaling shards will not help. The bottleneck is routing and connection management.

Shard Group Metrics: High CPU on one shard with low utilization on others indicates a hot shard. This almost always points to a poor shard key choice (for example, timestamp-based keys or low-cardinality values).

Actionable Rule: Do not monitor cluster-wide averages. Build dashboards that explicitly separate router CPU and per-shard CPU. Averages hide the exact failure modes Limitless introduces.
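The two failure signatures above can be expressed as a small classifier. This is only a sketch: the function `diagnose` is hypothetical, the 80% and 40% thresholds are arbitrary assumptions, and the inputs are plain CPU percentages that you would pull from your own monitoring.

```javascript
// Classifies a snapshot of router CPU and per-shard CPU percentages.
// Thresholds are illustrative assumptions, not recommended values.
function diagnose(routerCpu, shardCpus, { high = 80, low = 40 } = {}) {
  const hottest = Math.max(...shardCpus);
  const hottestIndex = shardCpus.indexOf(hottest);
  const others = shardCpus.filter((_, i) => i !== hottestIndex);

  // Busy router, idle shards: the bottleneck is routing or connections.
  if (routerCpu >= high && hottest <= low) {
    return { finding: 'router-bound' };
  }
  // One busy shard, idle peers: the shard key is concentrating load.
  if (hottest >= high && others.length > 0 && others.every((cpu) => cpu <= low)) {
    return { finding: 'hot-shard', shard: hottestIndex };
  }
  return { finding: 'no-clear-signal' };
}

const average = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;

const shardSnapshot = [95, 20, 20, 20];
console.log(average(shardSnapshot));      // 38.75, which looks healthy
console.log(diagnose(30, shardSnapshot)); // { finding: 'hot-shard', shard: 0 }
```

The average of 38.75% would pass most alarms, while one shard is at 95%.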

The Cost Model Trap: Scatter-Gather Multiplies Your Bill

Performance is not the only operational risk — billing is the other. Aurora Limitless uses a serverless pricing model based on Aurora Capacity Units (ACUs). What’s easy to miss is where those ACUs are consumed. A scatter-gather query does not just run slower; it consumes compute on every shard involved. For example:

    SQL

    SELECT * FROM orders WHERE status = 'FAILED';

In a monolithic database, this is a single query on one node, typically an index scan if status is indexed. In Aurora Limitless, the router must broadcast the query to all shards, execute it on each shard, and aggregate the results centrally. If your cluster has N shards, that query consumes roughly N times the shard compute of a shard-local query, plus the router's aggregation work.

Actionable Rule: Audit query logs for statements that do not include the shard key. In Aurora Limitless, these are not just performance smells — they are billing risks.
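A first pass at such an audit can be a simple text heuristic over logged statements. The sketch below is not a SQL parser. The function `isScatterGatherCandidate` is hypothetical, it assumes `account_id` is the shard key, and it only inspects SELECT, UPDATE, and DELETE statements.

```javascript
// Flags read/update/delete statements whose WHERE clause has no
// equality or IN predicate on the shard key. Heuristic only.
function isScatterGatherCandidate(sql, shardKey = 'account_id') {
  if (!/^\s*(SELECT|UPDATE|DELETE)\b/i.test(sql)) {
    return false; // INSERTs and DDL are out of scope for this check
  }
  const whereClause = sql.split(/\bWHERE\b/i).slice(1).join(' WHERE ');
  const shardKeyPredicate = new RegExp(`\\b${shardKey}\\s*(=|IN\\s*\\()`, 'i');
  return !shardKeyPredicate.test(whereClause);
}

const sampleLog = [
  "SELECT * FROM orders WHERE status = 'FAILED'",
  'SELECT t.amount FROM transactions t WHERE t.account_id = $1 LIMIT 5',
  'INSERT INTO transactions (account_id, amount) VALUES ($1, $2)',
];

console.log(sampleLog.filter((sql) => isScatterGatherCandidate(sql)));
// [ "SELECT * FROM orders WHERE status = 'FAILED'" ]
```

Anything this flags deserves a look at its query plan. Anything it misses, such as shard keys that appear only in a subquery, still needs manual review.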

Distributed Sequences: IDs Are Unique, Not Ordered

In a single PostgreSQL instance, BIGSERIAL values are handed out in increasing order, so many applications treat them as a rough insertion order. Aurora Limitless intentionally breaks this assumption to avoid global coordination. Each shard is allocated ranges of sequence values independently.

This means a later insert can receive a lower ID than an earlier insert. Ordering by ID no longer represents time.

Safe Alternative: Always use a timestamp column (for example, created_at) for ordering, pagination, and recency queries, with the ID as a tie-breaker for rows that share a timestamp.

Rule: Treat sequence-generated IDs as identifiers only — never as a proxy for insertion order.
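The effect of per-shard ranges is easy to reproduce in a toy model. This sketch assumes two shards that each reserve a block of 1,000 IDs. The block size and the helper `createShardSequence` are illustrative assumptions, not Aurora's actual allocation scheme.

```javascript
// Toy model: each shard hands out IDs from its own reserved block.
function createShardSequence(rangeStart, rangeSize) {
  let next = rangeStart;
  const end = rangeStart + rangeSize;
  return () => {
    if (next >= end) {
      throw new Error('range exhausted (a real system would reserve a new block)');
    }
    return next++;
  };
}

const nextIdOnShardA = createShardSequence(1, 1000);    // IDs 1..1000
const nextIdOnShardB = createShardSequence(1001, 1000); // IDs 1001..2000

// createdAt is a logical clock standing in for a timestamp column.
const insertedRows = [
  { id: nextIdOnShardB(), createdAt: 1 }, // first insert lands on shard B, id 1001
  { id: nextIdOnShardA(), createdAt: 2 }, // later insert lands on shard A, id 1
  { id: nextIdOnShardB(), createdAt: 3 }, // id 1002
];

const orderById = [...insertedRows].sort((a, b) => a.id - b.id);
const orderByTime = [...insertedRows].sort(
  (a, b) => a.createdAt - b.createdAt || a.id - b.id // ID only breaks ties
);

console.log(orderById.map((row) => row.createdAt));   // [ 2, 1, 3 ]
console.log(orderByTime.map((row) => row.createdAt)); // [ 1, 2, 3 ]
```

Sorting by ID puts the second insert first. Sorting by the timestamp, with the ID only as a tie-breaker, recovers the real order.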

Conclusion

Aurora Limitless closes a long-standing gap between elastic compute and relational persistence. It allows SQL-based systems to scale beyond the constraints of a single writer without forcing application-managed sharding.

That benefit comes with responsibility. Schema design, connection management, and query patterns directly determine whether the system scales efficiently or becomes an expensive bottleneck.

If you respect shard locality, govern connections, and design with the router in mind, Aurora Limitless enables relational backends that scale with serverless workloads. If not, it simply distributes the bottleneck across more machines.
