Stephen posted a good API architecture question in a LinkedIn group:
So your API management initiative is a success. Now how do you cope with the unpredictable volumes? Throttle requests over static infrastructure or make your next initiative “elastic” infrastructure?
Static API throttling and elastic API scaling are complementary design techniques.
Static API Throttling
Static throttle limits cap aggregate maximum demand, so you can right-size back-end capacity for worst-case loads. An API gateway acts as a choke point that shapes traffic volume and keeps back-end target service infrastructure from being overwhelmed. Throttle limits may be defined per user and per API, and API traffic routing rules can allocate demand across multiple static service infrastructure environments.
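As a rough illustration of a per-user, per-API throttle limit, here is a minimal fixed-window counter sketch in Python. The class name, its parameters, and the injectable clock are assumptions made for this example; they do not reflect any particular gateway product, and production gateways often use more refined schemes such as token buckets or sliding windows.

```python
import time


class FixedWindowThrottle:
    """Allow at most `limit` requests per (user, api) pair in each time window.

    Hypothetical helper for illustration only.
    """

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._counters = {}  # (user, api) -> [window_index, request_count]

    def allow(self, user, api):
        """Return True if the request fits within the limit, else False."""
        window = int(self.clock() // self.window_seconds)
        key = (user, api)
        entry = self._counters.get(key)
        if entry is None or entry[0] != window:
            # First request seen for this key in the current window.
            self._counters[key] = [window, 1]
            return True
        if entry[1] < self.limit:
            entry[1] += 1
            return True
        return False  # over the limit: the gateway would reject or queue
```

Because each (user, API) pair has its own counter, one noisy consumer exhausts only its own allowance and does not starve others.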
DevOps teams rely on usage monitoring and alerts to help them proactively scale infrastructure components. With adequate lead time, teams can manually provision additional capacity and scale static environments.
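A usage alert of this kind can be as simple as comparing observed traffic with provisioned capacity. The sketch below assumes a hypothetical function name and an arbitrary 80% warning threshold; the right threshold depends on how much lead time the team needs to provision capacity.

```python
def capacity_alert(observed_rps, provisioned_rps, warn_ratio=0.8):
    """Return True when observed load reaches `warn_ratio` of provisioned capacity.

    All names and the default threshold are illustrative assumptions.
    """
    if provisioned_rps <= 0:
        raise ValueError("provisioned_rps must be positive")
    return observed_rps / provisioned_rps >= warn_ratio
```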
Elastic API Scaling
Elastic infrastructure reduces the lead time required to scale infrastructure up and down to meet demand, which increases IT resource efficiency and shortens capacity delivery timelines. When designing an elastic API platform, consider interactions between API consumers, API delivery networks, elastic load balancers, API gateways, enterprise service bus mediators, and service hosts.
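To make the scale-up and scale-down decision concrete, here is a minimal sketch of how an elastic controller might choose an instance count from current demand. The function name, the headroom percentage, and the minimum and maximum bounds are assumptions for illustration; real autoscalers also smooth their inputs and apply cooldown periods.

```python
def desired_instances(current_rps, rps_per_instance,
                      min_instances=1, max_instances=20, headroom_pct=20):
    """Return how many service hosts to run for the current request rate.

    Hypothetical sizing rule: demand plus a headroom percentage, divided by
    per-instance capacity, rounded up and clamped to the configured bounds.
    Inputs are expected to be integers so the arithmetic stays exact.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    padded = current_rps * (100 + headroom_pct)
    # Integer ceiling division avoids floating-point rounding surprises.
    needed = -(-padded // (100 * rps_per_instance))
    return max(min_instances, min(max_instances, needed))
```

The same rule covers both directions: when demand falls, the result shrinks toward `min_instances`, which is where the efficiency gain comes from.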
Static or elastic?
Static and elastic are complementary. Static throttle tiers enable teams to offer their API as a well-defined product offering with expected operating margins. Subscription-based usage monitoring helps teams track API usage per customer and understand how best to monetize the API through SLA tiers and charges. Elastic scaling enables teams to open up new API consumer channels "at the speed of now" and offer their business as a service while minimizing IT infrastructure delivery timeline constraints.
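One way to see how the two techniques fit together: the static tier limits bound worst-case demand, and that bound tells you how far elastic capacity would ever need to stretch. The tier names and limits below are made-up values for illustration, not recommendations.

```python
# Hypothetical SLA tiers: maximum requests per second per subscriber.
TIER_LIMITS_RPS = {"bronze": 5, "silver": 20, "gold": 100}


def worst_case_rps(subscribers):
    """Sum the throttle limits of all subscribers.

    `subscribers` maps a customer name to a tier name.
    """
    return sum(TIER_LIMITS_RPS[tier] for tier in subscribers.values())


def max_instances_needed(subscribers, rps_per_instance):
    """Upper bound on service hosts if every subscriber hits its limit at once."""
    total = worst_case_rps(subscribers)
    return -(-total // rps_per_instance)  # integer ceiling division
```

For example, one bronze, one silver, and one gold subscriber give a worst case of 125 requests per second, which at 50 requests per second per instance means at most 3 instances.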