Understanding API Rate-Limiting Techniques in Zato
Enabling rate-limiting in Zato means that access to Zato-based APIs can be throttled per endpoint, user, or service.
Enabling rate-limiting in Zato means that access to Zato-based APIs can be throttled per endpoint, user, or service, including options to make limits apply to specific IP addresses only. If limits are exceeded within a selected period of time, the invocation will fail. Let's check how to use it all.
Where and When Limits Apply
API rate limiting works on several levels, and the configuration is always checked in the order below. The order runs from the narrowest, most specific parts of the system (endpoints), through users, which may apply to multiple endpoints, up to services, which in turn may be used by multiple endpoints and users.
- First, per-endpoint limits
- Then, per-user limits
- Finally, per-service limits
When a request arrives through an endpoint, that endpoint's rate limiting configuration is checked. If the limit is already reached for the calling application's IP address or network, the request is rejected.
Next, if any user is associated with the endpoint, that account's rate limits are checked in the same manner, and, similarly, if they are reached, the request is rejected.
Finally, if the endpoint's underlying service is configured to do so, it also checks if its invocation limits are not exceeded, rejecting the message accordingly if they are.
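The order of checks described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Zato's implementation: the names `Counter`, `RateLimitReached`, and `check_request` are made up for this example, and time periods are left out to keep it short.

```python
class RateLimitReached(Exception):
    """Raised when a level's limit is exhausted (hypothetical name for this sketch)."""


class Counter:
    """Counts requests against a fixed limit; period handling is omitted for brevity."""

    def __init__(self, name, limit):
        self.name = name
        self.limit = limit
        self.used = 0

    def check(self):
        # Reject once the limit has been used up, otherwise record the request
        if self.used >= self.limit:
            raise RateLimitReached(self.name)
        self.used += 1


def check_request(endpoint, user=None, service=None):
    # Narrowest level first, widest last; levels that are not configured are skipped
    for level in (endpoint, user, service):
        if level is not None:
            level.check()
```

With an endpoint limited to 2 requests and a user limited to 5, the third call to `check_request` raises `RateLimitReached` for the endpoint before the user's limit is even consulted.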
Note that the three levels are distinct, yet they overlap in what they allow one to achieve.
For instance, it is possible to have the same user credentials be used in multiple endpoints and express ideas such as "Allow this and that user to make up to 1,000 requests/day to my APIs but limit each endpoint to at most 5 requests/minute no matter which user".
Moreover, because limits can be set on services, it is possible to make it even more flexible, e.g., "Let this service be invoked at most 10,000 times/hour, no matter which user it is, with particular users being able to make at most 500 requests/minute, no matter which service, topping it off with separate limits for REST vs. SOAP vs. JSON-RPC endpoints, depending on what application invokes the endpoints". That lets one conveniently express advanced scenarios that often occur in practice.
Also, observe that API rate limiting applies to REST, SOAP, and JSON-RPC endpoints only; it is not used with other API endpoints, such as AMQP, IBM MQ, SAP, task scheduler, or any other technologies. However, per-service limits work no matter which endpoint the service is invoked with, and they will work with endpoints such as WebSockets, ZeroMQ, or any other.
Lastly, limits pertain to incoming requests only. Outgoing requests, from Zato to external resources, are not covered.
Per-IP Restrictions
The architecture is made even more versatile because for each object - endpoint, user, or service - different limits can be configured depending on the caller's IP address.
This adds yet another dimension and makes it possible to express ideas commonly seen in API-based projects, such as:
- External applications, depending on their IP addresses, can have their own limits.
- Internal users, e.g., employees of the company using VPN, may have higher limits if their addresses are in the 172.x.x.x range.
- For performance testing purposes, access to Zato from a few selected hosts may have no limits at all.
IP-based limits are an integral part of the mechanism - they do not rule out per-endpoint, user, or service limits but work hand in hand with them. In fact, for each such object, multiple IP-based limits can be set independently, thus allowing for the highest degree of flexibility.
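To make the per-IP idea more concrete, here is a minimal sketch built on Python's standard `ipaddress` module. The entries, the numbers, and the `limit_for` helper are all hypothetical. Picking the most specific matching network is an assumption of this sketch and not necessarily how Zato resolves overlapping entries.

```python
import ipaddress

# Hypothetical entries: (network, requests allowed per period); None means "no limit"
ENTRIES = [
    (ipaddress.ip_network('0.0.0.0/0'), 1000),       # any external application
    (ipaddress.ip_network('172.0.0.0/8'), 5000),     # internal users on VPN
    (ipaddress.ip_network('192.0.2.10/32'), None),   # a performance-testing host
]


def limit_for(address, entries=ENTRIES):
    """Return the limit of the most specific network that contains the address."""
    ip = ipaddress.ip_address(address)
    matching = [(network.prefixlen, limit) for network, limit in entries if ip in network]
    if not matching:
        raise LookupError('No entry matches ' + address)
    # The longest prefix is the narrowest network (an assumption of this sketch)
    return max(matching, key=lambda item: item[0])[1]
```

Calling `limit_for('172.20.1.5')` selects the VPN entry, while an address outside every narrower network falls back to the catch-all one.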
Exact or Approximate
Rate limits come in two types:
- Exact
- Approximate
Exact rate limits are just that, exact - they ensure that a limit is not exceeded at all, not even by a single request.
Approximate limits may let a minimal number of requests exceed the limit, with the benefit being that approximate limits are faster to check than exact ones.
When to use which type depends on a particular project:
- In some projects, it does not really matter if callers have a limit of 1,000 requests/minute or 1,005 requests/minute because the difference is too tiny to make a business impact. Approximate limits work best in this case.
- In other projects, there may be a requirement that the limit is never exceeded, no matter the circumstances. Use exact limits here.
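The trade-off between the two types can be illustrated with a toy model. It does not reflect how Zato implements either type. It only assumes that an exact check must consult the authoritative counter on every request, whereas an approximate check may rely on a cached copy that is refreshed every few calls. All class and function names below are invented for this sketch.

```python
import threading


class ExactLimiter:
    """Every check takes a lock and reads the real counter, so the limit is never exceeded."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0
        self._lock = threading.Lock()

    def allow(self):
        with self._lock:
            if self.count >= self.limit:
                return False
            self.count += 1
            return True


class ApproximateLimiter:
    """Checks a cached count that is refreshed only once every `refresh_every` calls."""

    def __init__(self, limit, refresh_every):
        self.limit = limit
        self.refresh_every = refresh_every
        self.true_count = 0      # stands in for a counter that is costly to read
        self.cached_count = 0    # cheap but possibly stale copy
        self._calls_since_refresh = 0

    def allow(self):
        if self._calls_since_refresh >= self.refresh_every:
            self.cached_count = self.true_count
            self._calls_since_refresh = 0
        self._calls_since_refresh += 1
        if self.cached_count >= self.limit:
            return False
        self.true_count += 1
        return True


def count_allowed(limiter, attempts):
    """Send `attempts` requests through the limiter and count the accepted ones."""
    return sum(1 for _ in range(attempts) if limiter.allow())
```

In this model, the approximate limiter can let through a few requests above the limit, always fewer than `refresh_every`, in exchange for touching the authoritative counter less often.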
Python Code and Web-Admin
Alright, let's check how to define the limits in Zato web-admin. We will use the sample service below:
# -*- coding: utf-8 -*-

# Zato
from zato.server.service import Service

class Sample(Service):
    name = 'api.sample'

    def handle(self):

        # Return a simple string on response
        self.response.payload = 'Hello there!\n'
In web-admin, we will configure limits separately for the service and for a new REST API channel (endpoint).
Points of interest:
- Configuration for each type of object is independent - within the same invocation, some limits may be exact, some may be approximate.
- There can be multiple configuration entries for each object.
- A unit of time is "m", "h", or "d", depending on whether the limit is per minute, hour, or day, respectively.
- All limits within the same configuration are checked in the order of their definition, which is why the most generic ones should be listed first.
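As an illustration of the units listed above, the sketch below turns a rate written as a count and a unit, such as `100/m`, into a number of requests and a period in seconds. The `parse_rate` helper is hypothetical and is not part of Zato's API. It only shows how "m", "h", and "d" map to time periods.

```python
# Seconds per unit of time used in rate definitions
UNIT_SECONDS = {'m': 60, 'h': 3600, 'd': 86400}


def parse_rate(text):
    """Parse a rate such as '100/m' into (max_requests, period_in_seconds)."""
    count, _, unit = text.strip().partition('/')
    if unit not in UNIT_SECONDS:
        raise ValueError('Unknown unit of time in rate: ' + repr(text))
    # int() raises ValueError if the count is missing or not a number
    return int(count), UNIT_SECONDS[unit]
```

For example, `parse_rate('100/m')` yields 100 requests per 60 seconds, and an unsupported unit such as `10/s` is rejected with a `ValueError`.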
Testing it Out
Now, all that is left is to invoke the service from curl.
As long as limits are not reached, a business response is returned:
$ curl http://my.user:password@localhost:11223/api/sample
Hello there!
$
But if a limit is reached, the caller receives an error message with the 429 HTTP status.
$ curl -v http://my.user:password@localhost:11223/api/sample
* Trying 127.0.0.1...
...
< HTTP/1.1 429 Too Many Requests
< Server: Zato
< X-Zato-CID: b8053d68612d626d338b02
...
{"zato_env":{"result":"ZATO_ERROR","cid":"b8053d68612d626d338b02eb", "details":"Error 429 Too Many Requests"}}
$
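On the calling side, an application may want to slow down and retry when it receives a 429. The sketch below shows one possible approach with exponential backoff. It is not something Zato prescribes: `send` is a hypothetical callable that performs the request and returns a `(status, body)` tuple, and the delays are arbitrary.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a status other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError('max_attempts must be at least 1')
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Wait 1x, 2x, 4x, ... the base delay, but not after the final attempt
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Still rate-limited after all attempts; hand the last 429 back to the caller
    return status, body
```

The `sleep` parameter exists only so that the waiting can be replaced in tests. In regular use, the default `time.sleep` applies.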
Note that the caller never knows what the limit was - that information is saved in Zato server logs, along with other details, so that API authors can correlate what callers get with the exact rate-limiting definition that prevented them from accessing the service.
zato.common.rate_limiting.common.RateLimitReached:
Max. rate limit of 100/m reached;
from:`10.74.199.53`, network:`*`;
last_from:`127.0.0.1; last_request_time_utc:`2020-11-22T15:30:41.943794;
last_cid:`5f4f1ef65490a23e5c37eda1`; (cid:b8053d68612d626d338b02)
And this is it - we have created a new API rate limiting definition in Zato and tested it out successfully!