Searching for a Scalable Streaming API
There are some use cases where REST APIs aren’t such a great fit.
Let’s take a moment to appreciate the noble REST API. REST APIs are anywhere and everywhere. REST is the lingua franca of the application world, and REST APIs have been the software equivalent of duct tape since Roy T. Fielding published his doctoral dissertation back in 2000. Whether you’re talking about Google Docs, Facebook, Snapchat, Uber, Waze, Yelp, or just about anything else, chances are that hundreds or thousands of REST APIs are creating relationships between application services.
But there are some use cases where REST APIs aren’t such a great fit, such as pipelining streaming data. Sure, there are WebSockets, which are fine for a few streams, but an application needs to open a new WebSocket connection for every URI to which it wants to connect. What about apps with thousands or millions of streams? WebSockets are too inflexible, difficult to aggregate, and expensive to run at that scale. As streaming data becomes even more common, the next ubiquitous API will be built for transporting real-time streaming data, in addition to batch and historical data.
Designing the Perfect Streaming API
If REST and WebSockets aren’t enough, then we need a new API for today’s data-driven landscape. Let’s take a moment to imagine what the ideal universal API would look like. What use cases must it cover, and what would it need to do?
Let’s start with the basics. A perfect API would:
Work with batch (REST), historical, and real-time streaming data
Use simple grammar to facilitate many compatible implementations
Support asynchronous (and synchronous) use cases
Provide a universal data type (compatible with JSON, XML, and other popular data languages)
Our new API should satisfy the streaming data use case first. Unlike REST and historical datasets, data streams require stateful endpoints so that data can be processed efficiently in memory, rather than stored before processing (which inevitably ends in bufferbloat).
By supporting both synchronous and asynchronous use cases, we ensure that the same API can be used for both REST and streaming data sources. Simple grammar makes our new APIs easy to understand and configure. Likewise, a universal data type ensures that the API can be used to repeatably solve a variety of use cases. There are also more advanced considerations, such as design decisions about syntax, composability, polymorphism, and parsing.
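The point about stateful endpoints processing data in memory can be illustrated with a minimal Python sketch. The `RunningAverage` class and `store_then_process` function are hypothetical names made up for this illustration, and the readings are invented sample values. The stateful version keeps only a count and a mean, while the contrasting version has to buffer every event before it can compute anything.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RunningAverage:
    """Hypothetical stateful endpoint: holds only a count and a mean."""
    count: int = 0
    mean: float = 0.0

    def on_event(self, value: float) -> float:
        # Incremental mean update; no past events are stored.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


def store_then_process(events: Sequence[float]) -> float:
    # Contrast: every event must be buffered before the batch is computed.
    return sum(events) / len(events)


endpoint = RunningAverage()
readings = [10.0, 20.0, 30.0, 40.0]  # invented sample data
for reading in readings:
    latest = endpoint.on_event(reading)  # an up-to-date answer after each event
```

Both approaches arrive at the same average, but the stateful endpoint has a current answer after every event and its memory use does not grow with the length of the stream.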
Evaluating Streaming APIs Today
There are a few options if you’re looking for a streaming API today. For example, Spring Cloud Stream or gRPC may get you part of the way there. However, building the endpoints of the stream is equally challenging. A stateful application layer becomes essential to route streams to granular endpoints within server clusters.
It makes sense that adopting a different API would have far-reaching effects on how an application is architected. This is where the open source Swim platform comes in; it solves the problem of scaling a stateful application architecture that can efficiently maintain distributed real-time coherency. Swim uses the WARP streaming protocol, which is primarily a semantic model for streaming cache coherency between stateful application objects. WARP is an alternative model to RPC, and by extension, REST. The WARP coherency model can be implemented over many transports, just as the RPC model can.
Building a Stateful Streaming API
To construct an API using Swim, it's important to understand the primitives of Swim's architecture. The fundamental building block of Swim is a Web Agent, which is defined on the Swim Server. Each Web Agent is assigned a unique URI, so that it can be addressed by other Agents. Web Agents have Lanes, which are like data objects, each identified by a name unique to that Agent. A Swim API is a combination of the host name (where the Swim Server is running), the nodeURI (URI of the Web Agent), and the laneURI (lane name). It is similar to a REST endpoint except that it's a streaming data API.
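As a rough illustration of this three-part addressing, the Python sketch below models an address as a host URI, node URI, and lane URI. The `SwimApiAddress` class is a hypothetical helper written for this article, not part of any Swim library. It only assembles an argument list using the `-h`, `-n`, and `-l` flags that the Swim CLI commands in this article use, and the example values come from the transit example in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SwimApiAddress:
    """Hypothetical helper (not a Swim library class): host + node + lane."""
    host_uri: str  # where the Swim Server is running
    node_uri: str  # URI of the Web Agent
    lane_uri: str  # lane name, unique within that Web Agent

    def cli_args(self, command: str) -> List[str]:
        # Builds the argument list only; nothing is executed here.
        return [
            "swim-cli", command,
            "-h", self.host_uri,
            "-n", self.node_uri,
            "-l", self.lane_uri,
        ]


address = SwimApiAddress(
    host_uri="ws://transit.swim.services",
    node_uri="/state/US/N-CA",
    lane_uri="vehicles",
)
sync_args = address.cli_args("sync")
print(" ".join(sync_args))
```

Keeping the three parts separate, rather than collapsing them into one string, mirrors how the CLI takes them as separate flags.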
Accessing a Swim API
Swim provides a command-line client (CLI) for subscribing to data from, and sending data to, a Swim server. This section describes how to access a Swim API; the prerequisite is a running Swim server that exposes Swim APIs, as described in the previous section. Swim APIs stream data continuously, unlike REST APIs, which clients must poll for updates.
Install Swim CLI
> npm i @swim/cli
Swim API Subscription with the CLI
A Swim sync request is similar to an HTTP GET method in REST, except that it subscribes to a data stream.
To subscribe to a Swim API:
> swim-cli sync -h <hostUri> -n <nodeUri> -l <laneUri>
A Swim get request is a one-time subscription. It is similar to an HTTP GET request.
To get data from a Swim API:
> swim-cli get -h <hostUri> -n <nodeUri> -l <laneUri>
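The difference between the two request types can be sketched with a toy in-memory model in Python. `ToyLane` is a hypothetical stand-in written for illustration and is not how Swim implements lanes. It also assumes that a subscriber receives the current value first and then every later update, which this article does not state explicitly.

```python
from typing import Any, Callable, List


class ToyLane:
    """Hypothetical in-memory stand-in for a lane; not Swim's implementation."""

    def __init__(self, value: Any = None) -> None:
        self._value = value
        self._subscribers: List[Callable[[Any], None]] = []

    def get(self) -> Any:
        # One-time read: returns the current value and nothing more.
        return self._value

    def sync(self, on_update: Callable[[Any], None]) -> None:
        # Assumption: deliver the current value, then every later update.
        self._subscribers.append(on_update)
        on_update(self._value)

    def set(self, value: Any) -> None:
        # A state change is pushed to every subscriber.
        self._value = value
        for notify in self._subscribers:
            notify(value)


lane = ToyLane(value=3)
snapshot = lane.get()        # like get: a single snapshot
received: List[Any] = []
lane.sync(received.append)   # like sync: keeps receiving updates
lane.set(4)
lane.set(5)
```

In this model, the caller of `get` would have to ask again to see the values 4 and 5, while the `sync` subscriber is handed each one as it happens.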
Try a Real-World Example
Using this pattern, Swim APIs can handle real-time streaming, batch, and historical data in a variety of formats. Furthermore, Swim vastly simplifies aggregations and transformations across streams of data. Swim has even open sourced an example application that transforms the publicly available NextBus API into stateful data streams reporting on transit agencies in the United States. The example app is available on GitHub here.
Using the Swim transit app, here are some examples of APIs you can subscribe to:
To get a real-time stream for all vehicles in Northern California (aggregation):
swim-cli sync -h ws://transit.swim.services -n /state/US/N-CA -l vehicles
To get a real-time stream for all vehicles that belong to the SF-muni agency:
swim-cli sync -h ws://transit.swim.services -n /agency/US/N-CA/sf-muni -l vehicles
To get data for all vehicles in Northern California:
swim-cli get -h ws://transit.swim.services -n /state/US/N-CA -l vehicles
To get data for all vehicles that belong to the SF-muni agency:
swim-cli get -h ws://transit.swim.services -n /agency/US/N-CA/sf-muni -l vehicles
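To make the relationship between the agency-level and state-level streams concrete, here is a toy Python sketch of aggregation. It is not the transit app's code: the `AgencyAgent` and `StateAgent` classes, the `example-agency` node URI, the vehicle IDs, and the speed values are all invented for illustration. Each agency forwards its vehicle updates to the state-level aggregate as they arrive.

```python
from typing import Dict


class StateAgent:
    """Toy stand-in for a state-level node such as /state/US/N-CA."""

    def __init__(self, node_uri: str) -> None:
        self.node_uri = node_uri
        self.vehicles: Dict[str, dict] = {}

    def on_vehicle(self, key: str, info: dict) -> None:
        # Keep only the latest info per vehicle.
        self.vehicles[key] = info


class AgencyAgent:
    """Toy stand-in for an agency-level node such as /agency/US/N-CA/sf-muni."""

    def __init__(self, node_uri: str, state: StateAgent) -> None:
        self.node_uri = node_uri
        self.state = state
        self.vehicles: Dict[str, dict] = {}

    def on_vehicle(self, vehicle_id: str, info: dict) -> None:
        # Update this agency's own state, then forward to the aggregate.
        self.vehicles[vehicle_id] = info
        self.state.on_vehicle(f"{self.node_uri}/{vehicle_id}", info)


state = StateAgent("/state/US/N-CA")
muni = AgencyAgent("/agency/US/N-CA/sf-muni", state)
other = AgencyAgent("/agency/US/N-CA/example-agency", state)  # hypothetical agency

muni.on_vehicle("bus-1", {"speed": 12})   # invented sample data
muni.on_vehicle("bus-2", {"speed": 0})
other.on_vehicle("bus-1", {"speed": 30})
muni.on_vehicle("bus-1", {"speed": 15})   # a later update replaces the earlier one
```

Because the aggregate is updated on every event, a reader of the state-level data sees the current picture without re-querying each agency.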
Learn More
Share your thoughts about streaming APIs below and let us know what you're building using the open source Swim platform.
You can get started with Swim here, and make sure to STAR us on GitHub.
Published at DZone with permission of Bradley Johnson. See the original article here.