Scalable AI and API Architectures in Python
The architecture of Zato reflects several key foundational concepts underlying the design of the platform. Each architecture component takes all concepts into account.
These tenets drive the design of Zato and directly shape its architecture.
|A broad usage spectrum||From IoT, through APIs, file transfer, enterprise backend systems and mainframe to AI & Machine Learning. Zato is meant to be used to build a wide range of integrated systems.|
|Productivity||Developer time is of utmost importance. The design of the platform makes it easy to quickly build both simple and complex integration environments. Python is the most productive tool for integrations and this is why Zato is in Python.|
|Operational excellence||Ease of use and monitoring capabilities allow one to constantly improve processes, plans and procedures. Results of your work should be easily reproducible in different contexts, environments or projects.|
|Scalability||It should be easy to scale environments regardless of one’s preferred deployment approach, be it cloud-based, on premises, hybrid, bare metal, Docker or Kubernetes. Any combination can be used.|
|High availability||Integrations are the very core of any organisation or project and it is essential that the platform eliminate single points of failure, that it provide redundancy and that it offer convenient means to carry out upgrades or maintenance tasks.|
|Security||An integration platform needs to expect that it will be routinely attacked by nefarious actors. The very choice of Python, a very high level, secure language and the platform’s resilience to attacks are an integral part of the design.|
|Simplicity over complexity||The correct way to build advanced, mission-critical systems is to make them as simple as possible, but no simpler. Individual components and parts should be easy to understand and master.|
|CLI and API||
Understanding Zato Servers
- There are no limits as to how many servers there can be in a single cluster.
- By default, all servers in a cluster are always active and the load balancer will direct traffic to all of them.
- It is possible to take a server offline, e.g. to apply updates, and the load balancer will redirect the traffic to other servers.
- As long as a server is running, it synchronizes its state with the other members of the cluster, even if the load balancer considers that server offline. For instance, code deployed to any server will be auto-distributed to all the other servers, including those that are offline from the load balancer’s perspective.
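The behaviour described in this list can be modelled in a few lines of plain Python. This is only an illustrative sketch, not Zato's implementation: the `Server` and `Cluster` classes, the service name, and the round-robin routing policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Server:
    """Hypothetical model of one server in a cluster."""
    name: str
    online: bool = True    # whether the load balancer sends traffic to it
    running: bool = True   # whether the server process itself is up
    services: set = field(default_factory=set)


class Cluster:
    """Hypothetical model of a cluster behind a load balancer."""

    def __init__(self, servers):
        self.servers = list(servers)

    def deploy(self, service_name):
        # Deployed code reaches every running server,
        # including those the load balancer treats as offline.
        for server in self.servers:
            if server.running:
                server.services.add(service_name)

    def route(self, n_requests):
        # Assumed policy: round-robin over servers that are online.
        active = [s for s in self.servers if s.online and s.running]
        if not active:
            raise RuntimeError('No active servers to route to')
        picker = cycle(active)
        return [next(picker).name for _ in range(n_requests)]


cluster = Cluster([Server('server1'), Server('server2'), Server('server3')])

# Take server2 out of the load balancer, e.g. to apply updates.
cluster.servers[1].online = False

cluster.deploy('crm.customer.get')
routed = cluster.route(4)

print(routed)
print(cluster.servers[1].services)
```

In this model, traffic only reaches `server1` and `server3`, yet `server2` still receives the newly deployed service because it is running.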
Containers for High-Performance Services
- Servers are containers onto which API services are deployed.
- There are no limits as to how many services there can be in a single server.
- Each idle service consumes up to 1 MB of RAM. Thus, 1 GB of RAM can mean 1,000 business API or AI services.
- A service takes less than 1 ms to deploy. It takes less than 1 second to deploy 1,000 business API or AI services.
- All servers in the same cluster are always mirror images of one another in terms of the code and services they execute.
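The memory and deployment figures above lend themselves to quick back-of-the-envelope estimates. The sketch below simply restates that arithmetic; the function names are made up for illustration, and it follows the text in treating 1 GB as 1,000 MB.

```python
# Upper bounds quoted in the text above.
IDLE_SERVICE_RAM_MB = 1   # each idle service uses up to 1 MB of RAM
DEPLOY_TIME_MS = 1        # each service takes less than 1 ms to deploy


def services_that_fit(ram_gb):
    """Conservative number of idle services that fit in ram_gb of RAM."""
    # The text treats 1 GB as 1,000 MB, so the same convention is used here.
    return int(ram_gb * 1000 / IDLE_SERVICE_RAM_MB)


def deploy_time_upper_bound_seconds(n_services):
    """Upper bound, in seconds, on the time to deploy n_services."""
    return n_services * DEPLOY_TIME_MS / 1000


print(services_that_fit(1))
print(deploy_time_upper_bound_seconds(1000))
```

Because both figures are upper bounds per service, the first result is a floor on how many idle services fit and the second is a ceiling on deployment time.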
Scaling The Environment - APIs & AI
The most important factor in deciding whether to add more servers or more clusters, each with its own servers, is the distinction between services that are network-bound and services that are CPU-bound.
Services are network-bound if they spend most of their time waiting on the network (TCP). For instance, picture a sample REST or AMQP service that takes 100 ms to complete. Of that, 98 ms are spent waiting for a remote endpoint or server to respond, while only 2 ms are spent processing the data received. The service therefore spends 98% of its time not actively processing anything; it is bound to the network, hence the name network-bound.
Services are CPU-bound if they are primarily blocked waiting for CPUs to compute an expected result. For instance, imagine a service that requires 200 ms to obtain some data and whose AI algorithms then require two minutes to complete. In this case, the service spends most of its time waiting for the CPU, hence the name CPU-bound. Another example is the parsing and processing of large files, e.g. multi-GB files may require significant CPU time to parse.
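The two examples above can be expressed as a simple ratio of waiting time to total time. The following sketch is one possible way to classify a service from measured timings; the function names and the 50% threshold are assumptions for illustration, not something Zato defines.

```python
def wait_fraction(total_ms, cpu_ms):
    """Fraction of a service's run time spent waiting rather than computing."""
    if total_ms <= 0 or not 0 <= cpu_ms <= total_ms:
        raise ValueError('Expected 0 <= cpu_ms <= total_ms and total_ms > 0')
    return (total_ms - cpu_ms) / total_ms


def classify(total_ms, cpu_ms, threshold=0.5):
    """Label a service by where it spends most of its time.

    The threshold is an arbitrary, illustrative cut-off.
    """
    if wait_fraction(total_ms, cpu_ms) > threshold:
        return 'network-bound'
    return 'CPU-bound'


# REST/AMQP example from the text: 100 ms in total, 2 ms of processing.
rest_kind = classify(total_ms=100, cpu_ms=2)

# AI example from the text: 200 ms to obtain data, then two minutes of CPU work.
ai_kind = classify(total_ms=200 + 120_000, cpu_ms=120_000)

print(rest_kind, wait_fraction(100, 2))
print(ai_kind)
```

Real services rarely split this cleanly, so timings gathered from monitoring over many invocations would be a better input than a single measurement.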
Because each Zato server in the same cluster executes the same set of services, network-bound services should not be mixed with CPU-bound ones. If both kinds are deployed to the same cluster, CPU-bound services may completely take over the CPUs, leaving no room for network-bound services. For instance, if many AI services require CPU time and a REST (TCP) request arrives, the CPUs may be completely busy with AI calculations, leaving very little or no processing time for network events.
It is perfectly fine and expected to have more than one cluster. Whether that is needed depends on whether the workload is uniform, e.g. only network-bound or only CPU-bound, or mixed, containing services of both types. With mixed workloads, it is recommended to have more than one cluster.
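As a rough illustration of this recommendation, the sketch below groups a set of services into one cluster per workload type. The service names and the `plan_clusters` helper are hypothetical; in practice the grouping would be decided during environment design rather than computed at run time.

```python
from collections import defaultdict


def plan_clusters(services):
    """Group services into one cluster per workload type.

    services maps a service name to 'network-bound' or 'CPU-bound'.
    """
    clusters = defaultdict(list)
    for name, kind in services.items():
        clusters[kind].append(name)
    return dict(clusters)


# Hypothetical service names, used only for illustration.
services = {
    'crm.customer.get': 'network-bound',
    'billing.invoice.send': 'network-bound',
    'ai.fraud.score': 'CPU-bound',
    'files.archive.parse': 'CPU-bound',
}

plan = plan_clusters(services)
print(plan)
print(len(plan))
```

A uniform workload yields a single group, while a mixed one yields two, which matches the advice to run more than one cluster for mixed workloads.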
Scaling a Cluster
An individual cluster can be scaled by adding more servers with smaller numbers of CPUs for each server or by adding more CPUs to each server.
Usually, it is more desirable to add more smaller servers than more CPUs per server. The reason is that, in the true spirit of cloud computing, there are no limits as to how many servers can be added whereas, broadly, the limit of CPUs per server is between 6 and 8, depending on a particular CPU make and model, and adding more CPUs above the limit does not significantly improve performance.
Servers with services using publish/subscribe are a special case in that they always require exactly 1 CPU per server. In this scenario, clusters are scaled by adding more servers, each with 1 CPU.
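The sizing rules in this section can be summarised as a small planning helper. This is a sketch under stated assumptions: the `plan_servers` function and its default of 2 CPUs per server are made up for illustration, and the cap of 6 CPUs is the conservative end of the 6 to 8 range mentioned above.

```python
import math

# Conservative end of the 6-8 CPU range given in the text.
MAX_USEFUL_CPUS_PER_SERVER = 6


def plan_servers(total_cpus, uses_pubsub=False, cpus_per_server=2):
    """Return (number_of_servers, cpus_per_server) for a cluster.

    total_cpus is the overall CPU capacity the cluster should have.
    """
    if total_cpus < 1:
        raise ValueError('total_cpus must be at least 1')

    if uses_pubsub:
        # Publish/subscribe servers always get exactly 1 CPU each.
        cpus_per_server = 1
    elif not 1 <= cpus_per_server <= MAX_USEFUL_CPUS_PER_SERVER:
        raise ValueError('cpus_per_server is outside the useful range')

    n_servers = math.ceil(total_cpus / cpus_per_server)
    return n_servers, cpus_per_server


print(plan_servers(16))                     # many small servers
print(plan_servers(16, cpus_per_server=6))  # fewer, larger servers
print(plan_servers(16, uses_pubsub=True))   # publish/subscribe cluster
```

Preferring a small `cpus_per_server` value reflects the advice to scale out with more servers rather than scale up with more CPUs per server.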
Start the tutorial to learn more technical details about Zato, including its architecture, installation, and usage. After completing it, you will have a multi-protocol service representing a sample scenario often seen in banking systems with several applications cooperating to provide a single and consistent API to its callers.
Published at DZone with permission of Dariusz Suchojad. See the original article here.