Over a million developers have joined DZone.

Apache Ignite 2.5: Scaling to 1000s Nodes Clusters

DZone's Guide to

Apache Ignite 2.5: Scaling to 1000s Nodes Clusters

Read this article in order to learn more about what's new in Apache Ignite.

· AI Zone ·
Free Resource

Insight for I&O leaders on deploying AIOps platforms to enhance performance monitoring today. Read the Guide.

Apache Ignite was always appreciated by its users for the two primary things it delivers — scalability and performance. Throughout a lifetime, many distributed systems tend to do performance optimizations from a release to release while making scalability related improvements just a couple of times. It's not because the scalability is of no interest.

However, Apache Ignite grew to the point where the community decided to revisit its discovery subsystem that influences how well and far Ignite scales out. The goal was pretty clear — Ignite has to scale 1000s of nodes as good as it scales 100s now.

It took many months to get the task implemented. So, please join me in welcoming Apache Ignite 2.5 that now can be scaled easily to 1000s of nodes and comes with other exciting capabilities. Let's check out the most prominent ones.

Massive Scalability

There are two components of Ignite that were modified in Ignite 2.5 to improve its scalability capabilities. The first one is related to 1000s nodes clusters while the other is related to the way we train machine learning (ML) models in Ignite. Let's start with the first.

Marrying Apache Ignite and ZooKeeper

The 1000s nodes scalability goal was solved with the help of Apache ZooKeeper. Why did we turn to it?

Apache Ignite default TCP/IP Discovery organizes cluster nodes into a ring-topology form that has its advantages and disadvantages. For instance, on topologies with hundreds of cluster nodes, it can take many seconds while a system message traverses through all the nodes. As a result, necessary processing of events such as the joining of new nodes or detecting failed ones can take a while, affecting the overall cluster responsiveness and performance. That is a big deal if you'd like to run 1000s nodes clusters.

The new ZooKeeper Discovery uses ZooKeeper as a single point of synchronization where Ignite nodes are exchanging discovery events through it. It solved the issue with long-to-be-processed discovery messages and, as a result, allowed Ignite scaling to large cluster topologies.

As a rule of thumb, keep using default TCP/IP Discovery if it's unlikely that your Ignite cluster scales beyond 300s nodes, and switch to ZooKeeper Discovery if that's the case.

Machine Learning: Partition-Based Datasets

That's the second prominent feature of Ignite 2.5 that improves the way of how far you can scale your Ignite clusters to train ML models over terabytes or petabytes of data. The partition-based datasets moved us closer to the implementation of Zero-ETL concept, which implies that Ignite can be used as a single storage where ML models and algorithms are being improved iteratively and online without ETLing data back and forth between Ignite and another storage.

Read more about the datasets from this documentation page.

Genetic Algorithms

Ignite's ML component is ramping up, and in the version 2.5, it accepted a contribution of genetic algorithms (GAs), which help to solve optimization problems by simulating the process of biological evolution. GAs are excellent for searching through large and complex data sets for an optimal solution. Real world applications of GAs include automotive design, computer gaming, robotics, investments, traffic/shipment routing, and more.
Refer to excessive articles of my community-mates Turik Campbell and Akmal B. Chaudhri, which cover the main benefits of GAs:

Continuous Self-Healing and Consistency Checks

It's a known fact that many companies and businesses trusted Ignite in its mission-critical deployments and solutions. As a result, sometimes Ignite doesn't even have a right to "misfire" and should be able to handle critical or unpredictable situations automatically or provide facilities to deal with them manually.
With Ignite 2.5, we've kicked off the realization of continuous self-healing concept that implies that no matter what happens with Ignite in production, it should be able to tolerate unexpected failures and stay up and running. The following was done in 2.5:

SQL: Security and Fast Data Loading

The community stays strong and determined in its goal of making Ignite SQL engine undistinguishable from SQL engines of famous and mature SQL database. What's the purpose? We want to make it easy for you to migrate from a relational database to Ignite so you can reuse all the skills you gained before. Overall, this is what our SQL engine got in 2.5:

In-place Execution of Spark DataFrame Queries

Apache Spark users can applaud because the following ticket got merged in 2.5. In short, it means that from now on, Ignite will be able to execute as many DataFrames SQL queries as it can put in place on the Ignite servers side avoiding data movement from Ignite to Spark. The performance of your DataFrames queries should boost significantly. Enjoy!

DEB and RPM Packages

Last but not least, if you're a Linux user, you can now install the latest Ignite versions directly from DEB and RPM repositories. Refer to how-to and share your feedback with us.

Finally, I have no more paper left to cover other optimizations and improvements. So, go ahead and check out our release notes.

TrueSight is an AIOps platform, powered by machine learning and analytics, that elevates IT operations to address multi-cloud complexity and the speed of digital transformation.

apache ignite ,zookeeper ,in-memory computing ,in-memory data ,in-memory caching ,machine learning ,machine learning algorithms ,genetic algorithms ,artificial intelligence ,ai

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}