A cluster management framework, Apache Helix
What is Helix?
Apache Helix is a cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes. Helix automates the reassignment of resources in the face of node failure and recovery, cluster expansion, and reconfiguration. It does this by modeling a distributed system as a state machine with constraints on states and transitions.
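To make the idea of "a state machine with constraints on states and transitions" concrete, here is a minimal, self-contained sketch. It does not use the Helix API: the class `ReplicaStateMachineSketch`, its methods, and the specific rules (a replica must pass through SLAVE before becoming MASTER, and a partition has at most one MASTER) are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Illustrative model only: these types are hypothetical and are NOT part of the Helix API.
class ReplicaStateMachineSketch {
    enum State { OFFLINE, SLAVE, MASTER }

    // Transition constraint (assumed for this sketch):
    // a replica can only reach MASTER by going through SLAVE.
    private static final Map<State, Set<State>> LEGAL = new EnumMap<>(State.class);
    static {
        LEGAL.put(State.OFFLINE, EnumSet.of(State.SLAVE));
        LEGAL.put(State.SLAVE, EnumSet.of(State.MASTER, State.OFFLINE));
        LEGAL.put(State.MASTER, EnumSet.of(State.SLAVE));
    }

    // Returns true if a replica may move directly from one state to the other.
    static boolean canTransition(State from, State to) {
        return LEGAL.get(from).contains(to);
    }

    // State constraint (assumed for this sketch):
    // at most one MASTER among the replicas of a single partition.
    static boolean satisfiesMasterConstraint(Iterable<State> replicaStates) {
        int masters = 0;
        for (State s : replicaStates) {
            if (s == State.MASTER) {
                masters++;
            }
        }
        return masters <= 1;
    }

    public static void main(String[] args) {
        System.out.println(canTransition(State.SLAVE, State.MASTER));
        System.out.println(canTransition(State.OFFLINE, State.MASTER));
        System.out.println(satisfiesMasterConstraint(Arrays.asList(State.MASTER, State.SLAVE)));
    }
}
```

In this sketch the framework's job reduces to two checks: whether a requested transition is legal, and whether the resulting set of replica states still satisfies the constraints.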
Terminology
- Node: A single machine.
- Cluster: A set of nodes.
- Resource: A logical entity (e.g. database, index, task).
- Partition: A subset of a resource (each subtask is referred to as a partition).
- Replica: A copy of a partition. Replicas increase the availability of the system.
- State: Describes the role of a replica (e.g. Master, Slave).
- State Machine and Transitions: A transition is an action that allows a replica to move from one state to another, thus changing its role (e.g. Slave --> Master). The state machine defines which states and transitions are allowed.
- Spectators: The external clients. Helix provides an External View, which is an aggregated view of the current state across all nodes.
- Current State: Represents a resource's actual state at a participating node. Each node in the cluster has its own Current State.
- INSTANCE_NAME: Unique name representing the process.
- SESSION_ID: ID that is automatically assigned every time a process joins the cluster.
- Rebalancer: The core component of Helix is the Controller, which runs the rebalance algorithm on every cluster event.
- Dynamic Ideal State: Part of what makes Helix powerful is that the Ideal State can be changed dynamically, so it can be adjusted as the cluster changes.
Whenever a cluster event occurs, Helix can operate in one of three modes:
- FULL_AUTO
- SEMI_AUTO
- CUSTOMIZED
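The relationship between the per-node Current State and the External View described in the terminology above can be sketched as a simple aggregation. This is a conceptual illustration, not Helix code: the class `ExternalViewSketch`, the method `aggregate`, and the node and partition names used below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: hypothetical types, NOT the Helix API.
class ExternalViewSketch {
    // Input:  node name -> (partition name -> state), i.e. each node's own Current State.
    // Output: partition name -> (node name -> state), i.e. one aggregated view for clients.
    static Map<String, Map<String, String>> aggregate(Map<String, Map<String, String>> currentStates) {
        Map<String, Map<String, String>> view = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> node : currentStates.entrySet()) {
            for (Map.Entry<String, String> replica : node.getValue().entrySet()) {
                view.computeIfAbsent(replica.getKey(), p -> new TreeMap<>())
                    .put(node.getKey(), replica.getValue());
            }
        }
        return view;
    }

    public static void main(String[] args) {
        Map<String, String> node1 = new HashMap<>();
        node1.put("db_0", "MASTER");
        node1.put("db_1", "SLAVE");
        Map<String, String> node2 = new HashMap<>();
        node2.put("db_0", "SLAVE");
        node2.put("db_1", "MASTER");

        Map<String, Map<String, String>> currentStates = new HashMap<>();
        currentStates.put("node1", node1);
        currentStates.put("node2", node2);

        // A spectator would read this view to find, for example, the master of db_0.
        System.out.println(aggregate(currentStates));
    }
}
```

A spectator only needs the aggregated map: looking up a partition tells it which node currently holds which role, without contacting each node.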
Cluster events can be one of the following:
- Nodes start and/or stop
- Nodes experience soft and/or hard failures
- New nodes are added/removed
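The way a controller might react to such events can be sketched as recomputing the partition-to-node assignment from the current set of live nodes. The round-robin placement below is a toy stand-in chosen for brevity, not the algorithm Helix actually runs; `RebalanceSketch`, `assign`, and the `partition_N` naming are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only: a toy round-robin placement, NOT Helix's rebalance algorithm.
class RebalanceSketch {
    // Computes, for each partition, the ordered list of live nodes hosting its replicas.
    // If fewer live nodes than requested replicas exist, each partition gets one replica per node.
    static Map<String, List<String>> assign(int partitions, int replicas, List<String> liveNodes) {
        Map<String, List<String>> assignment = new TreeMap<>();
        int n = liveNodes.size();
        for (int p = 0; p < partitions; p++) {
            List<String> owners = new ArrayList<>();
            for (int r = 0; r < Math.min(replicas, n); r++) {
                owners.add(liveNodes.get((p + r) % n));
            }
            assignment.put("partition_" + p, owners);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Initial cluster: three live nodes.
        System.out.println(assign(2, 2, Arrays.asList("node1", "node2", "node3")));
        // Cluster event: node2 fails, so the assignment is recomputed over the survivors.
        System.out.println(assign(2, 2, Arrays.asList("node1", "node3")));
    }
}
```

Running the same function again after each event is the essence of the loop: the input is the new cluster membership, and the output is the placement the system should converge to.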
Published at DZone with permission of Madhuka Udantha, DZone MVB. See the original article here.