Red-Black Trees in C#: A Guide to Efficient Self-Balancing Binary Search Trees
The article discusses red-black trees, a type of efficient self-balancing binary search tree in C#. It covers their unique algorithms and optimization techniques.
Join the DZone community and get the full member experience.
Join For FreeWelcome back to the final article in our series on binary search trees in C#. In our previous articles, we explored the fundamentals of binary search trees and the self-balancing AVL trees. We learned that while AVL trees guarantee a balanced tree structure, they require significant computational overhead to maintain balance factors and execute multiple rotations.
In this article, we will delve into another self-balancing binary search tree, the red-black tree. Red-black trees are designed to strike a balance between the efficiency of operations and the maintenance of a balanced tree structure. Unlike AVL trees, red-black trees use a color coding scheme to balance the tree, making it a more efficient alternative in certain scenarios.
In this article, we will explore the unique characteristics of red-black trees, their properties, and the algorithmic operations required to maintain a balanced tree structure. Additionally, we will discuss methods to optimize balancing and techniques to maximize the efficiency of our red-black tree.
So, join us as we discover the world of red-black trees and their advantages over other types of binary search trees.
If you're new to the series, we recommend reading our previous articles Unlocking the Potential of Binary Search Trees with C# Programming and Understanding AVL Trees in C#: A Guide to Self-Balancing Binary Search Trees, before diving into this one. Having a solid understanding of the basics of binary search trees and AVL trees will provide a strong foundation for understanding the implementation of red-black trees in C#. With this knowledge, you will be better equipped to follow along with the C# code snippets and understand the underlying logic behind the operations performed on red-black trees.
Fundamental Principles of Red-Black Trees
Unlike the regular Binary Search Tree and AVL Tree, a leaf in a Red-Black Tree is not a node without children, but rather a non-existent node, i.e., NULL. The root node should always be black. Additionally, every red node must have two black children.
However, the main innovation is the so-called black height. This is the number of black nodes on any path from the root node to NULL. NULL is considered because it is black by default, while the root node is not.
The reason why it's on any path is that regardless of which path we take, the number of black nodes should always be the same. Otherwise, the balance of the tree will be disrupted.
For simplicity, we will create a special Empty object in memory that will act as a placeholder. All node references will now point to Empty instead of NULL. Additionally, each node will have a reference to its parent, which allows us to perform insertion and deletion operations without any recursion.
public class BinarySearchTree<T> where T : IComparable<T>
{
...
private Node _head = Node.Empty;
public void Add(T value)
{
if (_head.IsEmpty)
{
_head = Node.CreateHead(value);
}
else
{
_head.Insert(value);
}
}
...
}
private class Node
{
public T Value { get; private set; }
public bool IsEmpty => this == Empty;
public static readonly Node Empty = new(default!, null!)
{
_isBlack = true
};
private Node _parent;
private Node _left;
private Node _right;
private bool _isBlack;
private bool IsRed => !_isBlack;
private Node(T value, Node parent)
{
Value = value;
_parent = parent;
_left = Empty;
_right = Empty;
}
public static Node CreateHead(T value)
{
return new Node(value, Empty)
{
_isBlack = true
};
}
...
}
Inserting a New Item
The insertion rules remain relevant since the binary tree times. The only difference is that now, instead of recursion, we can use a regular loop, thanks to the reference to the parent element for each node.
We find a free spot to the left or right and insert the new node there. By default, we agree that the color of each newly added node will be red because its insertion, on average, breaks fewer rules than inserting a black node, which would guarantee a change in the black height of the parent node.
private class Node
{
...
public void Insert(T value)
{
var current = this;
var tail = _parent;
var comparison = 0;
while (!current.IsEmpty)
{
tail = current;
comparison = value.CompareTo(Value);
current = comparison < 0
? current._left
: current._right;
}
var newNode = new Node(value, tail);
if (comparison < 0)
{
tail._left = newNode;
}
else
{
tail._right = newNode;
}
Balance(newNode);
}
}
However, after inserting a red node, the rules that help the tree remain balanced may be violated up to five times. Therefore, our task is to restore them again, making the tree balanced.
To do this, we will need to take into account not only the children and parent but also other relatives of the node. These are the grandparent, uncle, and sibling.
Balancing a Red-Black Tree
The first case is adding a red node as the root.
The other four cases involve adding a contrasting node after a red one. If we encounter any of these situations, we need to perform rebalancing. In the first case, we simply recolor the added root to black.
private class Node
{
...
public static Node CreateHead(T value)
{
return new Node(value, Empty)
{
_isBlack = true
};
}
...
}
The second case looks like this.
We know that a red parent cannot have a red child, so we change the color of the parent to black. This, in turn, causes the black height of the grandfather on the left to not match the black height on the right. Therefore, we also change the color of the uncle to black, and the black height of the grandfather will be leveled, but it will be one unit greater than before our balancing.
To fix this, we simply change the color of the grandfather to red. This will ensure that the black height for higher nodes remains the same as it was. And our tree will become balanced.
The third case looks like this.
Here everything starts the same way, meaning we color the uncle's parent black for the same reasons. However, we cannot change the color of the grandparent to red because it is the root. But we don't need to do that because the black height of the original node is not taken into account. Therefore, the black height of our grandparents will remain the same as before balancing.
In the fourth case, we have a zigzag.
We need to straighten it out by performing a left rotation relative to the parent of the new node. This, in turn, automatically takes us to the fifth case when we get a straight line.
The parent of the red node cannot be red again. Therefore, we color it black. We break the black height for the grandparent. Therefore, we make it red and perform a right rotation relative to it.
Sure, let's write the code for all this logic.
private class Node
{
...
private void RotateToRight()
{
if (_left.IsEmpty)
{
throw new InvalidOperationException(
"Could not rotate to the right.");
}
SwapValue(this, _left);
var formerRight = _right;
_right = _left;
SetLeft(_right._left);
_right.SetLeft(_right._right);
_right.SetRight(formerRight);
}
private void RotateToLeft()
{
if (_right.IsEmpty)
{
throw new InvalidOperationException(
"Could not rotate to the left.");
}
SwapValue(this, _right);
var formerLeft = _left;
_left = _right;
SetRight(_left._right);
_left.SetRight(_left._left);
_left.SetLeft(formerLeft);
}
private void SetRight(Node node)
{
_right = node;
if (!node.IsEmpty)
{
node._parent = this;
}
}
private void SetLeft(Node node)
{
_left = node;
if (!node.IsEmpty)
{
node._parent = this;
}
}
private void SetBlack()
{
if (!IsEmpty)
{
_isBlack = true;
}
}
private void SetRed()
{
if (!IsEmpty && !_parent.IsEmpty)
{
_isBlack = false;
}
}
private void Balance(Node node)
{
while (!node.IsEmpty && _parent is { IsEmpty: false,
_parent.IsEmpty: false,
IsRed: true })
{
var uncle = GetUncle(node, out var isUncleRight);
if (uncle.IsRed)
{
node._parent.SetBlack();
uncle.SetBlack();
node._parent._parent.SetRed();
node = node._parent._parent;
continue;
}
if (isUncleRight)
{
if (node == node._parent._right)
{
node = node._parent;
node.RotateToLeft();
}
node._parent._parent.RotateToRight();
node._parent.SetBlack();
node._parent._parent.SetRed();
}
else
{
if (node == node._parent._left)
{
node = node._parent;
node.RotateToRight();
}
node._parent._parent.RotateToLeft();
node._parent.SetBlack();
node._parent._parent.SetRed();
}
}
}
private Node GetUncle(Node node, out bool isUncleRight)
{
if (node._parent == node._parent._parent._left)
{
isUncleRight = true;
return node._parent._parent._right;
}
isUncleRight = false;
return node._parent._parent._left;
}
}
Deleting Nodes in a Red-Black Tree
Alright, now let's talk about the process of deleting nodes. It works in the same way as deleting nodes from a regular tree until the balancing step. It's important to understand that the physical removal of nodes from our trees only happens in two cases: if the node being deleted has no children or if it has only one child.
This is because when a node being deleted has two children, it is not physically removed, meaning the tree will have the same number of nodes as before. Instead, we replace it with one of its children, either the maximum node from its left subtree or the minimum node from its right subtree.
Only after this do we physically delete the minimum or maximum node, which will be replaced by either a dummy node or a single child.
It is from these three positions in the Red-Black Tree that we begin to consider cases where balancing is necessary.
First of all, if the physically removable node is red, it means that it definitely has 0 children.
This is because if it had one child for some reason, the tree would be unbalanced. However, this cannot happen because we balance it during insertion.
So, since it definitely has no children, we simply replace it with Empty, and no further balancing is needed. This is unlike deleting a black node.
If it has one red child, then after it comes to its place, the black height of the parent on one side will be reduced by one.
Therefore, to restore it, we simply recolor the incoming red node to black. But what if the child of the node to be deleted was black? Then we should pay attention to the sibling of the deleted element. It can either be black with children of unknown color, or it can be red with black children. Then the parent of the deleted element will definitely be black, according to our rules. In both cases, the black height must be the same for the node being deleted on both sides.
The only case where this is possible is when the node being removed has no children. Therefore, we replace it with Empty. Since a black node has been removed from the left side, the black height of the parent node will decrease, and we begin to balance the tree again.
private class Node
{
...
public bool Remove(T value, out Node root)
{
if (IsEmpty)
{
root = this;
return false;
}
var node = Find(value);
if (node is null)
{
root = this;
return false;
}
Node child;
root = this;
if (node.GetChildrenCount() < 2)
{
child = node.GetChild();
if (node == this)
{
root = child;
}
node.ReplaceWith(child);
}
else
{
var minNode = node.Min();
node.Value = minNode.Value;
node._isBlack = minNode._isBlack;
child = minNode.GetChild();
minNode.ReplaceWith(child);
}
BalanceOnRemove(child, this);
return true;
}
private int GetChildrenCount()
{
var count = 0;
if (!_left.IsEmpty) count++;
if (!_right.IsEmpty) count++;
return count;
}
private void ReplaceWith(Node node)
{
if (_parent.IsEmpty)
{
node._parent = _parent;
}
else if (this == _parent._left)
{
_parent.SetLeft(node);
}
else
{
_parent.SetRight(node);
}
}
private Node GetChild()
{
return !_left.IsEmpty ? _left : _right;
}
...
}
Balancing After Removal
If the sibling was black, then it had either two red children, two black children, one red child on the left and one black child on the right, or one red child on the right and one black child on the left. For the first and third cases, the balancing will be the same. We simply recolor the sibling to the color of the parent.
This will not affect the black height of the sibling, but it will change the black height of the parent. Therefore, we recolor the parent of the sibling's red child to black and perform a left rotation about the parent node.
As a result, a node of the same color as before removal is now in the parent's position, and the black heights on both sides are equal. To balance the tree in the fourth case, we need to recolor the sibling to red and the left child of the sibling to black. Then, we perform a right rotation about this sibling, and we arrive at the previous case we just discussed.
The last case is when the Black brother has two Black children, but the color and hierarchy level of the parent is unknown. This creates three more possible cases, where the parent can be a Blackroot, a non-root Red, or a non-root Black.
In the first case, we paint the uncle Red, which automatically restores the equality of the Black height for the parent. In the second case, we paint the uncle Red and the parent Black. In the third case, we paint the parent Red, perform a left rotation relative to it, and fix the situation that we have already discussed earlier.
The last case is when the sibling of the deleted node is of Red color. In this case, we make the sibling Black, the parent Red and perform a left rotation relative to the parent node.
private class Node
{
...
private void BalanceOnRemove(Node node, Node root)
{
while (node is { IsEmpty: false,
_parent.IsEmpty: false,
_isBlack: true })
{
Node brother;
if (node == node._parent._left)
{
brother = node._parent._right;
if (brother.IsRed)
{
node._parent.RotateToLeft();
node._parent.SetRed();
brother = node._parent._right;
}
if (brother._left._isBlack && brother._right._isBlack)
{
brother.SetRed();
node = node._parent;
}
else
{
if (brother._right._isBlack)
{
brother._left.SetBlack();
brother.SetRed();
brother.RotateToRight();
brother = node._parent._right;
}
brother._isBlack = node._parent._isBlack;
node._parent.SetBlack();
brother._right.SetBlack();
node._parent.RotateToLeft();
node = root;
}
}
else
{
brother = node._parent._left;
if (brother.IsRed)
{
node._parent.RotateToRight();
node._parent.SetRed();
brother = node._parent._right;
}
if (brother._left._isBlack && brother._right._isBlack)
{
brother.SetRed();
node = node._parent;
}
else
{
if (brother._left._isBlack)
{
brother._right.SetBlack();
brother.SetRed();
brother.RotateToLeft();
brother = node._parent._left;
}
brother._isBlack = node._parent._isBlack;
node._parent.SetBlack();
brother._right.SetBlack();
node._parent.RotateToRight();
node = root;
}
}
}
}
...
}
Conclusion
All of this looks more like solving a Rubik's cube puzzle, where it is relatively easy to memorize the steps. Each of these steps has a logical explanation, which unfortunately cannot be explained within the scope of this article.
Scientists have been working on creating this tree for 40 years. The current form of the tree is the maximum that the best minds on our planet have been able to come up with. Nevertheless, all of this clearly demonstrates that the work of a red-black tree can be broken down even from scratch if you carefully observe what is happening.
Although a red-black tree minimizes the drawbacks of AVL, it still cannot fully replace it. Red-black tree, indeed, works faster when inserting new nodes. However, it is slower than AVL when searching for data.
That is why these two types are often used in combination depending on specific tasks. Therefore, it is desirable to have at least a theoretical understanding of the work of each of them.
Opinions expressed by DZone contributors are their own.
Comments