Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Binary Search Tree (Stuff Formally Trained Programmers Know)

DZone's Guide to

Binary Search Tree (Stuff Formally Trained Programmers Know)

If you're a developer and aren't familiar with this data structure, then it's about time you got to know it. Read on for the basics.

· Big Data Zone ·
Free Resource

The open source HPCC Systems platform is a proven, easy to use solution for managing data at scale. Visit our Easy Guide to learn more about this completely free platform, test drive some code in the online Playground, and get started today.

This post is one in a series of stuff formally trained programmers know – the rest of the series can be found in the series index.

Binary Search Tree

In the previous post, we covered a Binary Tree, which is about the shape of storing the data. The Binary Search Tree (BST) is a further enhancement to that structure.

The first important change is that the data we are storing needs a key; if we have a basic type like a string or number then the value itself can be the key and if we have a more complex class, then we need to define a key in that structure or we need to build a unique key for each item.

The second change is a way to compare those keys which is crucial for the performance of the data structure. Numbers are easiest since we can easily compare which is larger and smaller.

The third and final change is the way we store the items; the left node's key will always be smaller than the parent node's key and the right node's key will be larger than the parent node.

As an example, here is a BST using just numbers as keys:
A BST

Note that all nodes to the left are smaller than their parent and all parents above that.

Why?

So, why should we care about a BST? We should care because searching is really performant in it as each time you move a depth down, you eliminate approximately 50% of the potential nodes.

So, for example, if we wanted to find the item in our example with the key 66, we could start at the root (50) and move right. At that point, we have eliminated 8 possible nodes immediately. The next is to the left from the node with the 70 (total possible nodes removed 12). Next is to the right of the node with the value of 65, and then to 66 to the left of 67. So we found the node with 5 steps.

Going to Big O Notation, this means we achieved a performance of close to O(log n). It is possible to have a worst case of O(n), when the tree is not Optimal or Unbalanced.

Balanced Versus Unbalanced

In the above example we have a binary search tree which is Optimal, i.e. it has the lowest depth needed. Below we can see a totally valid BST; each child node is to the right of the parent because it is bigger than the parent.

Unbalanced BST

This, however, will result in an O(n) search performance which is not ideal. The solution is to rebalance the BST and for our example above we can end up with multiple end states (I'll show two below). The key takeaway is that we go from a depth of 5 to a depth of 3.

Balanced BST

Implementations

.NET has a built-in implementation with SortedDictionary. Unfortunately, nothing exists out-of-the-box for this in JavaScript or Java.

Managing data at scale doesn’t have to be hard. Find out how the completely free, open source HPCC Systems platform makes it easier to update, easier to program, easier to integrate data, and easier to manage clusters. Download and get started today.

Topics:
binary search tree ,data structure ,implementation ,nodes

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}