Hello and welcome! This is part two of a blog post series on the binary search tree data structure.

If you have not read my previous post, I highly recommend reading the introduction to the binary search tree before proceeding.

In this post, we will finally write our own binary search tree implementation, dissecting the key operations and understanding exactly how the binary search tree works behind the scenes.

Binary Search Tree Implementation API

Below is our binary search tree interface. The available public methods are:

add(data) : Add an item to the tree. It should go without saying that consumers need to be able to do this.

remove(data) : Search for the data to remove and remove it from the tree.

max() : Return the “largest” data from the set by traversing to the far right of the tree.

min() : Return the “smallest” data from the set by traversing to the far left of the tree.

To reiterate the importance of assumptions: it is because of the binary search tree rules that we are able to implement the min() and max() methods in a sound and cohesive manner. Once you have gotten the hang of binary search trees, feel free to extend the API and add your own methods.

Prerequisite knowledge

Let me warn you beforehand: this topic is advanced JavaScript. I am certain that the process alone will make you huff and puff. However, if you stick with it and digest all the content, by the end you will have a solid understanding of how the binary search tree implementation works. Below is a list of topics that every reader should know before tackling this post.

In conclusion, readers should have a solid working knowledge of JavaScript as well as Object-oriented programming experience.

Overview

This section will take some effort. In the end, my sincerest hope is that every reader is able to understand how a binary search tree works. Since the source code is fairly involved, the upcoming section is divided into smaller subsections.

1. Creating the BinarySearchTree and Node constructor.
2. add() behind the scenes. Also commonly referred to as insert().
3. remove() behind the scenes.
4. min() behind the scenes.
5. max() behind the scenes.

The implementation details will be encapsulated and the public API exposed via the module pattern. Please bear in mind that I have removed the need for developers to type new to instantiate a Node object or the Binary Search Tree. Therefore, in future code snippets, I will be omitting the new keyword.

The Node object will be encapsulated in the module. Consumers of the API will only have access to the methods available on the binary search tree prototype chain.
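To make the encapsulation idea concrete, here is a minimal sketch of the module pattern described above. The exported name BST matches how the tree is created in later snippets, but the body of isEmpty() and the overall layout are assumptions for illustration; the actual source may be organised differently.

```javascript
// Minimal module-pattern sketch (layout and isEmpty() body are assumptions).
var BST = (function () {
  // Private: Node is only reachable from inside this closure.
  function Node(data) {
    if (!(this instanceof Node)) {
      return new Node(data);
    }
    this.data = data;
    this.parentNode = null;
    this.leftChild = null;
    this.rightChild = null;
  }

  function BinarySearchTree() {
    // Allow callers to omit the new keyword.
    if (!(this instanceof BinarySearchTree)) {
      return new BinarySearchTree();
    }
    this.size = 0;
    this.root = null;
  }

  // Public: consumers only see what lives on the prototype.
  BinarySearchTree.prototype.isEmpty = function isEmpty() {
    return this.size === 0;
  };

  // Only the tree constructor leaves the module; Node stays hidden.
  return BinarySearchTree;
})();

var tree = BST(); // no new keyword required
console.log(tree.isEmpty());      // true
console.log(tree instanceof BST); // true
```

Because Node is declared inside the immediately invoked function, nothing outside the module can create or tamper with nodes directly.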

What is a sub-tree?

I have mentioned sub-trees before and will continue to do so in any tree-related post. Therefore, I thought it would be best to figure out what a sub-tree is before proceeding. A sub-tree is derived from a child node of a node in a tree. In essence, we are saying that if we broke off the child node, the child node (and its descendants, if it has any) is a binary tree in and of itself.

To sum it up: A sub-tree of a binary search tree is also a binary search tree. This might be confusing in text form, so without further ado, let us take a look at a visual demonstration.

The root node 10 has two children: 5 and 23. Both of these are the roots of sub-trees, because if we look at each of them alone (taking away the root node), they are binary search trees. By definition (as discussed in a previous post), a tree is a BST when every node has at most two child nodes, every value in a node's left sub-tree is less than the node, and every value in its right sub-tree is greater than the node. So, if we were to prove that the nodes containing 5 and 23 are sub-trees, we would have to prove that they are, by definition, binary search trees. Let's take a look at a visual diagram of the sub-trees.

Judging by the fact that the left child is less than the parent and the right child is greater than the parent, both 5 and 23, by definition, are binary search trees. It is because of the fact that each child is a sub-tree that the recursive approach becomes very applicable when writing a binary search tree implementation.
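The sub-tree property can also be checked in code. The sketch below uses plain object nodes and a hypothetical isBst() helper, neither of which is part of the article's implementation. The grandchildren (3, 7, 15, 30) are made-up values, since only 10, 5 and 23 are named above, and duplicate values are assumed not to occur.

```javascript
// Plain object nodes; this shape is an assumption for illustration only.
function node(data, leftChild, rightChild) {
  return { data: data, leftChild: leftChild || null, rightChild: rightChild || null };
}

// True when every value under `current` lies strictly between min and max
// and both children are themselves binary search trees.
function isBst(current, min, max) {
  if (current === null) {
    return true; // An empty tree is trivially a BST.
  }
  if ((min !== null && current.data <= min) || (max !== null && current.data >= max)) {
    return false;
  }
  return isBst(current.leftChild, min, current.data) &&
         isBst(current.rightChild, current.data, max);
}

var root = node(10, node(5, node(3), node(7)), node(23, node(15), node(30)));
console.log(isBst(root, null, null));            // true
console.log(isBst(root.leftChild, null, null));  // true: the sub-tree rooted at 5
console.log(isBst(root.rightChild, null, null)); // true: the sub-tree rooted at 23
```

Notice that isBst() is itself recursive: it validates a tree by validating its two sub-trees, which is the same idea the rest of the implementation relies on.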

Creating the BinarySearchTree and Node Constructor

Finally, with all the formalities aside, let's start writing some code. The BinarySearchTree has the following properties.

size : Track the number of elements inside the tree.

root : The pointer to the root node, which is the starting point for all our operations.

dataType : The data type of the binary search tree elements (e.g. number, string, etc.). The data type property is specific to the JavaScript implementation in order to ensure type safety.

compare : Stores the algorithm for comparing data.

equals : Stores the algorithm for checking data equality.

function BinarySearchTree() {
  // We won't be using new Keyword to create an object.
  if (!(this instanceof BinarySearchTree)) {
    return new BinarySearchTree();
  }
  this.size = 0;
  this.root = null; // Root of the binary search tree.
  this.compare = function compare(a, b) { // Default comparator function.
    return a > b;
  };
  this.equals = function equals(a, b) { // Default equals comparator
    return a === b;
  };
  // Determines the data type of the tree based on the type of the first element inserted
  this.dataType = null;
}

The node constructor will be created for the purpose of linking the affiliated nodes within the tree.

data : Represents the data stored.

parentNode : A pointer to the parent node. In this implementation, it will help with linking parent and child nodes.

leftChild : A pointer to the left child.

rightChild : A pointer to the right child.

The implementation details of the Node will be encapsulated. Therefore, the consumers of the API will not be able to directly create a node. Nodes will only be created using the add() method in the BinarySearchTree public API.

function Node(data, leftChild, rightChild) {
  if (!(this instanceof Node)) {
    return new Node(data, leftChild, rightChild);
  }
  this.data = data;
  this.parentNode = null;
  this.rightChild = rightChild;
  this.leftChild = leftChild;
}
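Later snippets call accessor methods such as getLeftChild(), setLeftChild() and setParentNode() on nodes, but their bodies are not shown in this post. The sketch below is one possible shape for them, not the actual source. In particular, the toNode() helper is hypothetical: it is there because the setters are later called both with raw data (in addNode()) and with existing nodes or null (in removeNode()).

```javascript
function Node(data, leftChild, rightChild) {
  if (!(this instanceof Node)) {
    return new Node(data, leftChild, rightChild);
  }
  this.data = data;
  this.parentNode = null;
  this.leftChild = leftChild || null;
  this.rightChild = rightChild || null;
}

// Hypothetical helper: wrap raw data in a Node, pass nodes and null through.
function toNode(value) {
  if (value === null || value === undefined) {
    return null;
  }
  return value instanceof Node ? value : Node(value);
}

Node.prototype.getLeftChild = function () { return this.leftChild; };
Node.prototype.getRightChild = function () { return this.rightChild; };
Node.prototype.getParentNode = function () { return this.parentNode; };
Node.prototype.setLeftChild = function (value) { this.leftChild = toNode(value); };
Node.prototype.setRightChild = function (value) { this.rightChild = toNode(value); };
Node.prototype.setParentNode = function (node) { this.parentNode = node; };

var parent = Node(10);
parent.setLeftChild(5);                      // raw data is wrapped in a Node
parent.getLeftChild().setParentNode(parent); // link the child back to its parent
console.log(parent.getLeftChild().data);                       // 5
console.log(parent.getLeftChild().getParentNode() === parent); // true
```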

Implementing the add/insert() method

Now we are finally getting to the core of the binary search tree. In this section, we will unravel the add()/insert() implementation details. In the add() method, there are two possible cases which we need to consider:

1. The tree is empty. In other words, there is no data inside the tree.
2. The tree contains data.

The implementation details are different depending on the case, with the logic becoming slightly more complex when the tree has data prior to inserting.

/**
 * Add data to the binary tree.
 */
BinarySearchTree.prototype.add = function add(data) {
  if (this.isEmpty()) {
    addToEmptyTree.call(this, data);
    // Set the data type of the list
    setDataType.call(this, data);
  } else {
    var insertedItemDataType = getDataType(data);
    var listDataType = this.dataType;
    // If the data type is different from what the list accepts, throw error
    if (insertedItemDataType !== listDataType) {
      throw new Error("Inserted data: " + data + " is of type ~~~ " + insertedItemDataType + ". This tree only accepts " + listDataType);
    }
    addNode.call(this, this.root, data);
  }
  incrementSize.call(this);
  return this;
};

Case One: Tree is Empty

When the tree is empty, the insertion logic is very simple. We just need to follow the steps below.

1. Create a new node.
2. Set the new node as the root of the tree.

function addToEmptyTree(data) {
  this.root = Node(data);
}

3. Set the data type of the tree to the type of whatever data was inserted. This is an optional type safety check that I added to this binary search tree implementation. For example, if the first item inserted into the tree is of type string, then the binary search tree will throw an error if the user attempts to insert an object as the next element.

function setDataType(data) {
  this.dataType = getDataType(data);
}
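The getDataType() helper is used by both setDataType() and add(), but its body is not shown here. A minimal sketch of what it might look like follows; the special handling of null and arrays is an assumption, and the actual source may classify values differently.

```javascript
// Hypothetical getDataType(): the post uses it but does not show its body.
function getDataType(data) {
  if (data === null) {
    return "null"; // typeof null is "object", so handle it explicitly
  }
  if (Array.isArray(data)) {
    return "array"; // typeof [] is also "object"
  }
  return typeof data;
}

console.log(getDataType(42));      // number
console.log(getDataType("hello")); // string
console.log(getDataType({}));      // object
// A tree whose first element was a string would reject a number:
console.log(getDataType("a") !== getDataType(1)); // true
```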

Case Two: When the tree has data

If the tree has data, the logic becomes slightly more complicated. But don’t worry, we will be walking through each case together, step by step. Below are the big steps we need to take in order to insert data to a tree containing data.

1. Check whether the inserted data is the same type as the data type of the tree. For example, if the tree is set to only accept strings and the user inserts a number, the tree will throw an error.
2. Add the node to the tree.

The last step sounds very simple logically, but in order to add the node to the tree, we need to figure out where to put it. Unlike the linked-list, the binary search tree, as implied by the word tree, is not a linear data structure. We need to traverse (or search) the tree in order to find the ideal location to place the data.

Afterwards, we need to insert the data and update the reference. This might sound confusing, so I went through the trouble to provide a visual diagram of what updating the reference may look like. The insertion process will look something like this.

1. Begin the method to insert 12.
2. Is 12 greater than 6? Yes, so go to the right sub-tree with root node 10.
3. Is 12 greater than 10? Yes, so go right to 15.
4. Is 12 greater than 15? No, it isn't. Since 15 doesn't have a left child, instead of traversing, insert 12 as the left child of 15.
5. Set 15 as the parent of node 12.

Traversing the tree – Examining the code

In the previous section, we examined the big picture of how the insertion/add method works. In this section, we will examine the code and figure out how all this works behind the veil. For any operation on the tree data structure, we always start at the root node.

If the data to insert is greater than (or, with the default comparator, equal to) the data in the root node, we will traverse to the right.

Otherwise, we will traverse to the left. The default compare method used in the binary search tree implementation here is as follows.

this.compare = function(currentNodeData, dataToInsert) {
  return currentNodeData > dataToInsert;
};
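Since compare is stored as a property on the tree, it can in principle be swapped for a different comparator. The sketch below shows the contract a replacement would need to follow. The compareByAge() function and the direction() helper are hypothetical names used only for this illustration.

```javascript
// Same contract as the default: return true when the first argument is the greater one.
function compareByAge(currentNodeData, dataToInsert) {
  return currentNodeData.age > dataToInsert.age;
}

// Decide which direction an insert would take at a given node.
function direction(compare, currentNodeData, dataToInsert) {
  return compare(currentNodeData, dataToInsert) ? "left" : "right";
}

var defaultCompare = function (a, b) { return a > b; };
console.log(direction(defaultCompare, 10, 4));  // left
console.log(direction(defaultCompare, 10, 12)); // right
console.log(direction(defaultCompare, 10, 10)); // right: equal values go right
console.log(direction(compareByAge, { age: 30 }, { age: 41 })); // right
```

The third call is worth a second look: because the comparison is a strict greater-than, a value equal to the current node ends up in the right sub-tree.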

After deciding which direction to traverse, we will check if the right/left child (depending on direction) of the current node exists (or is not null ).

If a child node exists (!= null), traverse to that node and repeat the process.

Otherwise, create a node and set the current node’s child to the data to insert. In the code snippet below, setRightChild()/setLeftChild() does two things. Firstly, it creates a new node with the dataToInsert. Secondly, it sets the right (or left) child to the newly created node. FYI, this is NOT good practice, as the function is doing more than one thing. I threw this in with the purpose of making sure that readers are paying attention to what is going on. A good way to refactor this part is by doing the following:

// Create the node first. Afterwards, pass the node to setRightChild().
var newDataNode = Node(dataToInsert);
currentNode.setRightChild(newDataNode); // Refactor setRightChild() to only assign the right child.

Afterwards, we will need to set the parent node of the newly created and inserted node. To achieve this, get the right child of the currentNode and set its parent node to the currentNode.

currentNode.getRightChild().setParentNode(currentNode); // Set parent node

Traversing the tree – Overview

Essentially, in order to add an item to a tree, we need to traverse the tree to find the right place to add the item. After identifying the location to insert, we need to update the references accordingly. Since we need to search the tree to insert at the right location, insertion is O(log n) as long as the tree is reasonably balanced. If the tree is unbalanced, however, insertion can degrade to O(n), for example when every inserted value is bigger than the current max (or smaller than the current min) and the tree grows like a linked list. Updating the references can be done in constant time.
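The difference between the balanced and unbalanced case is easy to see by measuring the height of the tree. The sketch below uses plain object nodes with hypothetical insert(), height() and build() helpers rather than the article's classes, and assumes the same rule as the default comparator: smaller values go left, everything else goes right.

```javascript
// Plain-object insert: smaller goes left, the rest goes right.
function insert(node, data) {
  if (node === null) {
    return { data: data, leftChild: null, rightChild: null };
  }
  if (node.data > data) {
    node.leftChild = insert(node.leftChild, data);
  } else {
    node.rightChild = insert(node.rightChild, data);
  }
  return node;
}

// Number of nodes on the longest path from this node down to a leaf.
function height(node) {
  if (node === null) {
    return 0;
  }
  return 1 + Math.max(height(node.leftChild), height(node.rightChild));
}

function build(values) {
  return values.reduce(insert, null);
}

console.log(height(build([4, 2, 6, 1, 3, 5, 7]))); // 3: balanced, roughly log2(n) levels
console.log(height(build([1, 2, 3, 4, 5, 6, 7]))); // 7: sorted input degenerates into a list
```

Both trees hold the same seven values, but an insert into the second one may have to walk past every existing node.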

/**
 * Note that it might be more efficient to use a while loop
 * rather than using recursion
 * @this BinarySearchTree
 */
function addNode(currentNode, dataToInsert) {
  // Current data is greater than data to insert. Go down one level to the left.
  if (this.compare(currentNode.data, dataToInsert)) {
    var leftChild = currentNode.leftChild;
    if (leftChild != null) {
      // traverse the tree until we find the place to insert node.
      addNode.call(this, leftChild, dataToInsert);
    } else {
      currentNode.setLeftChild(dataToInsert);
      currentNode.getLeftChild().setParentNode(currentNode); // Set parent node
    }
  // Current data is less than data to insert. Go down one level to the right.
  } else {
    var rightChild = currentNode.rightChild;
    if (rightChild != null) {
      // call method recursively until we find the place to insert node.
      // We can also do it iteratively
      addNode.call(this, rightChild, dataToInsert);
    } else {
      currentNode.setRightChild(dataToInsert);
      currentNode.getRightChild().setParentNode(currentNode); // Set parent node
    }
  }
}
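The comment at the top of addNode() mentions that a while loop could replace the recursion. Here is a rough sketch of what that might look like. The function name addNodeIteratively() and the simplified Node are assumptions for this example; the comparator follows the same contract as the default one (true when the first argument is greater).

```javascript
// Simplified Node for this sketch only.
function Node(data) {
  if (!(this instanceof Node)) {
    return new Node(data);
  }
  this.data = data;
  this.parentNode = null;
  this.leftChild = null;
  this.rightChild = null;
}

// Iterative counterpart of addNode(); compare(a, b) returns true when a > b.
function addNodeIteratively(root, dataToInsert, compare) {
  var currentNode = root;
  while (true) {
    var side = compare(currentNode.data, dataToInsert) ? "leftChild" : "rightChild";
    if (currentNode[side] === null) {
      // Found the free slot: create the node and link it in both directions.
      var newNode = Node(dataToInsert);
      newNode.parentNode = currentNode;
      currentNode[side] = newNode;
      return newNode;
    }
    currentNode = currentNode[side]; // Keep walking down the tree.
  }
}

var greaterThan = function (a, b) { return a > b; };
var root = Node(6);
[3, 10, 15].forEach(function (value) {
  addNodeIteratively(root, value, greaterThan);
});
var inserted = addNodeIteratively(root, 12, greaterThan);
console.log(inserted.parentNode.data);                   // 15
console.log(inserted.parentNode.leftChild === inserted); // true
```

The usage at the bottom replays the walkthrough from earlier: 12 passes 6 and 10 on the right and ends up as the left child of 15.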

Implementing the remove() method

Fortunately, the logic for removing data shares a common operation with adding: searching. In order to remove the data, we first need to search the tree to check whether it exists. If the object is found, all we need to do is remove the object, right? The removing operation is where things get slightly tricky. Why? As mentioned before, a node has anywhere between zero and two children, as well as a parent. Therefore, we need to update the references accordingly. The big question to ask here is:

What is/are the varying factor(s) in the remove() operation?

The primary variable in the remove operation is the number of children a node has. There are three cases. A node can have zero children. Or it can have one or two children.

The second variable here is: does the target of removal have a parent? Unless it is the root node, it will always have a parent.

While keeping in mind these two variables, let us dive deeper into the binary search tree implementation details. Since the tree traversal was discussed in the previous sections, we will skip it and head straight into the remove operation.

Case 1: Deleting a leaf node (no children!)

Obviously, the simplest case is where the node to remove is a leaf node (a node with zero children). All we need to do is:

1. Update the reference of the parent. Whether the destroyed object was the left child or the right child, set the corresponding reference on the parent to null.
2. Destroy the current node object (the target to destroy).

function removeNode(currentNode, dataToDelete, parentChildIdentifier) {
  if (currentNode == null) { // No elements. Set root to null
    return currentNode;
  }
  var currentData = currentNode.data,
      leftChild = currentNode.getLeftChild(),
      rightChild = currentNode.getRightChild(),
      parent = currentNode.getParentNode();
  // Found data to remove. Destroy the node object.
  if (this.equals(currentData, dataToDelete)) {
    // Check how many children this node has.
    // Operations are fairly simple for nodes with zero or a single child node.
    if (leftChild == null && rightChild == null) {
      if (parent) {
        // Dereference the current node from the parent object if it has a parent.
        parent[parentChildIdentifier] = null;
      }
      console.log("removing " + currentNode.data);
      return null; // Set the current node to null.
    }

Case 1: Code Snippet explained

In the example above, I implemented a recursive approach. Personally, I disagree with the idea that recursion should be avoided entirely just because it can sometimes be inefficient and difficult to understand. I actually think recursion can make some logic more readable and, therefore, easier to understand.

This becomes evident when we have a big problem that can be broken down into smaller identical/similar problems.

Think of the binary search tree implementation. Didn’t we just establish the fact that the sub-trees in binary search trees are also Binary Search Trees?

Therefore, a binary search tree can be broken down into smaller binary search trees, making recursion a good choice for writing more expressive and readable code. I will write a separate blog post (or series) on recursion in the near future, because I think it is extremely important and useful.

The removeNode() function above is recursive: each call returns the current node, and in the prototype method the root is set to the value returned. Why did I do that? Well, if we destroy the root node and it is the only element in the tree, the call will fall into the following branch:

leftChild == null && rightChild == null

What happens here is simply what was described in the previous section.

1. Destroy the current object. The current node is set to null. Furthermore, because this is the final value that is being assigned to this.root, the root node is also being set to null.
2. Update the reference of the parent. In the code snippet above, we checked if the parent existed (root nodes that are leaf nodes do not have parents). If it exists, we determined whether the node to delete was a right or left child of the parent (the parentChildIdentifier key), and we set the reference to null.

Case 2: Removing a node with a single child

The operation is very similar to removing a node from a linked list (a node that is neither the head nor the tail). Let me give you a visual demonstration.

Thankfully, we only have to update the reference of the parent and of the child of the target to perform the delete. We are able to do so because of the assumption that the child is a sub-tree. Therefore, we don’t have to traverse the entire tree to make sure that the binary search tree is indeed a BST. Now that we understand the logic, let's dive into the code.

Single Node removal code sample part 1

if (leftChild == null) { // Left child does not exist.
  console.log("remove a node with a right child");
  currentNode = null; // Dereference the current node.
  if (parent) {
    console.log("Data to link. Parent: " + parent.data + ". rightChild: " + rightChild.data);
    // Have appropriate parent child node pointer point to the child.
    parent[parentChildIdentifier] = rightChild;
  }
  return rightChild; // Set the root node to the right child of the current node.
} else if (rightChild == null) {
  console.log("remove a node with a left child");
  currentNode = null; // Dereference the current node.
  if (parent) {
    console.log("Data to link. Parent: " + parent.data + ". leftChild: " + leftChild.data);
    // Have appropriate parent child node pointer point to the child
    parent[parentChildIdentifier] = leftChild;
  }
  return leftChild; // Set the root node to the left child of the current Node
}

In the example, we are stating that if the current node has only a left or only a right child, we will link the parent to that child node. Hopefully by now, conceptually, this operation makes sense. In the code snippet above, we are taking the following steps. For your information, I have added console.log() to provide a visual illustration of what is going on behind the scenes.

1. Remove the current node by setting it to null. In the visual diagram, 15 is now no more.
2. Remove the reference from the parent to truly destroy 15.
3. Return 12.

But wait a second, where is the code that links the 10 and 12? In the code snippet above, take a look at the console. I also highly recommend that you download the code from GitHub and try running it. The source code link is available at the end of this article.

Single Node removal code sample part 2

function removeNode(currentNode, dataToDelete, parentChildIdentifier) {
  if (currentNode == null) { // No elements. Set root to null
    return currentNode;
  }
  var currentData = currentNode.data;
  // Found data to remove. Destroy the node object.
  if (this.equals(currentData, dataToDelete)) {
    destroyNodeObject.call(this, currentNode, parentChildIdentifier);
  }
  // If current data is greater than data to delete.
  // Go to the left sub-tree. Otherwise, go to the right sub-tree.
  else if (this.compare(currentData, dataToDelete)) {
    currentNode.setLeftChild(removeNode.call(this, currentNode.getLeftChild(), dataToDelete, "leftChild"));
  } else { // traverse right
    currentNode.setRightChild(removeNode.call(this, currentNode.getRightChild(), dataToDelete, "rightChild"));
  }
  console.log("Current node: " + currentNode.data);
  return currentNode;
}

The previous code snippet included the logic for destroyNodeObject(). Here, we will focus on currentNode.setRightChild()/setLeftChild().

This might be difficult to understand conceptually, so bear with me. We are working with the following data set, which is the tree in the diagram at the start of this section.

var binarySearchTree = BST();
binarySearchTree
  .add(6)
  .add(3)
  .add(5)
  .add(10)
  .add(7)
  .add(15)
  .add(12)
  .add(11)
  .add(13);
binarySearchTree.remove(15);

As you can see from the console.log() statements that I placed, we traversed to the right twice. Once we arrived at the sub-tree with root node 15, we removed that node (15) and linked the parent (10) with the left child (12). Now, since we have reached our base case in the recursion, all the recursive calls pop off the call stack. Each call returns its current node, which is the root of its sub-tree, i.e. 10 and then 6. The final node, with value 6, is what is set as this.root.
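The trace above can be reproduced with a stripped-down sketch that only handles the zero-child and one-child cases. It uses plain object nodes without parent pointers, and the helper names insert() and removeSimple() are made up for this example, so it is a simplification of the idea rather than the actual implementation.

```javascript
// Plain-object insert: smaller goes left, the rest goes right.
function insert(node, data) {
  if (node === null) {
    return { data: data, leftChild: null, rightChild: null };
  }
  if (node.data > data) {
    node.leftChild = insert(node.leftChild, data);
  } else {
    node.rightChild = insert(node.rightChild, data);
  }
  return node;
}

// Handles only the zero-child and one-child cases discussed so far.
function removeSimple(node, dataToDelete) {
  if (node === null) {
    return null; // Value not found in this sub-tree.
  }
  if (node.data === dataToDelete) {
    if (node.leftChild === null) {
      return node.rightChild; // Leaf (returns null) or only a right child.
    }
    if (node.rightChild === null) {
      return node.leftChild; // Only a left child.
    }
    throw new Error("Two children: covered by case 3.");
  }
  // Each call hands its (possibly updated) sub-tree root back to its caller.
  if (node.data > dataToDelete) {
    node.leftChild = removeSimple(node.leftChild, dataToDelete);
  } else {
    node.rightChild = removeSimple(node.rightChild, dataToDelete);
  }
  return node;
}

var root = [6, 3, 5, 10, 7, 15, 12, 11, 13].reduce(insert, null);
root = removeSimple(root, 15);
console.log(root.data);                       // 6
console.log(root.rightChild.data);            // 10
console.log(root.rightChild.rightChild.data); // 12
```

The link between 10 and 12 is made by the assignment in the caller: the call that found 15 returns 12, and the call one level up stores that return value as the right child of 10.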

Hopefully the visual aid below will help with understanding how removeNode() works.

Case 3 (most difficult): Deleting a node with two children

To the readers who have been reading non-stop: I advise you to take a break. The upcoming section is probably the hardest. However, once we have conquered it, we will pretty much understand exactly how the binary search tree works. So, grab a nice cold drink, rest your eyes, etc and come back later.

Essentially, case three has too many possibilities. In the diagram below, if we delete ten, then how are we going to preserve the binary search tree? To make things clear, we need to somehow reduce case three to either case one or two. Easier said than done right?

First of all, we want to look for the minimum in the right sub-tree of 10. In the example below, this value would be 11. Why do we want the minimum of the right sub-tree? Why not simply the minimum of the left sub-tree? This is because, if it is the minimum in the right sub-tree, we can guarantee that its value will be BOTH greater than the maximum value in the left sub-tree (8) and less than or equal to every value in the right sub-tree. Thus, we are able to preserve the definition of a binary search tree.

Afterwards, all we need to do is detach 11 (set the leftChild of 12 to null). Replace data 10 with 11. Voila!

Alternatively, we can also find the maximum in the left sub-tree. In the example below, the value would be 8. It would be less than any of the values in the right sub-tree, so the BST will still be preserved. In this tutorial, we will be using the implementation that finds the maximum in the left sub-tree.
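Both replacement candidates can be located with a few lines of code. The sketch below builds the example tree used later in this post with plain object nodes; the helper names insert(), minNode() and maxNode() are assumptions for illustration.

```javascript
// Plain-object insert: smaller goes left, the rest goes right.
function insert(node, data) {
  if (node === null) {
    return { data: data, leftChild: null, rightChild: null };
  }
  if (node.data > data) {
    node.leftChild = insert(node.leftChild, data);
  } else {
    node.rightChild = insert(node.rightChild, data);
  }
  return node;
}

// Far-left node of a sub-tree: its smallest value.
function minNode(node) {
  while (node.leftChild !== null) {
    node = node.leftChild;
  }
  return node;
}

// Far-right node of a sub-tree: its largest value.
function maxNode(node) {
  while (node.rightChild !== null) {
    node = node.rightChild;
  }
  return node;
}

var root = [6, 3, 5, 1, 10, 8, 7, 12, 11, 13].reduce(insert, null);
var ten = root.rightChild; // the node we want to delete

console.log(minNode(ten.rightChild).data); // 11: smallest value above 10
console.log(maxNode(ten.leftChild).data);  // 8: largest value below 10
```

Either value can take the place of 10 without breaking the ordering, because nothing else in the tree sits between it and 10.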

Delete node with two children: How to approach the problem

In the previous section, we established that we can replace the removed value with the largest value in the left sub-tree. This means we need to start at the root of the left sub-tree and traverse to the right until the current node does not have a right child.

Let us consider the following example. Note that the example below is the same data set shown in the binary search tree implementation above. Just as in the diagram, we will be removing 10 in our example below. The left and right child are 8 and 12 respectively.

var removeTwoChildBst = BST();
removeTwoChildBst
  .add(6)
  .add(3)
  .add(5)
  .add(1)
  .add(10)
  .add(8)
  .add(7)
  .add(12)
  .add(11)
  .add(13);
removeTwoChildBst.remove(10);

Logically speaking, if we have arrived at the minimum value, the current node will NOT have a left child. Why? Because if a node has a left child, that means that there is a value less than the current node, right?

Conversely, if we iterate the left sub-tree and we want to find the maximum value, it will be the first element we come across that does NOT have a right child. Does this make sense? Fortunately, in the example above, we don’t have to look far. The root of the left sub-tree, 8, does not have a right child. This makes 8 the largest child in the left sub-tree, since it is the first node that does not have a right child.

Now that we identified the highest element in the left sub-tree, all we need to do is copy and set the value of Node 10 to 8. Afterwards, remove the original 8 and update the left-child reference of the new 8.

Delete node with two children: Examining the code

We start off at the root of the tree, which is 6. Since the data we want to remove (10) is greater than 6, we traverse to the right sub-tree, which has a root node of 10. Is 10 the data we are looking for? Yes! Therefore, we now need to take the following steps.

1. Find the maximum node in the left sub-tree, which in our example is 8.
2. Set the data of the current node (10) to 8. Therefore, the value 10 is overwritten by 8.
3. We now have a duplicate 8. Traverse the left sub-tree again and remove this duplicate 8. In the meantime, we will set the leftChild of the new node 8 to what remains of the left sub-tree, which is 7.

console.log("removing a node with two children ...");
// 1. Get maximum node in the left sub-tree.
var maxNodeLeftSubTree = getMaxNode(leftChild);
// 2. Set the data of the current node to that of the max node in the leftSub-tree
currentNode.data = maxNodeLeftSubTree.data;
// 3. Remove the node in the left sub-tree with the maximum value. Otherwise, there will be two copies.
console.log("Max node left sub-tree: " + maxNodeLeftSubTree.data);
// Remove the largest node on the left sub-tree.
var leftChild = removeNode.call(this, leftChild, maxNodeLeftSubTree.data);
// Re-assign left-child appropriately
currentNode.setLeftChild(leftChild);

I know I’ve said this many times, but I will say it again. The best way to learn is to download the code (available on GitHub) and run the code yourself. Make changes to it. Try writing a more efficient implementation. Anyway, I will leave the console output below.

traversing right:

removing a node with two children …

Max node left sub-tree: 8

remove a node with a left child

Data to link. Parent: 8. leftChild: 7

Current node: 8

Right removal data: 8

———————————————-

Current node: 6

Examining removeNode() method

In the implementation above, we take the following steps:

1. Set the variable maxNodeLeftSubTree to hold the maximum node in the left sub-tree.
2. Replace the data of the current node with the maxNodeLeftSubTree data.
3. Afterwards, remove the largest node in the left sub-tree. The recursive call should return the leftChild, which we need to set for the current node.

In the implementation, the removal process could be refactored by being extracted into its own function, but for the sake of completeness, I have included all the removal logic in the function.

This code might be the subject of confusion, since it is called recursively to not only set the root node, but also the left and right children of the current node. If you are not up to par on recursion, make sure you understand how recursion works in JavaScript before proceeding.

What does the recursive method removeNode() return?

It returns the current node. At the top of the tree, the current node is the root, so the outermost call always returns the root node. Hence we set

this.root = removeNode.call(this, this.root, dataToRemove)

Once again, before proceeding any further, I highly recommend that you download the entire source from GitHub and follow along while reading the post.

Note that in the removeNode() method, each call returns a reference to the node it was given (currentNode, which is the second argument passed to .call()). On top of this, it removes the node whose value is equal to the dataToRemove argument. This enables us to preserve the binary search tree. Remember what I said about each sub-tree being its own binary search tree? Using recursion enables us to break apart the big problem into smaller problems.
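Putting the three cases together, the whole removal can be condensed into one short recursive function. This is a simplified sketch on plain object nodes without parent pointers, type checks or logging, so it is not the article's implementation. It does follow the same strategy for case three: copy the maximum of the left sub-tree and then delete the duplicate. The insert() and inOrder() helpers are assumptions added for the demonstration.

```javascript
function insert(node, data) {
  if (node === null) {
    return { data: data, leftChild: null, rightChild: null };
  }
  if (node.data > data) {
    node.leftChild = insert(node.leftChild, data);
  } else {
    node.rightChild = insert(node.rightChild, data);
  }
  return node;
}

function getMaxNode(node) {
  return node.rightChild === null ? node : getMaxNode(node.rightChild);
}

// Returns the root of the sub-tree after the value has been removed from it.
function removeNode(node, dataToDelete) {
  if (node === null) {
    return null; // Value not found in this sub-tree.
  }
  if (node.data === dataToDelete) {
    if (node.leftChild === null) {
      return node.rightChild; // Case 1 (returns null) or case 2 with a right child.
    }
    if (node.rightChild === null) {
      return node.leftChild; // Case 2 with a left child.
    }
    // Case 3: copy the largest value of the left sub-tree, then delete the duplicate.
    var maxNodeLeftSubTree = getMaxNode(node.leftChild);
    node.data = maxNodeLeftSubTree.data;
    node.leftChild = removeNode(node.leftChild, maxNodeLeftSubTree.data);
    return node;
  }
  if (node.data > dataToDelete) {
    node.leftChild = removeNode(node.leftChild, dataToDelete);
  } else {
    node.rightChild = removeNode(node.rightChild, dataToDelete);
  }
  return node;
}

// Collects the values in sorted order; used here to check the result.
function inOrder(node, out) {
  if (node !== null) {
    inOrder(node.leftChild, out);
    out.push(node.data);
    inOrder(node.rightChild, out);
  }
  return out;
}

var root = [6, 3, 5, 1, 10, 8, 7, 12, 11, 13].reduce(insert, null);
root = removeNode(root, 10);
console.log(root.rightChild.data);           // 8
console.log(root.rightChild.leftChild.data); // 7
console.log(inOrder(root, []).join(","));    // 1,3,5,6,7,8,11,12,13
```

The in-order output is still sorted after the removal, which is a quick way to confirm that the tree is still a binary search tree.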

Binary Search Tree Implementation: the max() method

Now that we have gone through the difficult stuff, it should be smooth sailing from now on. Using the assumption of a binary search tree, fetch the node on the far right of the binary search tree. Why? Because the item with the highest value is on the far right! As with all the methods, you can choose to do this recursively or iteratively. If this seems like a walk in the park, congratulations. After all the hard work, you now understand binary search trees. Now, without any further ado, let's jump into the implementation details.

/**
 * Get maximum node from a certain node by heading to the
 * far right of the binary search tree.
 * The node that doesn't have a right child is technically the maximum value.
 */
function getMaxNode(currentNode) {
  var rightChild = currentNode.getRightChild();
  if (rightChild != null) {
    currentNode = getMaxNode(rightChild);
  }
  return currentNode;
}

Binary Search Tree Implementation: the min() method

This might seem so trivial to you now that it almost seems repetitive. But for the sake of completeness, I am going to repeat myself. Go and fetch the node on the far left of the binary search tree. Almost seems like a walk in the park right?

/**
 * Get minimum node from a certain node.
 * Logically speaking, to get the minimum value in a sub-tree,
 * we need to go to the far left.
 * Go far left via recursive call until there is no left child.
 * Return the node that doesn't have a left child.
 */
function getMinNode(currentNode) {
  var leftChild = currentNode.getLeftChild();
  if (leftChild != null) {
    currentNode = getMinNode(leftChild);
  }
  return currentNode;
}
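The public max() and min() methods themselves are not shown in this post, only the node-level helpers. As a rough sketch of how they might wrap those helpers, the example below uses iterative loops and returns null for an empty tree. The function names, the plain object tree and the null-on-empty behaviour are all assumptions.

```javascript
// Iterative helpers; nodes are plain objects here for brevity.
function getMaxNodeIterative(currentNode) {
  while (currentNode.rightChild !== null) {
    currentNode = currentNode.rightChild;
  }
  return currentNode;
}

function getMinNodeIterative(currentNode) {
  while (currentNode.leftChild !== null) {
    currentNode = currentNode.leftChild;
  }
  return currentNode;
}

// Hypothetical public wrappers: return the data, or null for an empty tree.
function max(tree) {
  return tree.root === null ? null : getMaxNodeIterative(tree.root).data;
}

function min(tree) {
  return tree.root === null ? null : getMinNodeIterative(tree.root).data;
}

var tree = {
  root: {
    data: 6,
    leftChild: { data: 3, leftChild: null, rightChild: null },
    rightChild: {
      data: 10,
      leftChild: null,
      rightChild: { data: 15, leftChild: null, rightChild: null }
    }
  }
};

console.log(max(tree));           // 15
console.log(min(tree));           // 3
console.log(max({ root: null })); // null
```

The empty-tree check matters because both helpers assume they are handed an actual node and would throw if given null.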

Conclusion and Binary Search Tree Implementation Source Code

Phew! This was probably the lengthiest post I have written to date. I am not much of a writer, but I aspire to become better at the art. Here is the binary search tree implementation JavaScript source code! Hopefully, this will give readers a better understanding of how a binary search tree implementation works under the hood.

Please note that the code can be optimized further, but I intentionally made the removeNode() function long so that the reader can focus on the logic instead of travelling to different function calls to read the entire logic.

I hope that this read was not only informative, but also fun to read. Once again, please let me know if there is anything else I can do to make the content more digestible and fun to read. Thank you very much for continuing to take time out of your schedules to read. See you soon in the next post and happy coding guys!