A segment tree is a very popular data structure for answering range queries on an array that also receives updates. We will explore segment trees with the help of the following example.

Question:

You are given an array, arr[0..n-1] ( size is n, and 0 indexed ). You have to answer multiple queries, where each query can be of two types:

1. Given (l, r), find the maximum subarray sum in the range l..r (inclusive).
2. Given x and y, set arr[x] = y.

Intuition:

A segment tree, as the name suggests, is a tree whose nodes store information about segments (ranges) of the array. The root covers the whole array and stores the information about it. Each internal node covering l … r has two children: the left child covers the left half ( l … mid ) and the right child covers the right half ( mid+1 … r ). The following diagram will help in understanding this concept more clearly:

Segment Tree for an array of size 10. The topmost node contains information about the complete range 0..9. The figure shows how the range is getting halved after each level.

It is now clear how the segments are divided and how parent and child nodes relate. When a query is asked, only the necessary nodes are visited. For example, if a query about the range (2, 8) is asked, the following nodes are visited:

Green marks the nodes that are only visited on the way down; red marks the nodes whose complete information is used in the answer.

Here 2–2 + 3–4 + 5–7 + 8–8 together make up the complete range 2–8. This is the basic idea behind segment trees.
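The decomposition above can be reproduced with a few lines of C++. This is only a sketch: the names decompose and coveringSegments are made up for illustration, and it assumes the same midpoint split ( l … mid, mid+1 … r ) as the diagram.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Collect the node ranges that together cover the query [l, r].
// [ss, se] is the range of the current node; the root call uses [0, n-1].
// (decompose / coveringSegments are hypothetical helper names.)
void decompose(int ss, int se, int l, int r,
               std::vector<std::pair<int, int>>& out) {
    assert(ss <= se);
    if (r < ss || se < l) return;      // no overlap: stop here
    if (l <= ss && se <= r) {          // node fully inside the query: take it whole
        out.push_back({ss, se});
        return;
    }
    int mid = ss + (se - ss) / 2;      // partial overlap: split like the tree does
    decompose(ss, mid, l, r, out);
    decompose(mid + 1, se, l, r, out);
}

// Convenience wrapper for an array of size n.
std::vector<std::pair<int, int>> coveringSegments(int n, int l, int r) {
    std::vector<std::pair<int, int>> out;
    decompose(0, n - 1, l, r, out);
    return out;
}
```

Calling coveringSegments(10, 2, 8) yields the four ranges 2–2, 3–4, 5–7 and 8–8, in that order.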

Implementation:

Now that we know how the segment tree is formed, how do we solve the given problem?

First, we decide what information about a segment we need in order to solve the above problem. We have to keep in mind that, given two adjacent segments, we should be able to merge them using only the information they store and completely (and correctly) derive the combined segment.

For this question, we will store four values for any segment:

1. Maximum prefix sum
2. Maximum suffix sum
3. Total sum
4. The maximum subarray sum in the segment

Based on these values we can derive the information about the parent node:

1. The maximum prefix sum of the parent is the maximum of (maximum prefix sum of the left node, total sum of the left node + maximum prefix sum of the right node).
2. The maximum suffix sum of the parent is the maximum of (maximum suffix sum of the right node, total sum of the right node + maximum suffix sum of the left node).
3. The total sum is the sum of the total sums of both nodes.
4. The maximum subarray sum is the maximum of (maximum subarray sum of the left node, maximum subarray sum of the right node, maximum suffix sum of the left node + maximum prefix sum of the right node). The last term covers subarrays that cross the boundary between the two children.
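The merge rules can be written down as a short C++ sketch. The struct name Node and its field names are assumptions made for illustration; merge and create are the helper names used later in this article. It also assumes subarrays are non-empty and that the sums fit in long long.

```cpp
#include <algorithm>
#include <cassert>

// Field names are hypothetical; they mirror the four stored values.
struct Node {
    long long prefix;  // maximum prefix sum
    long long suffix;  // maximum suffix sum
    long long total;   // total sum
    long long best;    // maximum subarray sum
};

// A leaf holds one element, so all four values equal that element
// (assumption: a subarray must contain at least one element).
Node create(long long value) {
    return {value, value, value, value};
}

// Combine two adjacent segments: left covers l..mid, right covers mid+1..r.
Node merge(const Node& left, const Node& right) {
    Node parent;
    parent.prefix = std::max(left.prefix, left.total + right.prefix);
    parent.suffix = std::max(right.suffix, right.total + left.suffix);
    parent.total = left.total + right.total;
    // Best subarray lies in the left child, in the right child,
    // or crosses the boundary (left suffix + right prefix).
    parent.best = std::max({left.best, right.best, left.suffix + right.prefix});
    return parent;
}
```

For example, merging the segments [-1, 5] and [5, -1] gives a maximum subarray sum of 10 (the two 5s in the middle), which only the crossing term can produce.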

Now that we know how the nodes are merged, we need to know how to store them efficiently.

The segment tree is stored in an array with the root starting at 0. The left child of the ith node is stored at 2*i+1 and the right child is stored at 2*i+2.
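To get a feel for how large the array has to be with this layout, here is a small C++ sketch that walks the same recursive split and reports the highest index it touches. The function name highestIndex is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Highest array index used when the node at index si covers [ss, se]
// and its children are stored at 2*si+1 and 2*si+2.
// (highestIndex is a hypothetical helper name.)
int highestIndex(int ss, int se, int si) {
    assert(ss <= se);
    if (ss == se) return si;                       // leaf: no children below
    int mid = ss + (se - ss) / 2;
    return std::max(highestIndex(ss, mid, 2 * si + 1),
                    highestIndex(mid + 1, se, 2 * si + 2));
}
```

For an array of size 10, highestIndex(0, 9, 0) is 24, so an array with only 2*10 slots would not be enough for this layout even though the tree has just 19 nodes. Allocating 4*N slots is a safe upper bound.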

Building:

The building of the tree is a simple recursive process. First, build the child nodes, then combine them to get the parent node. The code is:

The build function takes three parameters: ss (segment start), se (segment end) and si (segment index, i.e. the index of the segment in the tree). The merge function is a helper function that combines the two child nodes to get the parent node. The create function is a helper that creates a new node when we reach a leaf.
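A minimal C++ sketch of build, merge and create along these lines is shown below. It is not the original code of this article: the Node struct, its field names and the buildTree wrapper are assumptions, and build takes the input array and the tree as two extra explicit parameters instead of relying on globals.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical node layout: the four values stored per segment.
struct Node { long long prefix, suffix, total, best; };

// Leaf node for a single element.
Node create(long long v) { return {v, v, v, v}; }

// Parent node from its left child l and right child r.
Node merge(const Node& l, const Node& r) {
    return {std::max(l.prefix, l.total + r.prefix),
            std::max(r.suffix, r.total + l.suffix),
            l.total + r.total,
            std::max({l.best, r.best, l.suffix + r.prefix})};
}

// ss = segment start, se = segment end, si = index of this segment in tree.
void build(const std::vector<long long>& arr, std::vector<Node>& tree,
           int ss, int se, int si) {
    if (ss == se) {                          // leaf: a single array element
        tree[si] = create(arr[ss]);
        return;
    }
    int mid = ss + (se - ss) / 2;
    build(arr, tree, ss, mid, 2 * si + 1);       // build the left child
    build(arr, tree, mid + 1, se, 2 * si + 2);   // build the right child
    tree[si] = merge(tree[2 * si + 1], tree[2 * si + 2]);
}

// Hypothetical wrapper: allocates 4*N slots and builds from the root.
std::vector<Node> buildTree(const std::vector<long long>& arr) {
    assert(!arr.empty());
    std::vector<Node> tree(4 * arr.size());
    build(arr, tree, 0, static_cast<int>(arr.size()) - 1, 0);
    return tree;
}
```

For the array {-2, 1, -3, 4, -1, 2, 1, -5, 4}, the root tree[0] ends up with total 1, maximum prefix 2, maximum suffix 5 and maximum subarray sum 6 (the subarray 4, -1, 2, 1).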

The complexity of build is O(N) because the tree has ~ 2*N nodes and each node needs only constant time for computation.

Query:

The steps to perform a query are pretty simple too.

1. Check whether this node has any intersection with the query interval. If there is no intersection, return (no need to traverse any further down this road).
2. If this node is entirely inside the query interval, return this node completely (no need to traverse any further down this road either).
3. If the node has a partial intersection, repeat the process for the left and right child, merge the answers you got from them and return the result.

The code for this is:
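A minimal C++ sketch of these three steps follows; it is not the article's original code. It repeats the node, merge and build helpers so that it compiles on its own. The names Node, NEG and neutral are assumptions: neutral() is a "nothing here" node returned when there is no intersection, and it only works as long as all sums stay far away from the long long limits.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

struct Node { long long prefix, suffix, total, best; };

// "Minus infinity" that can still be added to without overflow (assumption:
// real sums are much smaller in magnitude than LLONG_MAX / 4).
const long long NEG = LLONG_MIN / 4;

Node create(long long v) { return {v, v, v, v}; }
Node neutral() { return {NEG, NEG, 0, NEG}; }   // merging with it changes nothing

Node merge(const Node& l, const Node& r) {
    return {std::max(l.prefix, l.total + r.prefix),
            std::max(r.suffix, r.total + l.suffix),
            l.total + r.total,
            std::max({l.best, r.best, l.suffix + r.prefix})};
}

void build(const std::vector<long long>& arr, std::vector<Node>& tree,
           int ss, int se, int si) {
    if (ss == se) { tree[si] = create(arr[ss]); return; }
    int mid = ss + (se - ss) / 2;
    build(arr, tree, ss, mid, 2 * si + 1);
    build(arr, tree, mid + 1, se, 2 * si + 2);
    tree[si] = merge(tree[2 * si + 1], tree[2 * si + 2]);
}

// Returns the merged node for arr[l..r]; the node at index si covers [ss, se].
Node query(const std::vector<Node>& tree, int ss, int se, int si, int l, int r) {
    assert(l <= r);
    if (r < ss || se < l) return neutral();      // step 1: no intersection
    if (l <= ss && se <= r) return tree[si];     // step 2: entirely inside
    int mid = ss + (se - ss) / 2;                // step 3: partial intersection
    return merge(query(tree, ss, mid, 2 * si + 1, l, r),
                 query(tree, mid + 1, se, 2 * si + 2, l, r));
}
```

With the array {-2, 1, -3, 4, -1, 2, 1, -5, 4} built into tree (size 4*N), query(tree, 0, 8, 0, 0, 8).best is 6 and query(tree, 0, 8, 0, 7, 8).best is 4.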

The complexity of a query is O(logN). In the worst case, it travels through the whole height of the tree (the height is about logN, since the range is halved at every level). It is important to note that the constant factor is approximately 4: in the worst case the traversal follows about 4 different branches down to the leaves (in other cases it returns midway, as soon as a node is entirely inside or entirely outside the query interval). The proof is explained here.

Update:

The last part remains, i.e. changing the value at an index. For an update, we traverse the tree in the same manner as in a query, going down towards the leaf that holds the index and re-merging every node on the way back up.

The problem here only has point updates. If range updates were required, this way of updating would not be feasible.

The time complexity of a point update is O(logN): only one element is updated, so only one path from the root to a leaf is traced.
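The point update can be sketched in C++ as follows. As before, this is an illustrative sketch rather than the article's original code: the Node struct and the explicit tree parameter are assumptions, and the helpers are repeated so that the block compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Node { long long prefix, suffix, total, best; };

Node create(long long v) { return {v, v, v, v}; }

Node merge(const Node& l, const Node& r) {
    return {std::max(l.prefix, l.total + r.prefix),
            std::max(r.suffix, r.total + l.suffix),
            l.total + r.total,
            std::max({l.best, r.best, l.suffix + r.prefix})};
}

void build(const std::vector<long long>& arr, std::vector<Node>& tree,
           int ss, int se, int si) {
    if (ss == se) { tree[si] = create(arr[ss]); return; }
    int mid = ss + (se - ss) / 2;
    build(arr, tree, ss, mid, 2 * si + 1);
    build(arr, tree, mid + 1, se, 2 * si + 2);
    tree[si] = merge(tree[2 * si + 1], tree[2 * si + 2]);
}

// Sets the element at position pos to value. Only the nodes on the single
// root-to-leaf path that contains pos are recomputed.
void update(std::vector<Node>& tree, int ss, int se, int si,
            int pos, long long value) {
    assert(ss <= pos && pos <= se);
    if (ss == se) {                      // reached the leaf for pos
        tree[si] = create(value);
        return;
    }
    int mid = ss + (se - ss) / 2;
    if (pos <= mid) update(tree, ss, mid, 2 * si + 1, pos, value);
    else            update(tree, mid + 1, se, 2 * si + 2, pos, value);
    tree[si] = merge(tree[2 * si + 1], tree[2 * si + 2]);  // refresh on the way up
}
```

Starting from {-2, 1, -3, 4, -1, 2, 1, -5, 4} (maximum subarray sum 6), setting index 7 to 5 raises the answer at the root to 15, because the subarray 4, -1, 2, 1, 5, 4 is now the best one.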

If a range update were done this way, one element at a time, its complexity would become O(NlogN), because a root-to-leaf path would be traced for each of up to O(N) elements. For range updates, lazy propagation is used instead.


Conclusion

The segment tree has about 2*N nodes, so it requires O(N) memory. With the 2*i+1 / 2*i+2 layout some indices can go beyond 2*N, so to avoid segfaults it is always safe to initialize the size as 4*N.

The build takes O(N), whereas a query and a point update each take O(logN).

A segment tree can help in solving any range query question if you figure out what information to store and how to merge the segments correctly.

A segment tree has a higher constant factor and is hence slower than a BIT (Binary Indexed Tree), but it is more powerful: it can be used to solve complex questions that a BIT cannot handle directly (example: range minimum query).

Do try SPOJ’s GSS1 and GSS3, or any other segment tree question.

PS: If you wanna check out about bit trees :)