To dive into this data structure, we will take up a question that will form the base of the article.

Problem:

You are given an immutable array (an array that doesn’t change) and a set of queries. Each query contains a range l to r. You need to calculate an idempotent function f over l to r, i.e. find the value of:

f(arr[l],arr[l+1],…arr[r])

An idempotent function is a function whose value does not change when it is applied multiple times to the same input, i.e. f(x, x) = x. Examples of such functions are min, max, gcd, lcm, etc.

For example, min(min(2,3,4)) is the same as min(2,3,4).

Solution:

There are several solutions possible for these types of questions. You can use a segment tree, which can be built in time linear in N and answers each query in log(N) time.

But note that the array given here is immutable, i.e. there are no updates to it, and the function given is idempotent. Due to these two conditions, we can use another powerful data structure that solves this question with N*log(N) preprocessing and constant query time.

Intuition:

The sparse table is nothing but a special type of dp table. In this data structure, table[i][j] stores the value of the function for the range [i, i + 2^j), i.e. the 2^j elements from index i to i + 2^j - 1.

Why powers of two and not any other number? Every positive integer can be written in exactly one way as a sum of distinct powers of 2 (its binary representation). So any range can be split uniquely into consecutive blocks, starting from l, whose lengths are distinct powers of 2 in decreasing order.

Let's take an example. Suppose the range given to you is 3 to 16. The size of the range is 14, and 14 can be written as 8 + 4 + 2. So we can split the range as [3,10] + [11,14] + [15,16], and read the values of these ranges directly from the table as table[3][3], table[11][2] and table[15][1]. We just need to combine them. Instead of combining 14 individual elements, we combine only 3 precomputed values. Since a range of size at most N splits into at most about log(N) such blocks, the sparse table can give your result in log(N) time (we will see ahead how to reduce this log(N) to constant for idempotent functions).
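To make the decomposition concrete, here is a small Python sketch that splits an inclusive range into power-of-two blocks, largest first. The function name and the (start, j) return format are illustrative choices, not part of the original article; each pair corresponds to a lookup table[start][j].

```python
def split_into_power_of_two_blocks(l, r):
    """Split the inclusive range [l, r] into consecutive blocks whose
    lengths are distinct powers of two, largest first.

    Returns a list of (start, j) pairs; each block covers the indices
    start .. start + 2**j - 1.
    """
    blocks = []
    length = r - l + 1
    # Exponent of the highest power of two that is <= length (-1 if empty).
    j = length.bit_length() - 1
    start = l
    while j >= 0:
        # Take this block only if it still fits in what remains of the range.
        if (1 << j) <= r - start + 1:
            blocks.append((start, j))
            start += 1 << j
        j -= 1
    return blocks


print(split_into_power_of_two_blocks(3, 16))  # [(3, 3), (11, 2), (15, 1)]
```

The output matches the example above: blocks of length 8, 4 and 2 starting at 3, 11 and 15.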

Building the table:

The construction of a sparse table is very simple. You initialize the base of the table, that is table[i][0], with the array values themselves (table[i][0] = arr[i], a range of length 1), and for the other values you build the table bottom-up:

table[i][j] = f( table[i][j-1], table[i+2^(j-1)][j-1] )

Basically, an interval of length x is divided into two intervals of length x/2 each. So the value for l to l + x - 1 is calculated from the already calculated values for l to l + x/2 - 1 and l + x/2 to l + x - 1.

The actual implementation is as follows:
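A minimal Python sketch of this construction, assuming min as the default function. The names build_sparse_table, arr and f are illustrative and not taken from the author's code; entries whose range would run past the end of the array are left as None.

```python
def build_sparse_table(arr, f=min):
    """Build table[i][j] = f over arr[i .. i + 2**j - 1].

    Entries whose range would extend past the end of arr stay None.
    """
    n = len(arr)
    # Number of levels: the largest j used satisfies 2**j <= n.
    levels = max(1, n.bit_length())
    table = [[None] * levels for _ in range(n)]

    # Base case: ranges of length 1 are the array values themselves.
    for i in range(n):
        table[i][0] = arr[i]

    # Bottom-up: a range of length 2**j is two adjacent ranges of length 2**(j-1).
    for j in range(1, levels):
        half = 1 << (j - 1)
        for i in range(n - (1 << j) + 1):
            table[i][j] = f(table[i][j - 1], table[i + half][j - 1])
    return table


table = build_sparse_table([5, 2, 4, 7, 1, 3])
print(table[0][2])  # min of arr[0..3] -> 2
print(table[2][2])  # min of arr[2..5] -> 1
```

The two nested loops visit about log(N) levels with at most N entries each, which is where the N*log(N) build cost comes from.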

The cost of building the table would be N*log(N).

Answering the queries:

As we have already discussed, the size of any range can be split up into a sum of distinct powers of 2. Since we have already precalculated the values for those blocks, we can easily combine them to answer the query.

To get the answer, you start from the highest power of 2 and keep going down. If this power of two is less than or equal to the current size of the range, we can take this sub-range completely, change l to l + power of 2, and move on to the next power of two. Do this until l has moved past r.

The implementation would be something like this:

assuming the table stores the sum of ranges
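A sketch of this log(N) query in Python, with the table storing range sums. The function names, the inclusive (l, r) convention and the identity parameter are assumptions made for this illustration; the build step is repeated so the block runs on its own, and the array is assumed to be non-empty.

```python
import operator


def build_sparse_table(arr, f):
    """table[i][j] = f over arr[i .. i + 2**j - 1] (None if out of range)."""
    n = len(arr)
    levels = max(1, n.bit_length())
    table = [[None] * levels for _ in range(n)]
    for i in range(n):
        table[i][0] = arr[i]
    for j in range(1, levels):
        half = 1 << (j - 1)
        for i in range(n - (1 << j) + 1):
            table[i][j] = f(table[i][j - 1], table[i + half][j - 1])
    return table


def query_log(table, l, r, f, identity):
    """Combine non-overlapping power-of-two blocks covering [l, r] inclusive."""
    result = identity
    j = len(table[0]) - 1  # start from the highest power of two stored
    while j >= 0:
        # Take the block of length 2**j only if it fits in what remains.
        if (1 << j) <= r - l + 1:
            result = f(result, table[l][j])
            l += 1 << j
        j -= 1
    return result


arr = [5, 2, 4, 7, 1, 3]
sums = build_sparse_table(arr, operator.add)
print(query_log(sums, 1, 4, operator.add, 0))  # 2 + 4 + 7 + 1 = 14
print(query_log(sums, 0, 5, operator.add, 0))  # 22
```

Because the blocks never overlap, this version works for any associative function, including non-idempotent ones such as sum.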

Here the complexity of the query would be log(N)

Optimizing to constant time:

We can see that for idempotent functions, any query can be written as:

let x be the highest power of 2 that is less than or equal to the size of the range.

f(l, r) = f( f(l, l+x-1), f(r-x+1, r) ), with both l and r inclusive

Note that some of the elements in the range are covered by both halves. For example, a query for 3 to 16 (size 14, so x = 8) would be split as 3 to 10 and 9 to 16, with elements 9 and 10 counted twice. But min(arr[3]..arr[16]) is still the same as min(min(arr[3]..arr[10]), min(arr[9]..arr[16])).

So our query reduces to just:
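A minimal Python sketch of the constant-time query, assuming min as the function and inclusive (l, r) indices with l <= r. The names are illustrative, and the build step is repeated so the block runs on its own.

```python
def build_sparse_table(arr, f=min):
    """table[i][j] = f over arr[i .. i + 2**j - 1] (None if out of range)."""
    n = len(arr)
    levels = max(1, n.bit_length())
    table = [[None] * levels for _ in range(n)]
    for i in range(n):
        table[i][0] = arr[i]
    for j in range(1, levels):
        half = 1 << (j - 1)
        for i in range(n - (1 << j) + 1):
            table[i][j] = f(table[i][j - 1], table[i + half][j - 1])
    return table


def query_idempotent(table, l, r, f=min):
    """Answer f over [l, r] inclusive using two (possibly overlapping) blocks."""
    # j is the exponent of the highest power of two <= the range size.
    j = (r - l + 1).bit_length() - 1
    # One block starts at l, the other ends at r; together they cover [l, r].
    return f(table[l][j], table[r - (1 << j) + 1][j])


arr = [5, 2, 4, 7, 1, 3]
table = build_sparse_table(arr)
print(query_idempotent(table, 0, 3))  # min(5, 2, 4, 7) = 2
print(query_idempotent(table, 2, 3))  # min(4, 7) = 4
print(query_idempotent(table, 0, 5))  # 1
```

Here the exponent is computed with int.bit_length; precomputing a table of logarithms is a common alternative in languages without such a built-in.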

We can use the constant query time only for idempotent functions like min, max, gcd, etc. For others like sum, we need log(N) time for query processing.
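To see why the overlap trick is limited to idempotent functions, the following Python sketch combines the first and last blocks of length x over a whole array and compares the result with folding the function over every element. The helper name overlap_trick_is_correct is hypothetical, and the array is assumed to be non-empty.

```python
import operator
from functools import reduce
from math import gcd


def overlap_trick_is_correct(arr, combine):
    """Check whether combining two overlapping blocks of length x
    (x = highest power of two <= len(arr)) equals combining all of arr."""
    n = len(arr)
    x = 1 << (n.bit_length() - 1)
    left = reduce(combine, arr[:x])        # block starting at the left end
    right = reduce(combine, arr[n - x:])   # block ending at the right end
    return combine(left, right) == reduce(combine, arr)


arr = [4, 1, 7, 3, 9, 2]
print(overlap_trick_is_correct(arr, min))           # True
print(overlap_trick_is_correct(arr, gcd))           # True
print(overlap_trick_is_correct(arr, operator.add))  # False: 7 and 3 are counted twice
```

For sum, the two blocks add up to 15 + 21 = 36 instead of 26, because the overlapping elements contribute twice.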

Conclusion and Ending Notes:

The sparse table can be used to answer range queries, in constant time for idempotent functions and log(N) for other functions, with N*log(N) preprocessing time. The sparse table doesn’t support updates: after any update we need to rebuild the entire table, so if the array changes it is advisable to switch to another data structure such as a segment tree. On a static array, though, the sparse table answers idempotent queries in constant time, compared with log(N) per query for a segment tree.

I recommend solving this problem from practice using sparse tables. The editorial doesn’t mention the sparse table method properly.

Here is the link to my submission. Check it only if you are stuck.

Reference: