SQL queries with an order by clause do not need to sort the result explicitly if the relevant index already delivers the rows in the required order. That means the same index that is used for the where clause must also cover the order by clause.

As an example, consider the following query that selects yesterday’s sales ordered by sale date and product ID:

SELECT sale_date, product_id, quantity FROM sales WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY ORDER BY sale_date, product_id

There is already an index on SALE_DATE that can be used for the where clause. The database must, however, perform an explicit sort operation to satisfy the order by clause:

DB2 Explain Plan ------------------------------------------------------------ ID | Operation | Rows | Cost 1 | RETURN | | 682 2 | TBSCAN | 394 of 394 (100.00%) | 682 3 | SORT | 394 of 394 (100.00%) | 682 4 | FETCH SALES | 394 of 394 (100.00%) | 682 5 | IXSCAN SALES_DATE | 394 of 1009326 ( .04%) | 19 Predicate Information 5 - START (Q1.SALE_DATE = (CURRENT DATE - 1 DAYS)) STOP (Q1.SALE_DATE = (CURRENT DATE - 1 DAYS)) The where -clause was changed like this to be DB2 compliant: WHERE sale_date > CURRENT_DATE - 1 DAY . Oracle --------------------------------------------------------------- |Id | Operation | Name | Rows | Cost | --------------------------------------------------------------- | 0 | SELECT STATEMENT | | 320 | 18 | | 1 | SORT ORDER BY | | 320 | 18 | | 2 | TABLE ACCESS BY INDEX ROWID| SALES | 320 | 17 | |*3 | INDEX RANGE SCAN | SALES_DATE | 320 | 3 | ---------------------------------------------------------------

An INDEX RANGE SCAN delivers the result in index order anyway. To take advantage of this fact, we just have to extend the index definition so it corresponds to the order by clause:

DROP INDEX sales_date

CREATE INDEX sales_dt_pr ON sales (sale_date, product_id)

SELECT sale_date, product_id, quantity FROM sales WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY ORDER BY sale_date, product_id

DB2 Explain Plan ----------------------------------------------------------- ID | Operation | Rows | Cost 1 | RETURN | | 688 2 | FETCH SALES | 394 of 394 (100.00%) | 688 3 | IXSCAN SALES_DT_PR | 394 of 1009326 ( .04%) | 24 Predicate Information 3 - START (Q1.SALE_DATE = (CURRENT DATE - 1 DAYS)) STOP (Q1.SALE_DATE = (CURRENT DATE - 1 DAYS)) Oracle --------------------------------------------------------------- |Id | Operation | Name | Rows | Cost | --------------------------------------------------------------- | 0 | SELECT STATEMENT | | 320 | 300 | | 1 | TABLE ACCESS BY INDEX ROWID| SALES | 320 | 300 | |*2 | INDEX RANGE SCAN | SALES_DT_PR | 320 | 4 | ---------------------------------------------------------------

The sort operation SORT ORDER BY disappeared from the execution plan even though the query still has an order by clause. The database exploits the index order and skips the explicit sort operation.

Important If the index order corresponds to the order by clause, the database can omit the explicit sort operation.

Even though the new execution plan has fewer operations, the cost value has increased considerably because the clustering factor of the new index is worse (see “Automatically Optimized Clustering Factor”). At this point, it should just be noted that the cost value is not always a good indicator of the execution effort.

Automatically Optimized Clustering Factor The Oracle database keeps the clustering factor at a minimum by considering the ROWID for the index order. Whenever two index entries have the same key values, the ROWID decides upon their final order. The index is therefore also ordered according to the table order and thus has the smallest possible clustering factor because the ROWID represents the physical address of table row. By adding another column to an index, you insert a new sort criterion before the ROWID . The database has less freedom in aligning the index entries according to the table order so the index clustering factor can only get worse. Regardless, it is still possible that the index order roughly corresponds to the table order. The sales of a day are probably still clustered together in the table as well as in the index—even though their sequence is not exactly the same anymore. The database has to read the table blocks multiple times when using the SALE_DT_PR index—but these are just the same table blocks as before. Due to the caching of frequently accessed data, the performance impact could be considerably lower than indicated by the cost values.

For this optimization, it is sufficient that the scanned index range is sorted according to the order by clause. Thus the optimization also works for this particular example when sorting by PRODUCT_ID only:

SELECT sale_date, product_id, quantity FROM sales WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY ORDER BY product_id

In Figure 6.1 we can see that the PRODUCT_ID is the only relevant sort criterion in the scanned index range. Hence the index order corresponds to the order by clause in this index range so that the database can omit the sort operation.

Figure 6.1 Sort Order in the Relevant Index Range

This optimization can cause unexpected behavior when extending the scanned index range:

SELECT sale_date, product_id, quantity FROM sales WHERE sale_date >= TRUNC(sysdate) - INTERVAL '1' DAY ORDER BY product_id

This query does not retrieve yesterday’s sales but all sales since yesterday. That means it covers several days and scans an index range that is not exclusively sorted by the PRODUCT_ID . If we look at Figure 6.1 again and extend the scanned index range to the bottom, we can see that there are again smaller PRODUCT_ID values. The database must therefore use an explicit sort operation to satisfy the order by clause.

DB2 Explain Plan ------------------------------------------------------------- ID | Operation | Rows | Cost 1 | RETURN | | 688 2 | TBSCAN | 394 of 394 (100.00%) | 688 3 | SORT | 394 of 394 (100.00%) | 688 4 | FETCH SALES | 394 of 394 (100.00%) | 688 5 | IXSCAN SALES_DT_PR | 394 of 1009326 ( .04%) | 24 Predicate Information 5 - START ((CURRENT DATE - 1 DAYS) <= Q1.SALE_DATE) Oracle --------------------------------------------------------------- |Id |Operation | Name | Rows | Cost | --------------------------------------------------------------- | 0 |SELECT STATEMENT | | 320 | 301 | | 1 | SORT ORDER BY | | 320 | 301 | | 2 | TABLE ACCESS BY INDEX ROWID| SALES | 320 | 300 | |*3 | INDEX RANGE SCAN | SALES_DT_PR | 320 | 4 | ---------------------------------------------------------------

If the database uses a sort operation even though you expected a pipelined execution, it can have two reasons: (1) the execution plan with the explicit sort operation has a better cost value; (2) the index order in the scanned index range does not correspond to the order by clause.

On my Own Behalf I make my living from training, other SQL related services and selling my book. Learn more at https://winand.at/.

A simple way to tell the two cases apart is to use the full index definition in the order by clause—that means adjusting the query to the index in order to eliminate the second cause. If the database still uses an explicit sort operation, the optimizer prefers this plan due to its cost value; otherwise the database cannot use the index for the original order by clause.

Tip Use the full index definition in the order by clause to find the reason for an explicit sort operation.

In both cases, you might wonder if and how you could possibly reach a pipelined order by execution. For this you can execute the query with the full index definition in the order by clause and inspect the result. You will often realize that you have a false perception of the index and that the index order is indeed not as required by the original order by clause so the database cannot use the index to avoid a sort operation.