An In-Depth Analysis of the Index Include Clause

Some databases—namely Microsoft SQL Server, IBM Db2, and PostgreSQL since release 11—offer an include clause in the create index statement. The introduction of this feature in PostgreSQL is the trigger for this long-overdue explanation of the include clause.

Before going into the details, let’s start with a short recap on how (non-clustered) B-tree indexes work and what the all-mighty index-only scan is.


Recap: B-tree Indexes

To understand the include clause, you must first understand that using an index affects up to three layers of data structures:

The B-tree

The doubly linked list at the leaf node level of the B-tree

The table

The first two structures together form an index so they could be combined into a single item, i.e. the “B-tree index”. I prefer to keep them separate as they serve different needs and have a different impact on performance. Moreover, explaining the include clause requires making this distinction.

[Figure: the three data structures: the B-tree, the doubly linked list at the leaf node level, and the table]

In the general case, the database software starts traversing the B-tree to find the first matching entry at the leaf node level (1). It then follows the doubly linked list until it has found all matching entries (2) and finally it fetches each of those matching entries from the table (3). Actually, the last two steps can be interleaved, but that is not relevant for understanding the general concept.
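The three steps can be sketched as a toy model (illustrative Python with invented data structures; a real B-tree is of course not a sorted Python list):

```python
import bisect

# Leaf entries: (key, row_id) pairs sorted by key -- a stand-in for the
# doubly linked list at the leaf node level.
leaf_entries = [(10, "r1"), (20, "r2"), (20, "r3"), (30, "r4")]
table = {"r1": "row 1", "r2": "row 2", "r3": "row 3", "r4": "row 4"}

def index_access(search_key):
    # Step 1: B-tree traversal, modeled here as a binary search for the
    # first matching entry at the leaf node level.
    pos = bisect.bisect_left(leaf_entries, (search_key,))
    results = []
    # Step 2: follow the leaf-node list while the key still matches.
    while pos < len(leaf_entries) and leaf_entries[pos][0] == search_key:
        row_id = leaf_entries[pos][1]
        # Step 3: fetch the matching entry from the table.
        results.append(table[row_id])
        pos += 1
    return results

print(index_access(20))  # ['row 2', 'row 3']
```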

[Figure: the three steps of an index access: ❶ B-tree traversal, ❷ following the doubly linked list, ❸ fetching from the table]

The following formulas give you a rough idea of how many read operations each of these steps needs. The sum of these three components is the total effort of an index access.

The B-tree: log₁₀₀(<rows in table>), often less than 5

The doubly linked list: <rows read from index> / 100

The table: <rows read from table>

When loading only a few rows, the B-tree makes the greatest contribution to the overall effort. As soon as you need to fetch more than a handful of rows from the table, the table access takes the lead. In either case—few or many rows—the doubly linked list is usually a minor factor because it stores rows with similar values next to each other so that a single read operation can fetch 100 or even more rows. The formula reflects this with the respective divisor.
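The estimate can be put into a small function (a sketch of the formulas above; the divisor 100 is just the ballpark figure from the text, not a measured constant):

```python
import math

def index_access_reads(table_rows, rows_from_index, rows_from_table):
    # Rough estimate of read operations for one index access.
    btree = math.log(table_rows, 100)     # tree traversal, often < 5
    leaf_list = rows_from_index / 100     # doubly linked list
    table = rows_from_table               # roughly one read per fetched row
    return btree + leaf_list + table

# Fetching 10 of a million rows: the table access already dominates
# (3 reads for the tree, 0.1 for the leaf list, 10 for the table).
print(index_access_reads(1_000_000, 10, 10))
```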

Note If you are thinking “That’s why we have clustered indexes”, please read my article: “Unreasonable Defaults: Primary Key as Clustering Key”.

The most generic idea about optimization is to do less work to achieve the same goal. When it comes to index access, this means that the database software omits accessing a data structure if it doesn’t need any data from it.

You can read more about the inner workings of B-tree indexes in Chapter 1, “Anatomy of an SQL Index” of SQL Performance Explained.

Recap: Index-Only Scan

The index-only scan does exactly that: it omits the table access if the required data is available in the doubly linked list of the index.

Consider the following index and query I borrowed from “Index-Only Scan: Avoiding Table Access”.

CREATE INDEX idx ON sales ( subsidiary_id, eur_value )

SELECT SUM(eur_value) FROM sales WHERE subsidiary_id = ?

At first glance, you may wonder why the column eur_value is in the index definition at all—it is not mentioned in the where clause.

B-tree Indexes Help Many Clauses It is a common misconception that indexes only help the where clause. B-tree indexes can also help the order by, group by, select and other clauses. It is just the B-tree part of an index—not the doubly linked list—that cannot be used by other clauses.

The crucial point in this example is that the B-tree index happens to have all required columns—the database software doesn’t need to access the table itself. This is what we refer to as an index-only scan.

[Figure: an index-only scan: ❶ B-tree traversal, ❷ following the doubly linked list; no table access]

Applying the formulas above, the performance benefit of this is very small if only a few rows satisfy the where clause. On the other hand, if the where clause accepts many rows, e.g. millions, the number of read operations is essentially reduced by a factor of 100.

Note It is not uncommon that an index-only scan improves performance by one or two orders of magnitude.

The example above uses the fact that the doubly-linked list—the leaf nodes of the B-tree—contains the eur_value column. Although the other nodes of the B-tree store that column too, this query has no use for the information in these nodes.

The Include Clause

The include clause allows us to make a distinction between columns we would like to have in the entire index (key columns) and columns we only need in the leaf nodes (include columns). That means it allows us to remove columns from the non-leaf nodes if we don’t need them there.

Using the include clause, we could refine the index for this query:

CREATE INDEX idx ON sales ( subsidiary_id ) INCLUDE ( eur_value )

The query can still use this index for an index-only scan, thus yielding essentially the same performance.

[Figure: index with an include column: the include column is stored in the leaf nodes only, not in the rest of the B-tree]

Besides the obvious differences in the picture, there is also a more subtle difference: the order of the leaf node entries does not take the include columns into account. The index is solely ordered by its key columns. This has two consequences: include columns cannot be used to prevent sorting nor are they considered for uniqueness (see below).

“Covering Index” The term “covering index” is sometimes used in the context of index-only scans or include clauses. As this term is often used with a different meaning, I generally avoid it. What matters is whether a given index can support a given query by means of an index-only scan. Whether or not that index has an include clause or contains all table columns is not relevant.

Compared to the original index definition, the new definition with the include clause has some advantages:

The tree might have fewer levels (<~40%) As the tree nodes above the doubly linked list do not contain the include columns, the database can store more branches in each block so that the tree might have fewer levels.

The index is slightly smaller (<~3%) As the non-leaf nodes of the tree don’t contain include columns, the overall size of that index is slightly less. However, the leaf node level of the index needs the most space anyway so that the potential savings in the remaining nodes is very little.

It documents its purpose This is definitely the most underestimated benefit of the include clause: the reason why the column is in the index is documented in the index definition itself.

Let me elaborate on the last item.

When extending an existing index, it is very important to know exactly why the index is currently defined the way it is. The freedom you have in changing the index without breaking any other queries is a direct result of this knowledge.

The following query demonstrates this:

SELECT * FROM sales WHERE subsidiary_id = ? ORDER BY ts DESC FETCH FIRST 1 ROW ONLY

For a given subsidiary, this query fetches the most recent sales entry (ts is for time stamp).

To optimize this query, it would be great to have an index that starts with the key columns (subsidiary_id, ts) . With this index, the database software can directly navigate to the latest entry for that subsidiary and return it right away. There is no need to read and sort all of the entries for that subsidiary because the doubly linked list is sorted according to the index key, i.e. the last entry for any given subsidiary must have the greatest ts value for that subsidiary. With this approach, the query is essentially as fast as a primary key lookup. See “Indexing Order By” and “Querying Top-N Rows” for more details about this technique.
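The direct navigation can be illustrated with a toy model (invented sample data; bisect stands in for the B-tree traversal over leaf entries sorted by (subsidiary_id, ts)):

```python
import bisect

# Leaf entries of a hypothetical index on (subsidiary_id, ts),
# sorted by the index key.
entries = [(1, "2023-01-05"), (1, "2023-03-01"), (2, "2023-02-14")]

def latest_sale(subsidiary_id):
    # Navigate directly past the last entry for this subsidiary;
    # "\uffff" is a sentinel that sorts after any real timestamp string.
    pos = bisect.bisect_right(entries, (subsidiary_id, "\uffff"))
    # Step back one entry: because the list is ordered by (subsidiary_id, ts),
    # this is already the most recent sale -- no reading and sorting of all
    # entries for that subsidiary is needed.
    if pos > 0 and entries[pos - 1][0] == subsidiary_id:
        return entries[pos - 1]
    return None

print(latest_sale(1))  # (1, '2023-03-01')
```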


Before adding a new index for this query, we should check if there is an existing index that can be changed (extended) to support this trick. This is generally a good practice because extending an existing index has a smaller impact on the maintenance overhead than adding a new index. However, when changing an existing index, we need to make sure that we do not make that index less useful for other queries.

If we look at the original index definition, we encounter a problem:

CREATE INDEX idx ON sales ( subsidiary_id, eur_value )

To make this index support the order by clause of the above query, we would need to insert the ts column between the two existing columns:

CREATE INDEX idx ON sales ( subsidiary_id, ts, eur_value )

However, that might render this index less useful for queries that need the eur_value column in the second position, e.g. if it appears in the where or order by clause. Changing this index involves a considerable risk of breaking other queries, unless we know for sure that no such queries exist. If we don’t know, it is often best to keep the index as it is and create another one for the new query.

The picture changes completely if we look at the index with the include clause.

CREATE INDEX idx ON sales ( subsidiary_id ) INCLUDE ( eur_value )

As the eur_value column is in the include clause, it is not in the non-leaf nodes and thus neither useful for navigating the tree nor for ordering. Adding a new column to the end of the key part is relatively safe.

CREATE INDEX idx ON sales ( subsidiary_id, ts ) INCLUDE ( eur_value )

Even though there is still a small risk of negative impacts for other queries, it is usually worth taking that risk.

From the perspective of index evolution, it is thus very helpful to put columns into the include clause if this is all you need. Columns that are just added to enable an index-only scan are the prime candidates for this.

Filtering on Include Columns

Until now we have focused on how the include clause can enable index-only scans. Let’s also look at another case where it is beneficial to have an extra column in the index.

SELECT * FROM sales WHERE subsidiary_id = ? AND notes LIKE '%search term%'

I’ve made the search term a literal value to show the leading and trailing wildcards—of course you would use a bind parameter in your code.

Now, let’s think about the right index for this query. Obviously, the subsidiary_id needs to be in the first position. If we take the previous index from above, it already satisfies this requirement:

CREATE INDEX idx ON sales ( subsidiary_id, ts ) INCLUDE ( eur_value )

The database software can use that index with the three-step procedure as described at the beginning: (1) it will use the B-tree to find the first index entry for the given subsidiary; (2) it will follow the doubly linked list to find all sales for that subsidiary; (3) it will fetch all related sales from the table, remove those for which the like pattern on the notes column doesn’t match and return the remaining rows.

The problem is the last step of this procedure: the table access loads rows without knowing if they will make it into the final result. Quite often, the table access is the biggest contributor to the total effort of running a query. Loading data that is not even selected is a huge performance no-no.

Important Avoid loading data that doesn’t affect the result of the query.

The challenge with this particular query is that it uses an in-fix like pattern. Normal B-tree indexes don’t support searching such patterns. However, B-tree indexes still support filtering on such patterns. Note the emphasis: searching vs. filtering.

In other words, if the notes column were present in the doubly linked list, the database software could apply the like pattern before fetching that row from the table (not PostgreSQL, see below). This prevents the table access if the like pattern doesn’t match. If the table has more columns, there is still a table access to fetch those columns for the rows that satisfy the where clause—due to the select * .

CREATE INDEX idx ON sales ( subsidiary_id, ts ) INCLUDE ( eur_value, notes )

If there are more columns in the table, the index does not enable an index-only scan. Nonetheless, it can bring the performance close to that of an index-only scan if the portion of rows that match the like pattern is very low. In the opposite case—if all rows match the pattern—the performance is a little bit worse due to the increased index size. However, the breakeven is easy to reach: for an overall performance improvement, it is often enough that the like filter removes a small percentage of the rows. Your mileage will vary depending on the size of the involved columns.
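The breakeven argument can be sketched numerically (an assumed cost model, not measured data: one read per fetched table row plus one leaf read per 100 index entries):

```python
def estimated_reads(candidate_rows, match_fraction, rows_per_leaf_read=100):
    # The index entries are scanned either way; only the table accesses differ.
    leaf_reads = candidate_rows / rows_per_leaf_read
    # With notes in the index, only matching rows trigger a table access.
    with_notes = leaf_reads + candidate_rows * match_fraction
    # Without it, every candidate row must be fetched before filtering.
    without_notes = leaf_reads + candidate_rows
    return with_notes, without_notes

# 256 candidate rows for the subsidiary, of which 16 match the like pattern:
with_idx, without_idx = estimated_reads(256, 16 / 256)
```

Under this model the filter saves one table access per rejected row, so even a modest rejection rate pays for the slightly larger index.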

Unique Indexes with Include Clause

Last but not least, there is an entirely different aspect of the include clause: unique indexes with an include clause only consider the key columns for uniqueness.

That allows us to create unique indexes that have additional columns in the leaf nodes, e.g. for an index-only scan.

CREATE UNIQUE INDEX … ON … ( id ) INCLUDE ( payload )

This index protects against duplicate values in the id column, yet it supports an index-only scan for the next query.

SELECT payload FROM … WHERE id = ?

Note that the include clause is not strictly required for this behavior: databases that make a proper distinction between unique constraints and unique indexes just need an index with the unique key columns as the leftmost columns—additional columns are fine.

For the Oracle Database, the corresponding syntax is this:

CREATE INDEX … ON … ( id, payload )

ALTER TABLE … ADD UNIQUE ( id ) USING INDEX …

Compatibility

PostgreSQL: No Filtering Before Visibility Check

The PostgreSQL database has a limitation when it comes to applying filters on the index level. The short story is that it doesn’t do it, except in a few cases. Even worse, some of those cases only work when the respective data is stored in the key part of the index, not in the include clause. That means moving columns to the include clause may negatively affect performance, even if the above-described logic still applies.

The long story starts with the fact that PostgreSQL keeps old row versions in the table until they become invisible to all transactions and the vacuum process removes them at some later point in time. To know whether a row version is visible (to a given transaction) or not, each row version has two extra system attributes that indicate when it was created and deleted: xmin and xmax . The row version is only visible if the current transaction falls within the xmin / xmax range.

Unfortunately, the xmin / xmax values are not stored in indexes.

That means that whenever PostgreSQL is looking at an index entry, it cannot tell whether or not that entry is visible to the current transaction. It could be a deleted entry or an entry that has not yet been committed. The canonical way to find out is to look into the table and check the xmin / xmax values.

A consequence is that there is no such thing as an index-only scan in PostgreSQL. No matter how many columns you put into an index, PostgreSQL will always need to check the visibility, which is not available in the index.

Yet there is an Index Only Scan operation in PostgreSQL—but it still needs to check the visibility of each row version by accessing data outside the index. Instead of going to the table, the Index Only Scan first checks the so-called visibility map. This visibility map is very dense, so the number of read operations is (hopefully) less than fetching xmin / xmax from the table. However, the visibility map does not always give a definite answer: it either states that the row is known to be visible, or that the visibility is not known. In the latter case, the Index Only Scan still needs to fetch xmin / xmax from the table (shown as “Heap Fetches” in explain analyze ).
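That decision logic can be modeled in a few lines (a simplified sketch; the map and heap structures are invented stand-ins for PostgreSQL’s visibility map and the xmin / xmax check):

```python
# Page -> "are all row versions on this page visible to everyone?"
visibility_map = {1: True, 2: False}
# (row_id, page) -> actual visibility per xmin/xmax (only consulted on fallback).
heap = {("r1", 1): True, ("r2", 2): False}

def index_only_scan(index_entries):
    heap_fetches = 0
    result = []
    for row_id, page in index_entries:
        if visibility_map.get(page):
            # Page is marked all-visible: return the index entry directly.
            result.append(row_id)
        else:
            # Visibility unknown: fall back to the table (a "Heap Fetch").
            heap_fetches += 1
            if heap[(row_id, page)]:
                result.append(row_id)
    return result, heap_fetches

rows, fetches = index_only_scan([("r1", 1), ("r2", 2)])
print(rows, fetches)  # ['r1'] 1
```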

After this short visibility digression, we can return to filtering on the index level.

SQL allows arbitrarily complex expressions in the where clause. These expressions might also cause runtime errors such as “division by zero”. If PostgreSQL evaluated such expressions before confirming the visibility of the respective entry, even invisible rows could cause such errors. To prevent this, PostgreSQL generally checks the visibility before evaluating such expressions.

There is one exception to this general rule. As the visibility cannot be checked while searching an index, operators that can be used for searching must always be safe to use. These are the operators that are defined in the respective operator class. If a simple comparison filter uses an operation from such an operator class, PostgreSQL can apply that filter before checking the visibility because it knows that these operators are safe to use. The crux is that only key columns have an operator class associated with them. Columns in the include clause don’t—filters based on them are not applied before their visibility is confirmed. This is my understanding from a thread on the PostgreSQL hackers mailing list.

For a demonstration, take the previous index and query:

CREATE INDEX idx ON sales ( subsidiary_id, ts ) INCLUDE ( eur_value, notes )

SELECT * FROM sales WHERE subsidiary_id = ? AND notes LIKE '%search term%'

The execution plan—edited for brevity—could look like this:

                   QUERY PLAN
----------------------------------------------
Index Scan using idx on sales (actual rows=16)
  Index Cond: (subsidiary_id = 1)
  Filter: (notes ~~ '%search term%')
  Rows Removed by Filter: 240
  Buffers: shared hit=54

The like filter is shown as Filter, not as Index Cond. That means it was applied at the table level. Also, the number of shared hits is rather high for fetching just 16 rows.

In a Bitmap Index/Heap Scan the phenomenon becomes more obvious.

                   QUERY PLAN
-----------------------------------------------
Bitmap Heap Scan on sales (actual rows=16)
  Recheck Cond: (subsidiary_id = 1)
  Filter: (notes ~~ '%search term%')
  Rows Removed by Filter: 240
  Heap Blocks: exact=52
  Buffers: shared hit=54
  -> Bitmap Index Scan on idx (actual rows=256)
       Index Cond: (subsidiary_id = 1)
       Buffers: shared hit=2

The Bitmap Index Scan does not mention the like filter at all. Instead it returns 256 rows—way more than the 16 that satisfy the where clause.

Note that this is not a particularity of the include column in this case. Moving the include columns into the index key gives the same result.

CREATE INDEX idx ON sales ( subsidiary_id, ts, eur_value, notes )

                   QUERY PLAN
-----------------------------------------------
Bitmap Heap Scan on sales (actual rows=16)
  Recheck Cond: (subsidiary_id = 1)
  Filter: (notes ~~ '%search term%')
  Rows Removed by Filter: 240
  Heap Blocks: exact=52
  Buffers: shared hit=54
  -> Bitmap Index Scan on idx (actual rows=256)
       Index Cond: (subsidiary_id = 1)
       Buffers: shared hit=2

This is because the like operator is not part of the operator class so it is not considered to be safe.

If you use an operation from the operator class, e.g. equals, the execution plan changes.

SELECT * FROM sales WHERE subsidiary_id = ? AND notes = 'search term'

The Bitmap Index Scan now applies all conditions from the where clause and only passes the remaining 16 rows on to the Bitmap Heap Scan .

                   QUERY PLAN
----------------------------------------------
Bitmap Heap Scan on sales (actual rows=16)
  Recheck Cond: (subsidiary_id = 1 AND notes = 'search term')
  Heap Blocks: exact=16
  Buffers: shared hit=18
  -> Bitmap Index Scan on idx (actual rows=16)
       Index Cond: (subsidiary_id = 1 AND notes = 'search term')
       Buffers: shared hit=2

Note that this requires the respective column to be a key column. If you move the notes column back to the include clause, it has no associated operator class, so the equals operator is not considered safe anymore. Consequently, PostgreSQL postpones applying this filter until the table access, after the visibility is checked.