sorting - How does PostgreSQL perform ORDER BY with a b-tree index on the field?

Question

Welcome To Ask or Share your Answers For Others

sorting - How does PostgreSQL perform ORDER BY with a b-tree index on the field?

asked Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

sorting - How does PostgreSQL perform ORDER BY with a b-tree index on the field?

I have a table bsort:

CREATE TABLE bsort(a int, data text);

Here data may be incomplete. In other words, some tuples may not have data value.

And then I build a b-tree index on the table:

CREATE INDEX ON bsort USING BTREE(a);

Now if I perform this query:

SELECT * FROM bsort ORDER BY a;

Does PostgreSQL sort tuples with nlogn complexity, or does it get the order directly from the b-tree index?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Answer

深蓝 · Answer 1 · 2021-10-17T00:50:17+0000

For a simple query like this Postgres will use an index scan and retrieve readily sorted tuples from the index in order. Due to its MVCC model Postgres had to always visit the "heap" (data pages) additionally to verify entries are actually visible to the current transaction. Quoting the Postgres Wiki on index-only scans:

PostgreSQL indexes do not contain visibility information. That is, it is not directly possible to ascertain if any given tuple is visible to the current transaction, which is why it has taken so long for index-only scans to be implemented.

Which finally happened in version 9.2: index-only scans. The manual:

If the index stores the original indexed data values (and not some lossy representation of them), it is useful to support index-only scans, in which the index returns the actual data not just the TID of the heap tuple. This will only avoid I/O if the visibility map shows that the TID is on an all-visible page; else the heap tuple must be visited anyway to check MVCC visibility. But that is no concern of the access method's.

The visibility map decides whether index-only scans are possible. Only an option if all involved column values are included in the index. Else, the heap has to be visited (additionally) in any case. The sort step is still not needed.

That's why we sometimes append otherwise useless columns to indexes now. Like the data column in your example:

CREATE INDEX ON bsort (a, data);  -- btree is the default index type

It makes the index bigger (depends) and a bit more expensive to maintain and use for other purposes. So only append the data column if you get index-only scans out of it. The order of columns in the index is important:

Since Postgres 11, there are also "covering indexes" with the INCLUDE keyword. Like:

CREATE INDEX ON bsort (a) INCLUDE (data);

See:

Does a query with a primary key and foreign keys run faster than a query with just primary keys?

The benefit of an index-only scan, per documentation:

If it's known that all tuples on the page are visible, the heap fetch can be skipped. This is most noticeable on large data sets where the visibility map can prevent disk accesses. The visibility map is vastly smaller than the heap, so it can easily be cached even when the heap is very large.

The visibility map is maintained by VACUUM which happens automatically if you have autovacuum running (the default setting in modern Postgres). Details:

Are regular VACUUM ANALYZE still recommended under 9.1?

But there is some delay between write operations to the table and the next VACUUM run. The gist of it:

Read-only tables stay ready for index-only scans once vacuumed.
Data pages that have been modified lose their "all-visible" flag in the visibility map until the next VACUUM (and all older tansactions being finished), so it depends on the ratio between write operations and VACUUM frequency.

Partial index-only scans are still possible if some of the involved pages are marked all-visible. But if the heap has to be visited anyway, the access method "index scan" is a bit cheaper. So if too many pages are currently dirty, Postgres will switch to the cheaper index scan altogether. The Postgres Wiki again:

As the number of heap fetches (or "visits") that are projected to be needed by the planner goes up, the planner will eventually conclude that an index-only scan isn't desirable, as it isn't the cheapest possible plan according to its cost model. The value of index-only scans lies wholly in their potential to allow us to elide heap access (if only partially) and minimise I/O.

Categories

sorting - How does PostgreSQL perform ORDER BY with a b-tree index on the field?

sorting - How does PostgreSQL perform ORDER BY with a b-tree index on the field?

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags