My guess on the performance is the following.

You have an index on "indexed_column" -- this is just a guess, based on the name. Postgres is using that index to satisfy the sort. Furthermore, the table was populated incrementally, so the first million rows or so are all on a contiguous set of pages in the database file.
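You can confirm whether the planner really is using the index for the sort with EXPLAIN (the table and column names here are placeholders for yours):

```sql
-- Hypothetical names; substitute your own table and column.
EXPLAIN
SELECT *
FROM big_table
ORDER BY indexed_column;
-- An "Index Scan using ... on big_table" node with no separate Sort node
-- means the index is supplying the ordering, as described above.
```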

If this is true, then the following is happening. The sort is happily walking the index and finding the rows it needs. It loads each page into memory, if the page is not already there. In almost all cases -- for the first million or so rows -- the pages are already there, and the results come back quickly.

After that, though, something bad happens. The index points to a row, and the page containing that row is probably not in memory. So Postgres has to go fetch the page, often evicting one already in the cache. That is, each row reference turns into roughly one disk I/O.
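One way to check this theory, assuming you can afford to run the query once, is EXPLAIN (ANALYZE, BUFFERS), which reports how many buffers were found in cache versus read from disk (table and column names are placeholders):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM big_table
ORDER BY indexed_column;
-- Look at "Buffers: shared hit=... read=..." in the output.
-- A large "read" count relative to "hit" means most row fetches are
-- going to disk, i.e. the cache-thrashing scenario described above.
```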

By the way, this situation can occur on any table, not just one created in this particular fashion. However, a million rows is a lot to read before the cache fills up -- unless the rows happen to be stored in order.

Now, how can you fix the problem? The best way is to put the filtering logic into the database, where it belongs. After all, returning hundreds of millions of rows to an application is not a good use of a database. That is the first place I would look.
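For example, whatever filtering the application does after fetching the rows can usually be expressed as a WHERE clause instead (the predicate here is made up):

```sql
-- Hypothetical filter; replace with whatever the application checks.
SELECT *
FROM big_table
WHERE some_status = 'active'
ORDER BY indexed_column;
```

If the filter is selective, far fewer pages need to be touched in the first place.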

You could do something drastic and see whether dropping the index and letting the ORDER BY do an actual sort is faster. If the description above is correct, it will be -- with the caveat that you will wait a long time for the first row.
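In Postgres you do not actually have to drop the index to test this. The planner settings enable_indexscan and enable_indexonlyscan can be turned off for the session, which discourages index scans and should force a sequential scan plus an explicit sort (table name is a placeholder):

```sql
SET enable_indexscan = off;       -- session-local; the index itself is untouched
SET enable_indexonlyscan = off;

EXPLAIN (ANALYZE)
SELECT *
FROM big_table
ORDER BY indexed_column;

RESET enable_indexscan;
RESET enable_indexonlyscan;
```

Compare the total runtime against the original plan before deciding to drop anything.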

One thing you can do is recreate the table with the data sorted by the indexed column, and then re-index it (in effect, turning the column into a clustered index). This will improve things going forward, but the process will take some time.
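In Postgres the built-in way to do this is CLUSTER, which rewrites the table in index order. Note that it takes an exclusive lock for the duration, and the ordering is not maintained for rows inserted afterward (the index name here is an assumption):

```sql
-- Assumes the index is named big_table_indexed_column_idx.
CLUSTER big_table USING big_table_indexed_column_idx;
ANALYZE big_table;   -- refresh planner statistics after the rewrite
```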

Two other possible things to try. Assuming you need only a subset of the columns, create a temporary table containing just those columns, and pull the data from there. With 400 million records this will take a while, but if the needed fields are small relative to the full rows, it will improve performance.
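A sketch of the temporary-table approach, assuming (hypothetically) that only two columns are needed:

```sql
-- Column names are placeholders for the subset you actually need.
CREATE TEMPORARY TABLE narrow_copy AS
SELECT indexed_column, needed_column
FROM big_table;

SELECT *
FROM narrow_copy
ORDER BY indexed_column;   -- sorts the narrow rows; no index involved
```

Because the rows are narrower, far more of them fit per page, so the sort touches many fewer pages.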

Second, if you don't need the data in a particular order, just pull the records without the ORDER BY. This replaces the index scan with a full table scan, which eliminates the page thrashing.
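That is, simply:

```sql
SELECT *
FROM big_table;   -- no ORDER BY: a sequential scan reads each page once
```

A sequential scan visits pages in physical order, so no page has to be fetched twice.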