Someone asked me privately why we weren't immediately pursuing SQL

MERGE, as opposed to yet another non-standard variant of UPSERT

(UPSERT is loosely defined here as an insert-or-update DML command

that goes to update based on would-be unique constraint violations,

and does one or the other *atomically*). I believe at least one other

system (Teradata) that now has both formerly only had the latter. I

think it makes sense to have both. While I believe that it's already

generally accepted that SQL MERGE is distinct from UPSERT in fairly

fundamental ways, in that it solves a somewhat distinct set of

problems, I would like to put my thoughts on the matter on record.

This is mostly for the sake of the archives.

The first reason why the two are distinct is that MERGE, as

implemented within other major systems and as described by the SQL

standard does not require an atomic insert-or-update for even the

simplest possible case of OLTP style insert-or-update. As I went into

during my talk at pgCon, even these simple cases may throw errors on

other systems' MERGE implementations. This leeway to fail may be

valuable for some use cases though, like the bulk loading use case.

Simon Riggs has characterized MERGE as something that exists for the

benefit of those two fairly distinct use cases [1], and I agree with

him. I suspect that the predictability of MERGE's behavior suffers due

to trying to support both at once, and I think that MERGE should be

specifically promoted as a bulk loading command if we ever get around

to adding it to PostgreSQL.

The second reason they differ is that SQL MERGE as implemented within

other systems doesn't require that a unique index be defined on the

column or columns of the MERGE statement's outer join. I am not

enthusiastic about the idea of falling back to table-level locking

when an appropriate index isn't available. That could be a foot-gun

for the large majority of users that just want to solve their basic

insert-or-update problem. I did think to mention this when asked about

my work on UPSERT, but the next question was: "So why not just throw

an error when it's not possible to use an available unique index"?

This is a fair point, but doesn't address all the problems with trying

to make MERGE perform upserts that meet my requirements for UPSERT

[2](principally the requirement of atomicity in the sense of always

getting an insert or update).

These two points, particularly the first are likely to weigh upon the

implementation of both, which is why we cannot very well just insist

upon having appropriate unique indexes defined. It buys my UPSERT

implementation plenty to have the upsert driven by an insert into a

unique index. With UPSERT, having an insert occur is always an outcome

that the DML statement's author will consider acceptable. SQL MERGE

statements can be written without any WHEN NOT MATCHED THEN INSERT;

SQL MERGE statement authors are not necessarily going to expect an

insert at all.

As I outlined in my pgCon talk, there appears to be a fundamental

trade-off involved when implementing UPSERT. My implementation is

prepared for the possibility that upon finding a duplicate, and upon

subsequently attempting to lock a row, the row may be found to be

deleted. When that happens, we try again without having made any

useful progress in the first iteration. The ideal of a strict queue of

upserters, with totally fair row locking style lock arbitration

appears to be basically impossible, at least provided you're unwilling

to give up insert-or-update atomicity (that is, to get this you have

to be willing to give up with an error even at READ COMMITTED, which

is a cop out, or buy into using something like table level locks,

which is also a cop out). Theoretical lock starvation hazards exist in

my implementation because I don't do this, but all indications are

that this is fine in the real world. In this regard my implementation

is similar to the current looping upsert subtransaction pattern we've

been recommending for years [3]. We don't hold on to "value locks" on

values within indexes (except to finish staggered index-locks-first

inserting - but not to update/row lock).

This is all fine in the real world because *somebody* has to make

progress when there is highly contended upserting. Some session has to

insert-or-update successfully - otherwise you cannot have a conflict

to begin with. Livelocking is therefore not possible, and you need a

*sustained* perfect storm of conflicts starving out one session in

order to see lock starvation at all. SQL MERGE has an outer join on a

table of proposed values (as opposed to an inner join against already

rejected values). MERGE allows you to do something other than insert

WHEN NOT MATCHED (i.e. to do nothing), and I think that being able to

always insert at that juncture gives us a way to usefully terminate

the loop where there might not be another way.

Suppose you had a similar "keep retrying" implementation (similar to

what I've proposed for UPSERT), but built this for SQL MERGE. Ignoring

the fact that no other SQL MERGE implementation has

insert-or-update-or-delete atomicity, you insist upon this for your

implementation. Consider how this would behave with a MERGE statement

that had only a WHEN MATCHED THEN UPDATE clause and/or a WHEN MATCHED

THEN DELETE clause. You need to do nothing for the WHEN NOT MATCHED

case, but how exactly do you decide when and how to do nothing when

it's based on some index value *not* being there? This isn't the same

as INSERT IGNORE - it's the exact opposite. We're doing nothing

because there was no conflict found, *not* because there was some

conflict conclusively found. Furthermore, not only are we doing

nothing, we're specifically not doing something else (updating). We're

not doing updating in a way predicated on the physical absence of a

value from an index at one instant in time. But what on earth does

that absence tell us? In what sense is the value not there? Is it in

the heap, but not yet in any index? Is it visible to our estate

snapshot? My sense is that any attempt to resolve these questions

would be extremely ticklish, and would likely introduce livelock

hazards or deadlock hazards. You end up with head melting

Heisenberg-esque questions when you try to impose this requirement on

SQL MERGE. AFAICT that's why no one else has succeeded in doing so. I

think you're just supposed to live with those problems ("Nothing

visible to my MVCC snapshot? Good enough, do WHEN NOT MATCHED handling

then"). A lot of the time these problems don't come up.

To state it another way, my implementation relies on something

*conclusively* being there (conclusively committed and visible to new

snapshots) - our new index tuple, or someone else's existing one. If

it's the latter case we're finally done when we conclusively lock the

tuple without a conflict (that is, without it being deleted in the

interim). We rely on things happening conclusively (things that have

locking implications, like inserting an index tuple or locking the

row). We cannot conclusively conclude that something is not there

without extreme measures that are very questionable.

At a high level SQL MERGE is quite distinct from UPSERT, in that it is

a utility command that performs inserts, updates and deletes while

avoiding race conditions (e.g. unique constraint violations) on a more

or less best effort basis. MERGE is conceptually messy. In contrast

UPSERT is actually atomic, and having its behavior be relatively easy

to reason about ought to be the top priority. There is a *really* big

demand for UPSERT from users, not MERGE, although MERGE is certainly

useful too.

[1] http://www.postgresql.org/message-id/1132000496.3388.31.camel@holly

[2] http://www.pgcon.org/2014/schedule/attachments/327_upsert_weird.pdf,

Slide 8, "Goals for UPSERT in Postgres"

[3] http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE

--

Peter Geoghegan