Postgres ships in-core data types for JSON with specific functions and operators (json since 9.2, and jsonb which is a binary representation since 9.4). The upcoming Postgres 12 is becoming more compliant with the SQL specifications by introducing SQL/JSON path language, introduced mainly by the following commit:

commit: 72b6460336e86ad5cafd3426af6013c7d8457367 author: Alexander Korotkov <akorotkov@postgresql.org> date: Sat, 16 Mar 2019 12:15:37 +0300 Partial implementation of SQL/JSON path language SQL 2016 standards among other things contains set of SQL/JSON features for JSON processing inside of relational database. The core of SQL/JSON is JSON path language, allowing access parts of JSON documents and make computations over them. This commit implements partial support JSON path language as separate datatype called "jsonpath". The implementation is partial because it's lacking datetime support and suppression of numeric errors. Missing features will be added later by separate commits. Support of SQL/JSON features requires implementation of separate nodes, and it will be considered in subsequent patches. This commit includes following set of plain functions, allowing to execute jsonpath over jsonb values: * jsonb_path_exists(jsonb, jsonpath[, jsonb, bool]), * jsonb_path_match(jsonb, jsonpath[, jsonb, bool]), * jsonb_path_query(jsonb, jsonpath[, jsonb, bool]), * jsonb_path_query_array(jsonb, jsonpath[, jsonb, bool]). * jsonb_path_query_first(jsonb, jsonpath[, jsonb, bool]). This commit also implements "jsonb @? jsonpath" and "jsonb @@ jsonpath", which are wrappers over jsonpath_exists(jsonb, jsonpath) and jsonpath_predicate(jsonb, jsonpath) correspondingly. These operators will have an index support (implemented in subsequent patches). Catversion bumped, to add new functions and operators. Code was written by Nikita Glukhov and Teodor Sigaev, revised by me. Documentation was written by Oleg Bartunov and Liudmila Mantrova. The work was inspired by Oleg Bartunov. Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com Author: Nikita Glukhov, Teodor Sigaev, Alexander Korotkov, Oleg Bartunov, Liudmila Mantrova Reviewed-by: Tomas Vondra, Andrew Dunstan, Pavel Stehule, Alexander Korotkov

The documentation can be looked at in details for all the additions, but here is a short description of each concept introduced. Note that there are many operators and features part of what has been committed, so only a very small part is presented here.

First, one needs to know about some expressions, which are similar to XPath for XML data to do lookups and searches into different parts of a JSON object. Let’s take a sample of data, so here is a JSON blob representing a character in an RPG game (this should be normalized, but who cares here):

=# CREATE TABLE characters (data jsonb); CREATE TABLE =# INSERT INTO characters VALUES (' { "name" : "Iksdargotso", "id" : 1, "sex" : "male", "hp" : 300, "level" : 10, "class" : "warrior", "equipment" : { "rings" : [ { "name" : "ring of despair", "weight" : 0.1 }, {"name" : "ring of strength", "weight" : 2.4 } ], "arm_right" : "Sword of flame", "arm_left" : "Shield of faith" } }');

The basic grammar of those expressions is to use the keys part of the JSON objects combined with some elements:

Dots to move into a tree

Brackets for access to a given array member coupled with a position.

Variables, with ‘$’ representing a JSON text and ‘@’ for result path evaluations.

Context variables, which are basically references with ‘$’ and a variable name, with values that can be passed down to dedicated functions.

So for example, when applied to the previous JSON data sample we can reach the following parts of the tree with these expressions:

$.level refers to 10.

$.equipment.arm_left refers to “Shield of faith”.

$.equipment.rings refers to the full array of rings.

$.equipment.rings[0] refers to the first ring listed in the previous array (contrary to arrays members are zero-based).

Then comes the second part. These expressions are implemented using a new datatype called jsonpath, which is a binary representation of the parsed SQL/JSON path. This data type has its own parsing rules defined as of src/backend/utils/adt/jsonpath_gram.y parsing the data into a tree of several JsonPathParseItem items. After knowing about that comes the actual fun. Because, combining a jsonpath, a jsonb blob and the new set of functions implemented, it is possible to do some actual lookups in the JSON blob. jsonb_path_query() is likely the most interesting one, as it allows to directly query a portion of the JSON blob:

=# SELECT jsonb_path_query(data, '$.name') FROM characters; name --------------- "Iksdargotso" (1 row) =# SELECT jsonb_path_query(data, '$.equipment.rings[0].name') AS ring_name FROM characters; ring_name ------------------- "ring of despair" (1 row)

Note as well that there is some wildcard support, for example with an asterisk which returns all the elements of a set:

=# SELECT jsonb_path_query(data, '$.equipment.rings[0].*') AS data FROM characters; name ------------------- "ring of despair" 0.1 (2 rows)

New operators are also available and these allow for much more complex operations. One possibility is that it is possible to apply some functions within a result set as part of the expression. Here is for example how to apply floor() for a integer conversion for the weight of all the rings:

=# SELECT jsonb_path_query(data, '$.equipment.rings[*].weight.floor()') AS weight FROM characters; weight -------- 0 2 (2 rows)

This is actually only the top of cake, because one can do much more advanced context-related lookups for a JSON blob. For example you can apply a filter on top of it and fetch only a portion of them. Here is for example a way to get the names of all rings for a character which are heavier than 1kg (I am afraid that the unit is true as this applies to a ring of strength after all):

=# SELECT jsonb_path_query(data, '$.equipment.rings[*] ? (@.weight > 1)')->'name' AS name FROM characters; name -------------------- "ring of strength" (1 row)

Note that all the most basic comparison operators are implemented and listed in the documentation, so there is a lot of fun ahead. Due to time constraints, not all the features listed in the specification have been implemented as datetime is for example lacking, still this is a nice first cut.

Note: there is a kind of mathematical easter egg in this post. Can you find it?