functions and operators for processing and creating JSON data
the SQL/JSON path language
To learn more about the SQL/JSON standard, see [sqltr-19075-6]. For details on JSON types supported in PostgreSQL, see Section 8.14.
9.16.1. Processing and Creating JSON Data
Table 9.45 shows the operators that are available for use with JSON data types (see Section 8.14). In addition, the usual comparison operators shown in Table 9.1 are available for jsonb, though not for json. The comparison operators follow the ordering rules for B-tree operations outlined in Section 8.14.4. See also Section 9.21 for the aggregate function json_agg which aggregates record values as JSON, the aggregate function json_object_agg which aggregates pairs of values into a JSON object, and their jsonb equivalents, jsonb_agg and jsonb_object_agg.
Table 9.45. json and jsonb Operators
Operator
Description
Example(s)
json->integer → json
jsonb->integer → jsonb
Extracts n'th element of JSON array (array elements are indexed from zero, but negative integers count from the end).
Extracts JSON sub-object at the specified path as text.
'{"a": {"b": ["foo","bar"]}}'::json #>> '{a,b,1}' → bar
Note
The field/element/path extraction operators return NULL, rather than failing, if the JSON input does not have the right structure to match the request; for example if no such key or array element exists.
Some further operators exist only for jsonb, as shown in Table 9.46. Section 8.14.4 describes how these operators can be used to effectively search indexed jsonb data.
Table 9.46. Additional jsonb Operators
Operator
Description
Example(s)
jsonb@>jsonb → boolean
Does the first JSON value contain the second? (See Section 8.14.3 for details about containment.)
'{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb → t
jsonb<@jsonb → boolean
Is the first JSON value contained in the second?
'{"b":2}'::jsonb <@ '{"a":1, "b":2}'::jsonb → t
jsonb?text → boolean
Does the text string exist as a top-level key or array element within the JSON value?
'{"a":1, "b":2}'::jsonb ? 'b' → t
'["a", "b", "c"]'::jsonb ? 'b' → t
jsonb?|text[] → boolean
Do any of the strings in the text array exist as top-level keys or array elements?
'{"a":1, "b":2, "c":3}'::jsonb ?| array['b', 'd'] → t
jsonb?&text[] → boolean
Do all of the strings in the text array exist as top-level keys or array elements?
'["a", "b", "c"]'::jsonb ?& array['a', 'b'] → t
jsonb||jsonb → jsonb
Concatenates two jsonb values. Concatenating two arrays generates an array containing all the elements of each input. Concatenating two objects generates an object containing the union of their keys, taking the second object's value when there are duplicate keys. All other cases are treated by converting a non-array input into a single-element array, and then proceeding as for two arrays. Does not operate recursively: only the top-level array or object structure is merged.
Deletes the array element with specified index (negative integers count from the end). Throws an error if JSON value is not an array.
'["a", "b"]'::jsonb - 1 → ["a"]
jsonb#-text[] → jsonb
Deletes the field or array element at the specified path, where path elements can be either field keys or array indexes.
'["a", {"b":1}]'::jsonb #- '{1,b}' → ["a", {}]
jsonb@?jsonpath → boolean
Does JSON path return any item for the specified JSON value?
'{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)' → t
jsonb@@jsonpath → boolean
Returns the result of a JSON path predicate check for the specified JSON value. Only the first item of the result is taken into account. If the result is not Boolean, then NULL is returned.
'{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2' → t
Note
The jsonpath operators @? and @@ suppress the following errors: missing object field or array element, unexpected JSON item type, datetime and numeric errors. The jsonpath-related functions described below can also be told to suppress these types of errors. This behavior might be helpful when searching JSON document collections of varying structure.
Table 9.47 shows the functions that are available for constructing json and jsonb values.
Table 9.47. JSON Creation Functions
Function
Description
Example(s)
to_json ( anyelement ) → json
to_jsonb ( anyelement ) → jsonb
Converts any SQL value to json or jsonb. Arrays and composites are converted recursively to arrays and objects (multidimensional arrays become arrays of arrays in JSON). Otherwise, if there is a cast from the SQL data type to json, the cast function will be used to perform the conversion;[a] otherwise, a scalar JSON value is produced. For any scalar other than a number, a Boolean, or a null value, the text representation will be used, with escaping as necessary to make it a valid JSON string value.
to_json('Fred said "Hi."'::text) → "Fred said \"Hi.\""
to_jsonb(row(42, 'Fred said "Hi."'::text)) → {"f1": 42, "f2": "Fred said \"Hi.\""}
array_to_json ( anyarray [, boolean ] ) → json
Converts an SQL array to a JSON array. The behavior is the same as to_json except that line feeds will be added between top-level array elements if the optional boolean parameter is true.
Converts an SQL composite value to a JSON object. The behavior is the same as to_json except that line feeds will be added between top-level elements if the optional boolean parameter is true.
row_to_json(row(1,'foo')) → {"f1":1,"f2":"foo"}
json_build_array ( VARIADIC"any" ) → json
jsonb_build_array ( VARIADIC"any" ) → jsonb
Builds a possibly-heterogeneously-typed JSON array out of a variadic argument list. Each argument is converted as per to_json or to_jsonb.
Builds a JSON object out of a variadic argument list. By convention, the argument list consists of alternating keys and values. Key arguments are coerced to text; value arguments are converted as per to_json or to_jsonb.
Builds a JSON object out of a text array. The array must have either exactly one dimension with an even number of members, in which case they are taken as alternating key/value pairs, or two dimensions such that each inner array has exactly two elements, which are taken as a key/value pair. All values are converted to JSON strings.
[a] For example, the hstore extension has a cast from hstore to json, so that hstore values converted via the JSON creation functions will be represented as JSON objects, not as primitive string values.
Table 9.48 shows the functions that are available for processing json and jsonb values.
Table 9.48. JSON Processing Functions
Function
Description
Example(s)
json_array_elements ( json ) → setof json
jsonb_array_elements ( jsonb ) → setof jsonb
Expands the top-level JSON array into a set of JSON values.
select * from json_array_elements('[1,true, [2,false]]') →
value
-----------
1
true
[2,false]
json_array_elements_text ( json ) → setof text
jsonb_array_elements_text ( jsonb ) → setof text
Expands the top-level JSON array into a set of text values.
select * from json_array_elements_text('["foo", "bar"]') →
value
-----------
foo
bar
json_array_length ( json ) → integer
jsonb_array_length ( jsonb ) → integer
Returns the number of elements in the top-level JSON array.
Extracts JSON sub-object at the specified path. (This is functionally equivalent to the #> operator, but writing the path out as a variadic list can be more convenient in some cases.)
Expands the top-level JSON object to a row having the composite type of the base argument. The JSON object is scanned for fields whose names match column names of the output row type, and their values are inserted into those columns of the output. (Fields that do not correspond to any output column name are ignored.) In typical use, the value of base is just NULL, which means that any output columns that do not match any object field will be filled with nulls. However, if base isn't NULL then the values it contains will be used for unmatched columns.
To convert a JSON value to the SQL type of an output column, the following rules are applied in sequence:
A JSON null value is converted to an SQL null in all cases.
If the output column is of type json or jsonb, the JSON value is just reproduced exactly.
If the output column is a composite (row) type, and the JSON value is a JSON object, the fields of the object are converted to columns of the output row type by recursive application of these rules.
Likewise, if the output column is an array type and the JSON value is a JSON array, the elements of the JSON array are converted to elements of the output array by recursive application of these rules.
Otherwise, if the JSON value is a string, the contents of the string are fed to the input conversion function for the column's data type.
Otherwise, the ordinary text representation of the JSON value is fed to the input conversion function for the column's data type.
While the example below uses a constant JSON value, typical use would be to reference a json or jsonb column laterally from another table in the query's FROM clause. Writing json_populate_record in the FROM clause is good practice, since all of the extracted columns are available for use without duplicate function calls.
create type subrowtype as (d int, e text);create type myrowtype as (a int, b text[], c subrowtype);
select * from json_populate_record(null::myrowtype, '{"a": 1, "b": ["2", "a b"], "c": {"d": 4, "e": "a b c"}, "x": "foo"}') →
a | b | c
---+-----------+-------------
1 | {2,"a b"} | (4,"a b c")
Expands the top-level JSON array of objects to a set of rows having the composite type of the base argument. Each element of the JSON array is processed as described above for json[b]_populate_record.
create type twoints as (a int, b int);
select * from json_populate_recordset(null::twoints, '[{"a":1,"b":2}, {"a":3,"b":4}]') →
a | b
---+---
1 | 2
3 | 4
json_to_record ( json ) → record
jsonb_to_record ( jsonb ) → record
Expands the top-level JSON object to a row having the composite type defined by an AS clause. (As with all functions returning record, the calling query must explicitly define the structure of the record with an AS clause.) The output record is filled from fields of the JSON object, in the same way as described above for json[b]_populate_record. Since there is no input record value, unmatched columns are always filled with nulls.
create type myrowtype as (a int, b text);
select * from json_to_record('{"a":1,"b":[1,2,3],"c":[1,2,3],"e":"bar","r": {"a": 123, "b": "a b c"}}') as x(a int, b text, c int[], d text, r myrowtype) →
a | b | c | d | r
---+---------+---------+---+---------------
1 | [1,2,3] | {1,2,3} | | (123,"a b c")
json_to_recordset ( json ) → setof record
jsonb_to_recordset ( jsonb ) → setof record
Expands the top-level JSON array of objects to a set of rows having the composite type defined by an AS clause. (As with all functions returning record, the calling query must explicitly define the structure of the record with an AS clause.) Each element of the JSON array is processed as described above for json[b]_populate_record.
select * from json_to_recordset('[{"a":1,"b":"foo"}, {"a":"2","c":"bar"}]') as x(a int, b text) →
Returns target with the item designated by path replaced by new_value, or with new_value added if create_if_missing is true (which is the default) and the item designated by path does not exist. All earlier steps in the path must exist, or the target is returned unchanged. As with the path oriented operators, negative integers that appear in the path count from the end of JSON arrays. If the last path step is an array index that is out of range, and create_if_missing is true, the new value is added at the beginning of the array if the index is negative, or at the end of the array if it is positive.
If new_value is not NULL, behaves identically to jsonb_set. Otherwise behaves according to the value of null_value_treatment which must be one of 'raise_exception', 'use_json_null', 'delete_key', or 'return_target'. The default is 'use_json_null'.
Returns target with new_value inserted. If the item designated by the path is an array element, new_value will be inserted before that item if insert_after is false (which is the default), or after it if insert_after is true. If the item designated by the path is an object field, new_value will be inserted only if the object does not already contain that key. All earlier steps in the path must exist, or the target is returned unchanged. As with the path oriented operators, negative integers that appear in the path count from the end of JSON arrays. If the last path step is an array index that is out of range, the new value is added at the beginning of the array if the index is negative, or at the end of the array if it is positive.
Checks whether the JSON path returns any item for the specified JSON value. If the vars argument is specified, it must be a JSON object, and its fields provide named values to be substituted into the jsonpath expression. If the silent argument is specified and is true, the function suppresses the same errors as the @? and @@ operators do.
Returns the result of a JSON path predicate check for the specified JSON value. Only the first item of the result is taken into account. If the result is not Boolean, then NULL is returned. The optional vars and silent arguments act the same as for jsonb_path_exists.
Returns all JSON items returned by the JSON path for the specified JSON value. The optional vars and silent arguments act the same as for jsonb_path_exists.
Returns all JSON items returned by the JSON path for the specified JSON value, as a JSON array. The optional vars and silent arguments act the same as for jsonb_path_exists.
Returns the first JSON item returned by the JSON path for the specified JSON value. Returns NULL if there are no results. The optional vars and silent arguments act the same as for jsonb_path_exists.
These functions act like their counterparts described above without the _tz suffix, except that these functions support comparisons of date/time values that require timezone-aware conversions. The example below requires interpretation of the date-only value 2015-08-02 as a timestamp with time zone, so the result depends on the current TimeZone setting. Due to this dependency, these functions are marked as stable, which means these functions cannot be used in indexes. Their counterparts are immutable, and so can be used in indexes; but they will throw errors if asked to make such comparisons.
jsonb_path_exists_tz('["2015-08-01 12:00:00-05"]', '$[*] ? (@.datetime() < "2015-08-02".datetime())') → t
jsonb_pretty ( jsonb ) → text
Converts the given JSON value to pretty-printed, indented text.
jsonb_pretty('[{"f1":1,"f2":null}, 2]') →
[
{
"f1": 1,
"f2": null
},
2
]
json_typeof ( json ) → text
jsonb_typeof ( jsonb ) → text
Returns the type of the top-level JSON value as a text string. Possible types are object, array, string, number, boolean, and null. (The null result should not be confused with an SQL NULL; see the examples.)
json_typeof('-123.4') → number
json_typeof('null'::json) → null
json_typeof(NULL::json) IS NULL → t
9.16.2. The SQL/JSON Path Language
SQL/JSON path expressions specify the items to be retrieved from the JSON data, similar to XPath expressions used for SQL access to XML. In PostgreSQL, path expressions are implemented as the jsonpath data type and can use any elements described in Section 8.14.7.
JSON query functions and operators pass the provided path expression to the path engine for evaluation. If the expression matches the queried JSON data, the corresponding JSON item, or set of items, is returned. Path expressions are written in the SQL/JSON path language and can include arithmetic expressions and functions.
A path expression consists of a sequence of elements allowed by the jsonpath data type. The path expression is normally evaluated from left to right, but you can use parentheses to change the order of operations. If the evaluation is successful, a sequence of JSON items is produced, and the evaluation result is returned to the JSON query function that completes the specified computation.
To refer to the JSON value being queried (the context item), use the $ variable in the path expression. It can be followed by one or more accessor operators, which go down the JSON structure level by level to retrieve sub-items of the context item. Each operator that follows deals with the result of the previous evaluation step.
For example, suppose you have some JSON data from a GPS tracker that you would like to parse, such as:
To retrieve the available track segments, you need to use the .key accessor operator to descend through surrounding JSON objects:
$.track.segments
To retrieve the contents of an array, you typically use the [*] operator. For example, the following path will return the location coordinates for all the available track segments:
$.track.segments[*].location
To return the coordinates of the first segment only, you can specify the corresponding subscript in the [] accessor operator. Recall that JSON array indexes are 0-relative:
$.track.segments[0].location
The result of each path evaluation step can be processed by one or more jsonpath operators and methods listed in Section 9.16.2.2. Each method name must be preceded by a dot. For example, you can get the size of an array:
$.track.segments.size()
More examples of using jsonpath operators and methods within path expressions appear below in Section 9.16.2.2.
When defining a path, you can also use one or more filter expressions that work similarly to the WHERE clause in SQL. A filter expression begins with a question mark and provides a condition in parentheses:
? (condition)
Filter expressions must be written just after the path evaluation step to which they should apply. The result of that step is filtered to include only those items that satisfy the provided condition. SQL/JSON defines three-valued logic, so the condition can be true, false, or unknown. The unknown value plays the same role as SQL NULL and can be tested for with the is unknown predicate. Further path evaluation steps use only those items for which the filter expression returned true.
The functions and operators that can be used in filter expressions are listed in Table 9.50. Within a filter expression, the @ variable denotes the value being filtered (i.e., one result of the preceding path step). You can write accessor operators after @ to retrieve component items.
For example, suppose you would like to retrieve all heart rate values higher than 130. You can achieve this using the following expression:
$.track.segments[*].HR ? (@ > 130)
To get the start times of segments with such values, you have to filter out irrelevant segments before returning the start times, so the filter expression is applied to the previous step, and the path used in the condition is different:
$.track.segments[*] ? (@.HR > 130)."start time"
You can use several filter expressions in sequence, if required. For example, the following expression selects start times of all segments that contain locations with relevant coordinates and high heart rate values:
Using filter expressions at different nesting levels is also allowed. The following example first filters all segments by location, and then returns high heart rate values for these segments, if available:
This expression returns the size of the track if it contains any segments with high heart rate values, or an empty sequence otherwise.
PostgreSQL's implementation of the SQL/JSON path language has the following deviations from the SQL/JSON standard:
A path expression can be a Boolean predicate, although the SQL/JSON standard allows predicates only in filters. This is necessary for implementation of the @@ operator. For example, the following jsonpath expression is valid in PostgreSQL:
$.track.segments[*].HR < 70
There are minor differences in the interpretation of regular expression patterns used in like_regex filters, as described in Section 9.16.2.3.
9.16.2.1. Strict And Lax Modes
When you query JSON data, the path expression may not match the actual JSON data structure. An attempt to access a non-existent member of an object or element of an array results in a structural error. SQL/JSON path expressions have two modes of handling structural errors:
lax (default) — the path engine implicitly adapts the queried data to the specified path. Any remaining structural errors are suppressed and converted to empty SQL/JSON sequences.
strict — if a structural error occurs, an error is raised.
The lax mode facilitates matching of a JSON document structure and path expression if the JSON data does not conform to the expected schema. If an operand does not match the requirements of a particular operation, it can be automatically wrapped as an SQL/JSON array or unwrapped by converting its elements into an SQL/JSON sequence before performing this operation. Besides, comparison operators automatically unwrap their operands in the lax mode, so you can compare SQL/JSON arrays out-of-the-box. An array of size 1 is considered equal to its sole element. Automatic unwrapping is not performed only when:
The path expression contains type() or size() methods that return the type and the number of elements in the array, respectively.
The queried JSON data contain nested arrays. In this case, only the outermost array is unwrapped, while all the inner arrays remain unchanged. Thus, implicit unwrapping can only go one level down within each path evaluation step.
For example, when querying the GPS data listed above, you can abstract from the fact that it stores an array of segments when using the lax mode:
lax $.track.segments.location
In the strict mode, the specified path must exactly match the structure of the queried JSON document to return an SQL/JSON item, so using this path expression will cause an error. To get the same result as in the lax mode, you have to explicitly unwrap the segments array:
strict $.track.segments[*].location
The .** accessor can lead to surprising results when using the lax mode. For instance, the following query selects every HR value twice:
lax $.**.HR
This happens because the .** accessor selects both the segments array and each of its elements, while the .HR accessor automatically unwraps arrays when using the lax mode. To avoid surprising results, we recommend using the .** accessor only in the strict mode. The following query selects each HR value just once:
strict $.**.HR
9.16.2.2. SQL/JSON Path Operators And Methods
Table 9.49 shows the operators and methods available in jsonpath. Note that while the unary operators and methods can be applied to multiple values resulting from a preceding path step, the binary operators (addition etc.) can only be applied to single values.
Table 9.49. jsonpath Operators and Methods
Operator/Method
Description
Example(s)
number+number → number
Addition
jsonb_path_query('[2]', '$[0] + 3') → 5
+number → number
Unary plus (no operation); unlike addition, this can iterate over multiple values
The object's key-value pairs, represented as an array of objects containing three fields: "key", "value", and "id"; "id" is a unique identifier of the object the key-value pair belongs to
The result type of the datetime() and datetime(template) methods can be date, timetz, time, timestamptz, or timestamp. Both methods determine their result type dynamically.
The datetime() method sequentially tries to match its input string to the ISO formats for date, timetz, time, timestamptz, and timestamp. It stops on the first matching format and emits the corresponding data type.
The datetime(template) method determines the result type according to the fields used in the provided template string.
The datetime() and datetime(template) methods use the same parsing rules as the to_timestamp SQL function does (see Section 9.8), with three exceptions. First, these methods don't allow unmatched template patterns. Second, only the following separators are allowed in the template string: minus sign, period, solidus (slash), comma, apostrophe, semicolon, colon and space. Third, separators in the template string must exactly match the input string.
If different date/time types need to be compared, an implicit cast is applied. A date value can be cast to timestamp or timestamptz, timestamp can be cast to timestamptz, and time to timetz. However, all but the first of these conversions depend on the current TimeZone setting, and thus can only be performed within timezone-aware jsonpath functions.
Table 9.50 shows the available filter expression elements.