9.14. XML函式
版本:11
本節中描述的函數和類函數表示式對 xml 型別的值進行操作。有關 xml 型別的訊息,請查看第 8.13 節。這裡不再重複用於轉換為 xml 型別的函數表示式 xmlparse 和 xmlserialize。使用大多數這些函數需要使用 configure --with-libxml 編譯安裝。
9.14.1. 産生 XML 內容
一組函數和類函數的表示式可用於從 SQL 資料産生 XML 內容。因此,它們特別適合將查詢結果格式化為 XML 文件以便在用戶端應用程序中進行處理。
9.14.1.1. xmlcomment
函數 xmlcomment 建立一個 XML 字串,其中包含指定文字作為內容的 XML 註釋。文字不能包含「 -- 」或以「 - 」結尾,以便産生的結構是有效的 XML 註釋。 如果參數為 null,則結果為 null。
例如:
9.14.1.2. xmlconcat
函數 xmlconcat 連接列表中各個 XML 字串,以建立包含 XML 內容片段的單個字串。空值會被忽略;如果都沒有非空值參數,則結果僅為 null。
例如:
XML 宣告(如果存在)組合如下。如果所有參數值具有相同的 XML 版本宣告,則在結果中使用該版本,否則不使用任何版本。如果所有參數值都具有獨立宣告值「yes」,則在結果中使用該值。如果所有參數值都具有獨立的宣告值且至少有一個為「no」,則在結果中使用該值。否則結果將沒有獨立宣告。如果確定結果需要獨立宣告但沒有版本聲明,則將使用版本為 1.0 的版本宣告,因為 XML 要求 XML 宣告包含版本宣告。在所有情況下都會忽略編碼宣告並將其刪除。
例如:
9.14.1.3. xmlelement
xmlelement 表示式産生具有給定名稱、屬性和內容的 XML 元素。
範例:
透過用 xHHHH 序列替換有問題的字符來轉譯非有效 XML 名稱的元素和屬性名稱,其中 HHHH 是十六進位表示法中字元的 Unicode 代碼。例如:
如果屬性值是引用欄位,則無需明確指定屬性名稱,在這種情況下,預設情況下欄位的名稱將用作屬性名稱。在其他情況下,必須為該屬性明確指定名稱。所以這個例子是有效的:
但這些不行:
元素內容(如果已指定)將根據其資料型別進行格式化。如果內容本身是 xml 型別,則可以建構複雜的 XML 文件。例如:
其他型別的內容將被格式化為有效的 XML 字元資料。這尤其意味著字符 <、> 和 & 將被轉換為其他形式。二進位資料(資料型別 bytea)將以 base64 或十六進位編碼表示,具體取決於組態參數 xmlbinary 的設定。為了使 SQL 和 PostgreSQL 資料型別與 XML Schema 規範保持一致,預計各種資料型別的特定行為將會各自發展,此時將出現更精確的描述。
9.14.1.4. xmlforest
xmlforest 表示式使用給定的名稱和內容産生元素的 XML 序列。
範例:
如第二個範例所示,如果內容值是欄位引用,則可以省略元素名稱,在這種情況下,預設情況下使用欄位名稱。 否則,必須指定名稱。
非有效的 XML 名稱的元素名稱將被轉譯,如上面的 xmlelement 所示。類似地,內容資料會被轉譯以産生有效的 XML 內容,除非它已經是 xml 型別。
請注意,如果 XML 序列由多個元素組成,則它們不是有效的 XML 文件,因此將 xmlforest 表示式包裝在 xmlelement 中可能很有用。
9.14.1.5. xmlpi
xmlpi 表示式建立 XML 處理指令。內容(如果存在)不得包含字元序列 ?>。
例如:
9.14.1.6. xmlroot
xmlroot 表示式改變 XML 值的根節點屬性。如果指定了版本,它將替換根節點的版本宣告中的值;如果指定了獨立設定,則它將替換根節點的獨立宣告中的值。
9.14.1.7. xmlagg
與此處描述的其他函數不同,函數 xmlagg 是一個彙總函數。它將輸入值連接到彙總函數呼叫,就像 xmlconcat 一樣,除了它是跨資料列而不是在單個資料列中的表示式進行連接。有關彙總函數的其他訊息,請參閱第 9.20 節。
例如:
要確定連接的順序,可以將 ORDER BY 子句加到彙總呼叫中,如第 4.2.7 節中所述。例如:
以前的版本中推薦使用以下非標準方法,在特定情況下可能仍然有用:
9.14.2. XML Predicates
本節中描述的表示式用於檢查 xml 的屬性。
9.14.2.1. IS DOCUMENT
如果參數 XML 是正確的 XML 文件,則表示式 IS DOCUMENT 將回傳 true,如果不是(它是內容片段),則回傳 false;如果參數為 null,則回傳 null。有關文件和內容片段之間的區別,請參閱第 8.13 節。
9.14.2.2. XMLEXISTS
如果第一個參數中的 XPath 表示式回傳任何節點,則 xmlexists 函數回傳 true,否則回傳 false。 (如果任一參數為 null,則結果為 null。)
範例
BY REF 子句在 PostgreSQL 中沒有任何作用,但可以達到 SQL 一致性和與其他實作的相容性。根據 SQL 標準,第一個 BY REF 是必需的,第二個是選擇性的。另請注意,SQL 標準指定 xmlexists 構造將 XQuery 表示式作為第一個參數,但 PostgreSQL 目前僅支持 XPath,它是 XQuery 的子集。
9.14.2.3. xml_is_well_formed
此函數檢查文字字串是否格式正確,回傳布林結果。xml_is_well_formed_document 檢查格式正確的文檔,而 xml_is_well_formed_content 檢查格式良好的內容。如果 xmloption 配置參數設定為 DOCUMENT,則 xml_is_well_formed 會執行前者;如果設定為 CONTENT,則執行後者。這意味著 xml_is_well_formed 對於查看對 xml 類型的簡單強制轉換是否成功很有用,而其他兩個函數對於查看 XMLPARSE 的相對應變數是否成功很有用。
範例:
最後一個範例顯示檢查包括命名空間是否符合。
9.14.3. 處理 XML
為了處理資料型別為 xml 的值,PostgreSQL 提供了 xpath 和 xpath_exists 函數,它們用於計算 XPath 1.0 表示式和 XMLTABLE 資料表函數。
9.14.3.1. xpath
函數 xpath 根據 XML 值 xml 計算 XPath 表示式 xpath(字串)。 它回傳與 XPath 表示式產生的節點集合所相對應 XML 值的陣列。如果 XPath 表示式回傳單一變數值而不是節點集合,則回傳單個元素的陣列。
第二個參數必須是格式良好的 XML 內容。特別要注意是,它必須具有單一根節點元素。
該函數的選擇性第三個參數是命名空間對應的陣列。該陣列應該是二維字串陣列,第二維的長度等於 2(即,它應該是陣列的陣列,每個陣列恰好由 2 個元素組成)。每個陣列項目的第一個元素是命名空間名稱(別名),第二個是命名空間 URI。不要求此陣列中提供的別名與 XML 內容本身所使用的別名相同(換句話說,在 XML 內容和 xpath 函數內容中,別名都是區域性的)。
例如:
要設定預設的(匿名)命名空間,請執行以下操作:
9.14.3.2. xpath_exists
The function xpath_exists
is a specialized form of the xpath
function. Instead of returning the individual XML values that satisfy the XPath, this function returns a Boolean indicating whether the query was satisfied or not. This function is equivalent to the standard XMLEXISTS
predicate, except that it also offers support for a namespace mapping argument.
Example:
9.14.3.3. xmltable
The xmltable
function produces a table based on the given XML value, an XPath filter to extract rows, and an optional set of column definitions.
The optional XMLNAMESPACES
clause is a comma-separated list of namespaces. It specifies the XML namespaces used in the document and their aliases. A default namespace specification is not currently supported.
The required row_expression
argument is an XPath expression that is evaluated against the supplied XML document to obtain an ordered sequence of XML nodes. This sequence is what xmltable
transforms into output rows.
document_expression
provides the XML document to operate on. The BY REF
clauses have no effect in PostgreSQL, but are allowed for SQL conformance and compatibility with other implementations. The argument must be a well-formed XML document; fragments/forests are not accepted.
The mandatory COLUMNS
clause specifies the list of columns in the output table. If the COLUMNS
clause is omitted, the rows in the result set contain a single column of type xml
containing the data matched by row_expression
. If COLUMNS
is specified, each entry describes a single column. See the syntax summary above for the format. The column name and type are required; the path, default and nullability clauses are optional.
A column marked FOR ORDINALITY
will be populated with row numbers matching the order in which the output rows appeared in the original input XML document. At most one column may be marked FOR ORDINALITY
.
The column_expression
for a column is an XPath expression that is evaluated for each row, relative to the result of the row_expression
, to find the value of the column. If no column_expression
is given, then the column name is used as an implicit path.
If a column's XPath expression returns multiple elements, an error is raised. If the expression matches an empty tag, the result is an empty string (not NULL
). Any xsi:nil
attributes are ignored.
The text body of the XML matched by the column_expression
is used as the column value. Multiple text()
nodes within an element are concatenated in order. Any child elements, processing instructions, and comments are ignored, but the text contents of child elements are concatenated to the result. Note that the whitespace-only text()
node between two non-text elements is preserved, and that leading whitespace on a text()
node is not flattened.
If the path expression does not match for a given row but default_expression
is specified, the value resulting from evaluating that expression is used. If no DEFAULT
clause is given for the column, the field will be set to NULL
. It is possible for a default_expression
to reference the value of output columns that appear prior to it in the column list, so the default of one column may be based on the value of another column.
Columns may be marked NOT NULL
. If the column_expression
for a NOT NULL
column does not match anything and there is no DEFAULT
or the default_expression
also evaluates to null, an error is reported.
Unlike regular PostgreSQL functions, column_expression
and default_expression
are not evaluated to a simple value before calling the function. column_expression
is normally evaluated exactly once per input row, and default_expression
is evaluated each time a default is needed for a field. If the expression qualifies as stable or immutable the repeat evaluation may be skipped. Effectively xmltable
behaves more like a subquery than a function call. This means that you can usefully use volatile functions like nextval
in default_expression
, and column_expression
may depend on other parts of the XML document.
Examples:
The following example shows concatenation of multiple text() nodes, usage of the column name as XPath filter, and the treatment of whitespace, XML comments and processing instructions:
The following example illustrates how the XMLNAMESPACES
clause can be used to specify the default namespace, and a list of additional namespaces used in the XML document as well as in the XPath expressions:
9.14.4. Mapping Tables to XML
The following functions map the contents of relational tables to XML values. They can be thought of as XML export functionality:
The return type of each function is xml
.
table_to_xml
maps the content of the named table, passed as parameter tbl
. The regclass
type accepts strings identifying tables using the usual notation, including optional schema qualifications and double quotes. query_to_xml
executes the query whose text is passed as parameter query
and maps the result set. cursor_to_xml
fetches the indicated number of rows from the cursor specified by the parameter cursor
. This variant is recommended if large tables have to be mapped, because the result value is built up in memory by each function.
If tableforest
is false, then the resulting XML document looks like this:
If tableforest
is true, the result is an XML content fragment that looks like this:
If no table name is available, that is, when mapping a query or a cursor, the string table
is used in the first format, row
in the second format.
The choice between these formats is up to the user. The first format is a proper XML document, which will be important in many applications. The second format tends to be more useful in the cursor_to_xml
function if the result values are to be reassembled into one document later on. The functions for producing XML content discussed above, in particular xmlelement
, can be used to alter the results to taste.
The data values are mapped in the same way as described for the function xmlelement
above.
The parameter nulls
determines whether null values should be included in the output. If true, null values in columns are represented as:
where xsi
is the XML namespace prefix for XML Schema Instance. An appropriate namespace declaration will be added to the result value. If false, columns containing null values are simply omitted from the output.
The parameter targetns
specifies the desired XML namespace of the result. If no particular namespace is wanted, an empty string should be passed.
The following functions return XML Schema documents describing the mappings performed by the corresponding functions above:
It is essential that the same parameters are passed in order to obtain matching XML data mappings and XML Schema documents.
The following functions produce XML data mappings and the corresponding XML Schema in one document (or forest), linked together. They can be useful where self-contained and self-describing results are wanted:
In addition, the following functions are available to produce analogous mappings of entire schemas or the entire current database:
Note that these potentially produce a lot of data, which needs to be built up in memory. When requesting content mappings of large schemas or databases, it might be worthwhile to consider mapping the tables separately instead, possibly even through a cursor.
The result of a schema content mapping looks like this:
where the format of a table mapping depends on the tableforest
parameter as explained above.
The result of a database content mapping looks like this:
where the schema mapping is as above.
As an example of using the output produced by these functions, Figure 9.1 shows an XSLT stylesheet that converts the output of table_to_xml_and_xmlschema
to an HTML document containing a tabular rendition of the table data. In a similar manner, the results from these functions can be converted into other XML-based formats.
Figure 9.1. XSLT Stylesheet for Converting SQL/XML Output to HTML
Last updated