
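15 Introduction

This manual is provided by the Taiwan PostgreSQL community. It is translated from the official PostgreSQL manual to promote the use of PostgreSQL in Taiwan.

The content currently corresponds to PostgreSQL 15.

Every page links to the corresponding original page, so passages whose translation is incomplete can be read side by side with the source. Paragraphs that have not been translated yet are temporarily given in the original English.

This manual is an open-source project with open participation, and every contributor is welcome. Each page has an "Edit on GitHub" link in its upper right corner; make your change and submit a PR. (Translating even a single sentence is fine!)

For download and installation instructions, choose the steps that match your environment.

Questions and suggestions can be emailed to our documentation team: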

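Preface

This book is the official documentation of PostgreSQL. It has been written by the PostgreSQL developers and other volunteers in parallel to the development of the PostgreSQL software. It describes all the functionality that the current version of PostgreSQL officially supports.

To make the large amount of information about PostgreSQL manageable, this book has been organized in several parts. Each part is targeted at a different class of users, or at users in different stages of their PostgreSQL experience:

Part I is an informal introduction for new users.
Part II documents the SQL query language environment, including data types, functions, and user-level performance tuning. Every PostgreSQL user should read this.
Part III describes the installation and administration of the server. Everyone who runs a PostgreSQL server, be it for private use or for others, should read this part.
Part IV describes the programming interfaces for PostgreSQL client programs.
Part V contains information for advanced users about the extensibility capabilities of the server. Topics include user-defined data types and functions.
Part VI contains reference information about SQL commands, client and server programs. This part is organized by command or program.
Part VII contains assorted information that might be of use to PostgreSQL developers.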

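1. What Is PostgreSQL?

PostgreSQL is an object-relational database management system (ORDBMS) based on POSTGRES, Version 4.2, which was developed at the University of California at Berkeley Computer Science Department. POSTGRES pioneered many concepts that only became available in some commercial database systems much later.

PostgreSQL is an open-source descendant of this original Berkeley code. It supports a large part of the SQL standard and offers many modern features:

complex queries
foreign keys
triggers
updatable views
transactional integrity
multiversion concurrency control

PostgreSQL can also be extended by the user in many ways, for example by adding new:

data types
functions
operators
aggregate functions

And because of its liberal license, PostgreSQL can be used, modified, and distributed by anyone free of charge for any purpose, be it private, commercial, or academic.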

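2. A Brief History of PostgreSQL

The object-relational database management system now known as PostgreSQL is derived from the POSTGRES package written at the University of California at Berkeley. With more than two decades of development behind it, PostgreSQL is now the most advanced open-source database available anywhere.

2.1. The Berkeley POSTGRES Project

The POSTGRES project, led by Professor Michael Stonebraker, was sponsored by the Defense Advanced Research Projects Agency (DARPA), the Army Research Office (ARO), the National Science Foundation (NSF), and ESL, Inc. The implementation of POSTGRES began in 1986. The initial concepts for the system were presented in "The design of POSTGRES", and the definition of the initial data model appeared in "The POSTGRES data model". The design of the rule system at that time was described in "The design of the POSTGRES rules system". The rationale and architecture of the storage manager were detailed in "The design of the POSTGRES storage system".

POSTGRES then underwent several major releases. The first "demoware" system became operational in 1987 and was shown at the 1988 ACM-SIGMOD Conference. Version 1 was released to external users in June 1989. In response to a critique of the first rule system, the rule system was redesigned, and Version 2 was released in June 1990 with the new rule system. Version 3 appeared in 1991 and added support for multiple storage managers, an improved query executor, and a rewritten rule system. For the most part, subsequent releases until Postgres95 focused on portability and reliability.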

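3. Conventions

The following conventions are used in the synopsis of a command (all are half-width characters):

Brackets ([ and ]) indicate optional parts. (In the synopsis of a Tcl command, question marks (?) are used instead, as is usual in Tcl.)
Braces ({ and }) and vertical lines (|) indicate that you must choose one alternative.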

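4. Further Information

Besides this documentation, there are other resources about PostgreSQL:

Wiki

The PostgreSQL wiki contains the project's FAQ (Frequently Asked Questions) list, TODO list, and detailed information about many more topics.

The PostgreSQL wiki also has pages in Taiwanese Mandarin.

Web Site

The PostgreSQL web site carries details on the latest release and other information to make your work with PostgreSQL more productive.

Mailing Lists

The mailing lists are a good place to have your questions answered, to share experiences with other users, and to contact the developers. Consult the PostgreSQL web site for details.

Yourself!

PostgreSQL is an open-source project. As such, it depends on the user community for ongoing support. As you begin to use PostgreSQL, you will rely on others for help, either through the documentation or through the mailing lists. Consider contributing your knowledge back. Read the mailing lists and answer questions. If you learn something that is not in the documentation, write it up and contribute it. If you add features to the code, contribute them.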

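I. Tutorial

Welcome to the PostgreSQL Tutorial. This part gives a simple introduction to PostgreSQL, relational database concepts, and the SQL language. We only assume some general knowledge about how to use computers; no particular Unix or programming experience is required. This part is mainly intended to give you some hands-on experience with important aspects of the PostgreSQL system. It makes no attempt to be a complete or thorough treatment of the topics it covers.

After you have worked through this tutorial you might want to move on to Part II to gain a more formal knowledge of the SQL language, or to Part IV for information about developing applications for PostgreSQL. Those who set up and manage their own server should also read Part III.

1. Getting Started

1.1. Installation: install a PostgreSQL database system from scratch.
1.2. Architectural Fundamentals: get to know the PostgreSQL architecture.
1.3. Creating a Database: create your first PostgreSQL database.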

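1.1. Installation

Version: 11

Before you can use PostgreSQL you need to install it. It is possible that PostgreSQL is already installed at your site, either because it was included in your operating system distribution or because the system administrator already installed it. If that is the case, you should obtain information from the operating system documentation or your system administrator about how to access PostgreSQL.

If you are not sure whether PostgreSQL is already available, you can install it yourself. Doing so is not hard and it can be a good exercise. PostgreSQL can be installed by any unprivileged user; no superuser (root) access is required.

If you are installing PostgreSQL yourself, follow the installation instructions, and return to this guide when the installation is complete. Be sure to follow closely the part about setting up the appropriate environment variables.

If your site administrator has not set things up in the default way, you might have some more work to do. For example, if the database server machine is a remote machine, you will need to set the PGHOST environment variable to the name of the database server machine. The environment variable PGPORT might also have to be set. The bottom line is this: if you try to start an application program and it complains that it cannot connect to the database, you should consult your site administrator or, if that is you, the documentation to make sure that your environment is properly set up. If you did not understand the preceding paragraph then read the next section.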

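1.2. Architectural Fundamentals

Version: 11

Before we proceed, you should understand the basic PostgreSQL system architecture. Understanding how the parts of PostgreSQL interact will make the rest of this chapter somewhat clearer.

In database jargon, PostgreSQL uses a client/server model. A PostgreSQL session consists of the following cooperating processes (programs):

A server process, which manages the database files, accepts connections to the database from client applications, and performs database actions on behalf of the clients. The database server program is called postgres.
The user's client (frontend) application that wants to perform database operations. Client applications can be very diverse in nature: a client could be a text-oriented tool, a graphical application, a web server that accesses the database to display web pages, or a specialized database maintenance tool. Some client applications are supplied with the PostgreSQL distribution; most are developed by users.

As is typical of client/server applications, the client and the server can be on different hosts. In that case they communicate over a TCP/IP network connection. You should keep this in mind, because the files that can be accessed on a client machine might not be accessible (or might only be accessible using a different file name) on the database server machine.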

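1.3. Creating a Database

The first test to see whether you can access the database server is to try to create a database. A running PostgreSQL server can manage many databases. Typically, a separate database is used for each project or for each user.

Possibly, your site administrator has already created a database for your use. In that case you can omit this step and skip ahead to the next section.

To create a new database, in this example named mydb, you use the following command: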

$ createdb mydb

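If this produces no response then this step was successful and you can skip over the remainder of this section.

If you see a message similar to:

createdb: command not found

then PostgreSQL was not installed properly. Either it was not installed at all or your shell's search path was not set to include it. Try calling the command with an absolute path instead: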

$ /usr/local/pgsql/bin/createdb mydb

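The path at your site might be different. Contact your site administrator or check the installation instructions to correct the situation.

Another possible response is an error saying that createdb could not connect to the server. This means that the server was not started, or it was not started where createdb expected it. Again, check the installation instructions or consult the administrator.

Another possible response is an error saying that a role does not exist, where the message names the user you tried to connect as. This happens if the administrator has not created a PostgreSQL user account for you. (PostgreSQL user accounts are distinct from operating system user accounts.) If you are the administrator, see the documentation on database roles for help creating accounts. You will need to become the operating system user under which PostgreSQL was installed (usually postgres) to create the first user account. It could also be that you were assigned a PostgreSQL user name that is different from your operating system user name; in that case you need to use the -U switch or set the PGUSER environment variable to specify your PostgreSQL user name.

If you have a user account but it does not have the privileges required to create a database, you will see an error saying that permission to create the database was denied.

Not every user has authorization to create new databases. If PostgreSQL refuses to create databases for you then the site administrator needs to grant you permission to create databases. Consult your site administrator if this occurs. If you installed PostgreSQL yourself then you should log in for the purposes of this tutorial under the user account that you started the server as.

You can also create databases with other names. PostgreSQL allows you to create any number of databases at a given site. Database names must have an alphabetic first character and are limited to 63 bytes in length. A convenient choice is to create a database with the same name as your current user name. Many tools assume that database name as the default, so it can save you some typing. To create that database, simply type:

$ createdb

If you do not want to use your database anymore you can remove it. For example, if you are the owner (creator) of the database mydb, you can destroy it using the following command:

$ dropdb mydb

(For this command, the database name does not default to the user account name. You always need to specify it.) This action physically removes all files associated with the database and cannot be undone, so this should only be done with a great deal of forethought.

More about createdb and dropdb can be found in their respective reference pages.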

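1.4. Accessing a Database

Once you have created a database, you can access it by:

Running the PostgreSQL interactive terminal program, called psql, which allows you to interactively enter, edit, and execute SQL commands.
Using an existing graphical frontend tool like pgAdmin or an office suite with ODBC or JDBC support to create and manipulate a database. These possibilities are not covered in this tutorial.
Writing a custom application, using one of the several available language bindings. These possibilities are discussed further in Part IV.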

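2. The SQL Language

This chapter is aimed at readers who are new to databases. It uses simple examples to show, hands-on, how a database works. More complex database operations follow the same basic patterns.

2.1. Introduction

This chapter provides an overview of how to use SQL to perform simple operations. It is only intended to give you an introduction and is in no way a complete tutorial on SQL. Numerous books have been written on SQL, including "Understanding the New SQL: A Complete Guide" and "A Guide to the SQL Standard: A user's guide to the standard database language SQL". You should be aware that some PostgreSQL language features are extensions to the standard.

In the examples that follow, we assume that you have created a database named mydb, as described in the previous chapter, and have been able to start psql.

The examples in this manual can also be found in the PostgreSQL source distribution in the directory src/tutorial/. (Binary distributions of PostgreSQL might not provide those files.) To use those files, first change to that directory and run make.

This creates the scripts and compiles the C files containing user-defined functions and types. Then, to start the tutorial, start psql on mydb with the -s option and run \i basics.sql at its prompt.

The \i command reads in commands from the specified file. psql's -s option puts you in single step mode, which pauses before sending each statement to the server. The commands used in this section are in the file basics.sql.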

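2.2. Concepts

PostgreSQL is a relational database management system (RDBMS). That means it is a system for managing data stored in relations. Relation is essentially a mathematical term for table. The notion of storing data in tables is so commonplace today that it might seem inherently obvious, but there are a number of other ways of organizing databases. Files and directories on Unix-like operating systems form an example of a hierarchical database. A more modern development is the object-oriented database.

Each table is a named collection of rows. Each row of a given table has the same set of named columns, and each column is of a specific data type. Whereas columns have a fixed order in each row, it is important to remember that SQL does not guarantee the order of the rows within the table in any way (although they can be explicitly sorted for display).

Tables are grouped into databases, and a collection of databases managed by a single PostgreSQL server instance constitutes a database cluster.

2.3. Creating a New Table

You can create a new table by specifying the table name, along with all column names and their types: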

CREATE TABLE weather (
    city            varchar(80),
    temp_lo         int,           -- low temperature
    temp_hi         int,           -- high temperature
    prcp            real,          -- precipitation
    date            date
);

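You can enter this into psql with the line breaks. psql will recognize that the command is not terminated until the semicolon.

White space (i.e., spaces, tabs, and newlines) can be used freely in SQL commands. That means you can type the command aligned differently than above, or even all on one line. Two dashes ("--") introduce comments. Whatever follows them is ignored up to the end of the line. SQL is case insensitive about key words and identifiers, except when identifiers are double-quoted to preserve the case. (More precisely, identifiers that are not double-quoted are folded to lower case.)

varchar(80) specifies a data type that can store arbitrary character strings up to 80 characters in length. int is the normal integer type. real is a type for storing single precision floating-point numbers. date should be self-explanatory: it stores a date. (Yes, the column of type date is also named date. This might be convenient or confusing; you choose.)

PostgreSQL supports the standard SQL types int, smallint, real, double precision, char(N), varchar(N), date, time, timestamp, and interval, as well as a rich set of geometric types. PostgreSQL can be customized with an arbitrary number of user-defined data types. Consequently, type names are not key words in the syntax, except where required to support special cases in the SQL standard.

The second example will store cities and their associated geographical location: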

CREATE TABLE cities (
    name            varchar(80),
    location        point
);

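The point type is an example of a PostgreSQL-specific data type.

Finally, it should be mentioned that if you don't need a table any longer or want to recreate it differently you can remove it using the following command: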

DROP TABLE tablename;

2.4. Populating a Table With Rows

The INSERT statement is used to populate a table with rows:

INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');

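Note that all data types use rather obvious input formats. Constants that are not simple numeric values usually must be surrounded by single quotes ('), as in the example. The date type is actually quite flexible in what it accepts, but for this tutorial we will stick to the unambiguous format shown here.

The point type requires a coordinate pair as input, as shown here: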

INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');

The syntax used so far requires you to remember the order of the columns. An alternative syntax allows you to list the columns explicitly:

INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
    VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29');

You can list the columns in a different order if you wish or even omit some columns, e.g., if the precipitation (prcp) is unknown:

INSERT INTO weather (date, city, temp_hi, temp_lo)
    VALUES ('1994-11-29', 'Hayward', 54, 37);

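Many developers consider explicitly listing the columns better style than relying on the order implicitly.

Please enter all the commands shown above so you have some data to work with in the following sections.

You could also have used COPY to load large amounts of data from flat-text files. This is usually faster because the COPY command is optimized for this application while allowing less flexibility than INSERT.

The source file must be available on the machine running the backend process, not the client, since the backend process reads the file directly, and it must be readable by the PostgreSQL server user (postgres). You can read more about the COPY command in its reference page.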
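As a minimal sketch of the COPY usage described above: the path /home/user/weather.txt is a hypothetical example, and the file is assumed to hold one weather row per line in COPY's default text format.

```sql
-- Assumption: /home/user/weather.txt is a hypothetical path on the SERVER
-- machine, readable by the operating system user that runs PostgreSQL.
COPY weather FROM '/home/user/weather.txt';

-- Check what was loaded.
SELECT count(*) FROM weather;
```

If the file lives on the client machine instead, psql's \copy meta-command reads it on the client side and sends the data to the server.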

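2.5. Querying a Table

To retrieve data from a table, the table is queried. An SQL SELECT statement is used to do this. The statement is divided into a select list (the part that lists the columns to be returned), a table list (the part that lists the tables from which to retrieve the data), and an optional qualification (the part that specifies any restrictions). For example, to retrieve all the rows of table weather, type: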

SELECT * FROM weather;

Here * is a shorthand for "all columns". So the same result would be had with:

SELECT city, temp_lo, temp_hi, prcp, date FROM weather;

The output should be:

     city      | temp_lo | temp_hi | prcp |    date
---------------+---------+---------+------+------------
 San Francisco |      46 |      50 | 0.25 | 1994-11-27
 San Francisco |      43 |      57 |    0 | 1994-11-29
 Hayward       |      37 |      54 |      | 1994-11-29
(3 rows)

You can write expressions, not just simple column references, in the select list. For example, you can do:

SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather;

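This should give a temp_avg of 48 and 50 for the two San Francisco rows and 45 for the Hayward row.

Notice how the AS clause is used to relabel the output column. (The AS clause is optional.)

A query can be "qualified" by adding a WHERE clause that specifies which rows are wanted. The WHERE clause contains a Boolean (truth value) expression, and only rows for which the Boolean expression is true are returned. The usual Boolean operators (AND, OR, and NOT) are allowed in the qualification. For example, the condition WHERE city = 'San Francisco' AND prcp > 0.0 retrieves the weather of San Francisco on rainy days.

With the data entered so far, the result is the single San Francisco row dated 1994-11-27.

You can request that the results of a query be returned in sorted order, for example with ORDER BY city.

In that example, the sort order is not fully specified, so you might get the San Francisco rows in either order. You will always get the same fixed result if you sort by both columns, with ORDER BY city, temp_lo.

You can request that duplicate rows be removed from the result of a query by writing SELECT DISTINCT.

Here again, the result row ordering might vary. You can ensure consistent results by using DISTINCT and ORDER BY together.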
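The qualified, sorted, and de-duplicated queries described in this section can be sketched as follows. They assume the weather table and the three rows inserted earlier in this chapter.

```sql
-- Weather of San Francisco on rainy days.
SELECT * FROM weather
    WHERE city = 'San Francisco' AND prcp > 0.0;

-- Sorted by city only: the order of rows with the same city is unspecified.
SELECT * FROM weather
    ORDER BY city;

-- Sorted by city, then by low temperature: a fully specified order.
SELECT * FROM weather
    ORDER BY city, temp_lo;

-- Remove duplicate rows from the result.
SELECT DISTINCT city
    FROM weather;

-- DISTINCT together with ORDER BY gives a consistent result ordering.
SELECT DISTINCT city
    FROM weather
    ORDER BY city;
```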

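2.6. Joins Between Tables

Thus far, our queries have only accessed one table at a time. Queries can access multiple tables at once, or access the same table in such a way that multiple rows of the table are being processed at the same time. A query that accesses multiple rows of the same or different tables at one time is called a join query. As an example, say you wish to list all the weather records together with the location of the associated city. To do that, we need to compare the city column of each row of the weather table with the name column of all rows in the cities table, and select the pairs of rows where these values match.

Note

This is only a conceptual model. The join is usually performed in a more efficient manner than actually comparing each possible pair of rows, but this is invisible to the user.

This is accomplished by a query that joins weather and cities on the condition city = name.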
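A join of this kind can be sketched as follows, assuming the weather and cities tables and the rows inserted in Section 2.4.

```sql
-- Pair every weather row with the cities row whose name matches its city.
SELECT * FROM weather JOIN cities ON city = name;

-- The same join with an explicit output list and qualified column names,
-- which avoids showing the city name twice.
SELECT weather.city, weather.temp_lo, weather.temp_hi,
       weather.prcp, weather.date, cities.location
    FROM weather JOIN cities ON weather.city = cities.name;
```

Qualifying every column name, as in the second query, keeps the query working even if a column with the same name is later added to one of the tables.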

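Observe two things about the result set: there is no result row for the city of Hayward, because the cities table has no matching entry for Hayward and the join ignores unmatched weather rows; and the city name appears in two columns, because the column lists of weather and cities are concatenated.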

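2.7. Aggregate Functions

Like most other relational database products, PostgreSQL supports aggregate functions. An aggregate function computes a single result from multiple input rows. For example, there are aggregates to compute the count, sum, avg (average), max (maximum), and min (minimum) over a set of rows.

As an example, we can find the highest low-temperature reading anywhere with SELECT max(temp_lo) FROM weather;

If we wanted to know what city (or cities) that reading occurred in, we might try putting max(temp_lo) in the WHERE clause.

But this will not work, since the aggregate max cannot be used in the WHERE clause. (This restriction exists because the WHERE clause determines which rows will be included in the aggregate calculation, so it has to be evaluated before aggregate functions are computed.) However, as is often the case, the query can be restated to accomplish the desired result, here by using a subquery.

This is OK because the subquery is an independent computation that computes its own aggregate separately from what is happening in the outer query.

Aggregates are also very useful in combination with GROUP BY clauses. For example, we can get the maximum low temperature observed in each city by grouping on city.

This gives us one output row per city. Each aggregate result is computed over the table rows matching that city. We can filter these grouped rows using HAVING.

For example, HAVING max(temp_lo) < 40 gives us the same results for only the cities that have all temp_lo values below 40. Finally, if we only care about cities whose names begin with "S", we can add WHERE city LIKE 'S%'.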
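The aggregate queries discussed in this section can be sketched as follows, assuming the weather table from Section 2.3.

```sql
-- Highest low-temperature reading anywhere.
SELECT max(temp_lo) FROM weather;

-- Not allowed: an aggregate cannot appear in WHERE.
-- SELECT city FROM weather WHERE temp_lo = max(temp_lo);

-- Restated with a subquery, which computes its own aggregate.
SELECT city FROM weather
    WHERE temp_lo = (SELECT max(temp_lo) FROM weather);

-- Maximum low temperature observed in each city.
SELECT city, max(temp_lo)
    FROM weather
    GROUP BY city;

-- Keep only the groups whose maximum low temperature is below 40.
SELECT city, max(temp_lo)
    FROM weather
    GROUP BY city
    HAVING max(temp_lo) < 40;

-- Restrict the input rows to cities starting with "S" before grouping.
SELECT city, max(temp_lo)
    FROM weather
    WHERE city LIKE 'S%'
    GROUP BY city
    HAVING max(temp_lo) < 40;
```

WHERE selects input rows before groups and aggregates are computed, whereas HAVING selects group rows afterwards. That is why the LIKE condition belongs in WHERE and the max(temp_lo) condition in HAVING.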

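2.8. Updates

You can update existing rows using the UPDATE command. Suppose you discover the temperature readings are all off by 2 degrees after November 28. You can correct the data as follows: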

UPDATE weather
    SET temp_hi = temp_hi - 2,  temp_lo = temp_lo - 2
    WHERE date > '1994-11-28';

Look at the new state of the data:

SELECT * FROM weather;

     city      | temp_lo | temp_hi | prcp |    date
---------------+---------+---------+------+------------
 San Francisco |      46 |      50 | 0.25 | 1994-11-27
 San Francisco |      41 |      55 |    0 | 1994-11-29
 Hayward       |      35 |      52 |      | 1994-11-29
(3 rows)

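2.9. Deletions

Rows can be removed from a table using the DELETE command. Suppose you are no longer interested in the weather of Hayward. Then you can delete those rows from the table with DELETE FROM weather WHERE city = 'Hayward';

All weather records belonging to Hayward are removed.

One should be wary of statements of the form DELETE FROM tablename;

Without a qualification, DELETE will remove all rows from the given table, leaving it empty. The system will not request confirmation before doing this!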
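A cautious way to run such a deletion is sketched below, assuming the weather table from this chapter. Running the condition as a SELECT first is only a suggested habit, not something the system requires.

```sql
-- Preview which rows the condition matches before deleting anything.
SELECT * FROM weather WHERE city = 'Hayward';

-- Delete exactly those rows.
DELETE FROM weather WHERE city = 'Hayward';

-- Caution: without WHERE, every row in the table is removed.
-- DELETE FROM weather;
```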

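3. Advanced Features

3.1. Introduction

In the previous chapter we covered the basics of using SQL to store and access your data in PostgreSQL. We will now discuss some more advanced features of SQL that simplify management and prevent loss or corruption of your data. Finally, we will look at some PostgreSQL extensions.

This chapter will on occasion refer to examples found in Chapter 2 to change or improve them, so it will be useful to have read that chapter. Some examples from this chapter can also be found in advanced.sql in the tutorial directory. This file also contains some sample data to load, which is not repeated here. (Refer to Section 2.1 for how to use the file.)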

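3.2. Views

Refer back to the queries in Section 2.6. Suppose the combined listing of weather records and city location is of particular interest to your application, but you do not want to type the query each time you need it. You can create a view over the query, which gives a name to the query that you can refer to like an ordinary table.

Making liberal use of views is a key aspect of good SQL database design. Views allow you to encapsulate the details of the structure of your tables, which might change as your application evolves, behind consistent interfaces.

Views can be used in almost any place a real table can be used. Building views upon other views is not uncommon.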
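A view over the weather/cities join could be defined as in this sketch. The view name myview is only an illustrative choice, and the weather and cities tables from Chapter 2 are assumed to exist.

```sql
-- Give the join query a name (myview is a hypothetical name).
CREATE VIEW myview AS
    SELECT name, temp_lo, temp_hi, prcp, date, location
        FROM weather, cities
        WHERE city = name;

-- Query the view like an ordinary table.
SELECT * FROM myview;
```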

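3.3. Foreign Keys

Recall the weather and cities tables from Chapter 2. Consider the following problem: you want to make sure that no one can insert rows in the weather table that do not have a matching entry in the cities table. This is called maintaining the referential integrity of your data. In simplistic database systems this would be implemented (if at all) by first looking at the cities table to check if a matching record exists, and then inserting or rejecting the new weather records. This approach has a number of problems and is very inconvenient, so PostgreSQL can do this for you.

In the new declaration of the tables, cities.name becomes a primary key and weather.city is declared with REFERENCES cities(name).

An attempt to insert an invalid record, such as a weather row for a city that is not in cities, is then rejected with an error.

The behavior of foreign keys can be finely tuned to your application. We will not go beyond this simple example in this tutorial, but refer you to Chapter 5 for more information. Making correct use of foreign keys will definitely improve the quality of your database applications, so you are strongly encouraged to learn about them.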
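The foreign-key version of the two tables can be sketched as follows. The block assumes a database in which weather and cities do not exist yet (drop the Chapter 2 versions first), and the Berkeley row is only an illustrative invalid record.

```sql
CREATE TABLE cities (
    name      varchar(80) PRIMARY KEY,
    location  point
);

CREATE TABLE weather (
    city      varchar(80) REFERENCES cities(name),  -- must match a cities row
    temp_lo   int,
    temp_hi   int,
    prcp      real,
    date      date
);

INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');

-- Accepted: San Francisco exists in cities.
INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');

-- Rejected with a foreign key violation: Berkeley is not in cities.
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
```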

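3.4. Transactions

Transactions are a fundamental concept of all database systems. The essential point of a transaction is that it bundles multiple steps into a single, all-or-nothing operation. The intermediate states between the steps are not visible to other concurrent transactions, and if some failure occurs that prevents the transaction from completing, then none of the steps affect the database at all.

For example, consider a bank database that contains balances for various customer accounts, as well as total deposit balances for branches. Suppose that we want to record a payment of $100.00 from Alice's account to Bob's account. Simplifying outrageously, this takes several UPDATE commands: one to debit Alice's account, one to credit Bob's account, and corresponding updates to the branch totals.

The details of these commands are not important here; the important point is that there are several separate updates involved to accomplish this rather simple operation. Our bank's officers will want to be assured that either all these updates happen, or none of them happen. It would certainly not do for a system failure to result in Bob receiving $100.00 that was not debited from Alice. Nor would Alice long remain a happy customer if she was debited without Bob being credited. We need a guarantee that if something goes wrong partway through the operation, none of the steps executed so far will take effect. Grouping the updates into a transaction gives us this guarantee. A transaction is said to be atomic: from the point of view of other transactions, it either happens completely or not at all.

We also want a guarantee that once a transaction is completed and acknowledged by the database system, it has indeed been permanently recorded and will not be lost even if a crash ensues shortly thereafter. For example, if we are recording a cash withdrawal by Bob, we do not want any chance that the debit to his account will disappear in a crash just after he walks out the bank door. A transactional database guarantees that all the updates made by a transaction are logged in permanent storage (i.e., on disk) before the transaction is reported complete.

Another important property of transactional databases is closely related to the notion of atomic updates: when multiple transactions are running concurrently, each one should not be able to see the incomplete changes made by others. For example, if one transaction is busy totalling all the branch balances, it would not do for it to include the debit from Alice's branch but not the credit to Bob's branch, nor vice versa. So transactions must be all-or-nothing not only in terms of their permanent effect on the database, but also in terms of their visibility as they happen. The updates made so far by an open transaction are invisible to other transactions until the transaction completes, whereupon all the updates become visible simultaneously.

In PostgreSQL, a transaction is set up by surrounding the SQL commands of the transaction with BEGIN and COMMIT commands, so our banking transaction consists of BEGIN, the UPDATE commands, and COMMIT.

If, partway through the transaction, we decide we do not want to commit (perhaps we just noticed that Alice's balance went negative), we can issue the command ROLLBACK instead of COMMIT, and all our updates so far will be canceled.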
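The banking transfer can be sketched as below. The accounts table, its columns, and the starting balances are assumptions made up for this illustration, and the branch totals mentioned in the text are left out to keep it short.

```sql
-- Hypothetical schema and data for the illustration.
CREATE TABLE accounts (
    name     text PRIMARY KEY,
    balance  numeric NOT NULL
);
INSERT INTO accounts VALUES ('Alice', 500.00), ('Bob', 200.00);

-- Both updates take effect together, or not at all.
BEGIN;
UPDATE accounts SET balance = balance - 100.00 WHERE name = 'Alice';
UPDATE accounts SET balance = balance + 100.00 WHERE name = 'Bob';
COMMIT;

-- A transaction that is abandoned: ROLLBACK cancels the update.
BEGIN;
UPDATE accounts SET balance = balance - 1000.00 WHERE name = 'Alice';
ROLLBACK;

-- Alice is at 400.00 and Bob at 300.00; the rolled-back debit left no trace.
SELECT name, balance FROM accounts ORDER BY name;
```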

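3.6. Inheritance

Inheritance is a concept from object-oriented databases. It opens up interesting new possibilities of database design.

Let's create two tables: a table cities and a table capitals. Naturally, capitals are also cities, so you want some way to show the capitals implicitly when you list all cities. If you're really clever you might invent some scheme like this: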

CREATE TABLE capitals (
  name       text,
  population real,
  altitude   int,    -- (in ft)
  state      char(2)
);

CREATE TABLE non_capitals (
  name       text,
  population real,
  altitude   int     -- (in ft)
);

CREATE VIEW cities AS
  SELECT name, population, altitude FROM capitals
    UNION
  SELECT name, population, altitude FROM non_capitals;

This works OK as far as querying goes, but it gets ugly when you need to update several rows, for one thing.

A better solution is this:

CREATE TABLE cities (
  name       text,
  population real,
  altitude   int     -- (in ft)
);

CREATE TABLE capitals (
  state      char(2)
) INHERITS (cities);

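In this case, a row of capitals inherits all columns (name, population, and altitude) from its parent, cities. The type of the column name is text, a native PostgreSQL type for variable length character strings. The capitals table has an additional column, state, which shows which state the capital belongs to. In PostgreSQL, a table can inherit from zero or more other tables.

For example, a query of the form SELECT name, altitude FROM cities WHERE altitude > 500 finds the names of all cities, including state capitals, that are located at an altitude over 500 feet.

On the other hand, writing FROM ONLY cities in the same query finds all the cities that are not state capitals and are situated at an altitude over 500 feet.

Here the ONLY before cities indicates that the query should be run over only the cities table, and not tables below cities in the inheritance hierarchy. Many of the commands that we have already discussed (SELECT, UPDATE, and DELETE) support this ONLY notation.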
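The difference between the two queries can be tried with the following sketch. It assumes the inherited cities and capitals tables defined above; the two sample rows are hypothetical values chosen for illustration.

```sql
-- Hypothetical sample rows: one ordinary city and one state capital.
INSERT INTO cities   VALUES ('Mariposa', 1200, 1953);
INSERT INTO capitals VALUES ('Madison', 250000, 845, 'WI');

-- All cities above 500 ft, including rows stored in capitals.
SELECT name, altitude
    FROM cities
    WHERE altitude > 500;

-- Only rows stored in cities itself, so capitals are excluded.
SELECT name, altitude
    FROM ONLY cities
    WHERE altitude > 500;
```

With these rows, the first query returns both Mariposa and Madison, and the second returns only Mariposa.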

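Note

Although inheritance is frequently useful, it has not been integrated with unique constraints or foreign keys, which limits its usefulness. See the detailed documentation on inheritance for more information.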

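3.7. Conclusion

PostgreSQL has many features not touched upon in this tutorial introduction, which has been oriented toward newer users of SQL. These features are discussed in more detail in the remainder of this book.

If you feel you need more introductory material, please visit the PostgreSQL web site for links to more resources.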

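II. The SQL Language

This part describes the use of the SQL language in PostgreSQL. We start with describing the general syntax of SQL, then explain how to create the structures to hold data, how to populate the database, and how to query it. The middle part lists the available data types and functions for use in SQL commands. The rest treats several aspects that are important for tuning a database.

The information in this part is arranged so that a novice user can follow it start to end to gain a full understanding of the topics without having to refer forward too many times. The chapters are intended to be self-contained, so that advanced users can read the chapters individually as they choose. The information in this part is presented in a narrative fashion in topical units. Readers looking for a complete description of a particular command should see Part VI.

Readers of this part should know how to connect to a PostgreSQL database and issue SQL commands. Readers that are unfamiliar with these issues are encouraged to read Part I first. SQL commands are typically entered using the PostgreSQL interactive terminal psql, but other programs that have similar functionality can be used as well.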

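4. SQL Syntax

This chapter describes the syntax of SQL. It forms the foundation for understanding the following chapters, which go into detail about how SQL commands are applied to define and modify data.

We also advise users who are already familiar with SQL to read this chapter carefully, because it contains several rules and concepts that are implemented inconsistently among SQL databases or that are specific to PostgreSQL.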

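5. Data Definition

This chapter covers how one creates the database structures that will hold one's data. In a relational database, the raw data is stored in tables, so the majority of this chapter is devoted to explaining how tables are created and modified and what features are available to control what data is stored in the tables. Subsequently, we discuss how tables can be organized into schemas, and how privileges can be assigned to tables. Finally, we will briefly look at other features that affect the data storage, such as inheritance, table partitioning, views, functions, and triggers.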

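5.2. Default Values

A column can be assigned a default value. When a new row is created and no values are specified for some of the columns, those columns will be filled with their respective default values. A data manipulation command can also request explicitly that a column be set to its default value, without having to know what that value is. (Details about data manipulation commands are in Chapter 6.)

If no default value is declared explicitly, the default value is the null value. This usually makes sense because a null value can be considered to represent unknown data.

In a table definition, default values are listed after the column data type, for example price numeric DEFAULT 9.99.

The default value can be an expression, which will be evaluated whenever the default value is inserted (not when the table is created). A common example is for a timestamp column to have a default of CURRENT_TIMESTAMP, so that it gets set to the time of row insertion. Another common example is generating a "serial number" for each row. In PostgreSQL this is typically done by a column definition such as product_no integer DEFAULT nextval('products_product_no_seq').

Here the nextval() function supplies successive values from a sequence object. This arrangement is sufficiently common that there is a special shorthand for it: product_no SERIAL.

The SERIAL shorthand is discussed further in the section on serial types.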
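The three ways of declaring defaults described above can be sketched side by side. The table and sequence names below are hypothetical and differ from each other only so that the statements can be run together in one database.

```sql
-- A constant default listed after the column data type.
CREATE TABLE products_const (
    product_no  integer,
    name        text,
    price       numeric DEFAULT 9.99
);

-- An expression default: successive values drawn from a sequence object.
CREATE SEQUENCE products_seq_product_no_seq;
CREATE TABLE products_seq (
    product_no  integer DEFAULT nextval('products_seq_product_no_seq'),
    name        text,
    created_at  timestamp DEFAULT CURRENT_TIMESTAMP
);

-- The SERIAL shorthand for the same sequence arrangement.
CREATE TABLE products_serial (
    product_no  SERIAL,
    name        text
);

-- Columns that are not mentioned receive their defaults.
INSERT INTO products_const  (product_no, name) VALUES (1, 'Cheese');
INSERT INTO products_seq    (name) VALUES ('Cheese');
INSERT INTO products_serial (name) VALUES ('Cheese');
```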

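5.3. Generated Columns

A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables. There are two kinds of generated columns: stored and virtual. A stored generated column is computed when it is written (inserted or updated) and occupies storage as if it were a normal column. A virtual generated column occupies no storage and is computed when it is read. Thus, a virtual generated column is similar to a view and a stored generated column is similar to a materialized view (except that it is always updated automatically). PostgreSQL currently implements only stored generated columns.

To create a generated column, use the GENERATED ALWAYS AS clause in CREATE TABLE.

The keyword STORED must be specified to choose the stored kind of generated column. See CREATE TABLE for more details.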
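A stored generated column declared with the GENERATED ALWAYS AS clause might look like the following sketch. The people table and its columns are illustrative names, not part of the surrounding text.

```sql
-- height_in is always computed from height_cm and stored with the row.
CREATE TABLE people (
    name       text,
    height_cm  numeric,
    height_in  numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);

-- The generated column is left out of the column list...
INSERT INTO people (name, height_cm) VALUES ('Ann', 180);

-- ...or given explicitly as DEFAULT; any other value is an error.
INSERT INTO people (name, height_cm, height_in) VALUES ('Bob', 170, DEFAULT);

-- Updating the base column recomputes the generated column.
UPDATE people SET height_cm = 175 WHERE name = 'Bob';
SELECT name, height_cm, height_in FROM people;
```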

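A generated column cannot be written to directly. In INSERT or UPDATE commands, a value cannot be specified for a generated column, but the keyword DEFAULT may be specified.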

Consider the differences between a column with a default and a generated column. The column default is evaluated once when the row is first inserted if no other value was provided; a generated column is updated whenever the row changes and cannot be overridden. A column default may not refer to other columns of the table; a generation expression would normally do so. A column default can use volatile functions, for example random() or functions referring to the current time; this is not allowed for generated columns.

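5.5. System Columns

Every table has several system columns that are implicitly defined by the system. Therefore, these names cannot be used as names of user-defined columns. (Note that these restrictions are separate from whether the name is a key word or not; quoting a name will not allow you to escape these restrictions.) You do not really need to be concerned about these columns; just know they exist.

oid

The object identifier (object ID) of a row. This column is only present if the table was created using WITH OIDS, or if a configuration parameter enabled it by default. This column is of type oid (same name as the column). See the section on object identifier types for more information about the type. (WITH OIDS and the oid system column of user tables were removed in PostgreSQL 12, so this entry applies only to earlier releases.)

tableoid

The OID of the table containing this row. This column is particularly handy for queries that select from inheritance hierarchies, since without it, it is difficult to tell which individual table a row came from. The tableoid can be joined against the oid column of pg_class to obtain the table name.

xmin

The identity (transaction ID) of the inserting transaction for this row version. (A row version is an individual state of a row; each update of a row creates a new row version for the same logical row.)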

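5.7. Privileges

When an object is created, it is assigned an owner. The owner is normally the role that executed the creation statement. For most kinds of objects, the initial state is that only the owner (or a superuser) can do anything with the object. To allow other roles to use it, privileges must be granted.

There are different kinds of privileges: SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER, CREATE, CONNECT, TEMPORARY, EXECUTE, and USAGE. The privileges applicable to a particular object vary depending on the object's type (table, function, etc.). For complete information on the different types of privileges supported by PostgreSQL, refer to the GRANT reference page. This section shows how the privileges are used.

The right to modify or destroy an object is always the privilege of the owner only.

An object can be assigned to a new owner with an ALTER command of the appropriate kind for the object, for example ALTER TABLE table_name OWNER TO new_owner. Superusers can always do this; ordinary roles can only do it if they are both the current owner of the object (or a member of the owning role) and a member of the new owning role.

To assign privileges, the GRANT command is used. For example, if joe is an existing role and accounts is an existing table, the privilege to update the table can be granted with GRANT UPDATE ON accounts TO joe;

Writing ALL in place of a specific privilege grants all privileges that are relevant for the object type.

The special role name PUBLIC can be used to grant a privilege to every role on the system. Also, "group" roles can be set up to help manage privileges when there are many users of a database; for details see the chapter on database roles.

To revoke a privilege, use the REVOKE command, for example REVOKE ALL ON accounts FROM PUBLIC;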
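The privilege commands from this section are collected in the sketch below. It creates its own accounts table and joe role so that it can run in an empty database; the column layout is a made-up assumption, and creating a role requires the CREATEROLE privilege or superuser rights.

```sql
-- Hypothetical objects for the illustration.
CREATE ROLE joe;
CREATE TABLE accounts (name text, balance numeric);

-- Let joe update rows in accounts.
GRANT UPDATE ON accounts TO joe;

-- Grant every privilege that applies to tables.
GRANT ALL ON accounts TO joe;

-- Take away whatever has been granted to all roles through PUBLIC.
REVOKE ALL ON accounts FROM PUBLIC;

-- Hand the table over to a new owner.
ALTER TABLE accounts OWNER TO joe;
```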

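5.12. Foreign Data

PostgreSQL implements portions of the SQL/MED specification, allowing you to access data that resides outside PostgreSQL using regular SQL queries. Such data is referred to as foreign data. (Note that this usage is not to be confused with foreign keys, which are a type of constraint within the database.)

Foreign data is accessed with help from a foreign data wrapper. A foreign data wrapper is a library that can communicate with an external data source, hiding the details of connecting to the data source and obtaining data from it. There are some foreign data wrappers available as contrib modules; see Appendix F. Other kinds of foreign data wrappers might be found as third party products. If none of the existing foreign data wrappers suit your needs, you can write your own; see Chapter 56.

To access foreign data, you need to create a foreign server object, which defines how to connect to a particular external data source according to the set of options used by its supporting foreign data wrapper. Then you need to create one or more foreign tables, which define the structure of the remote data. A foreign table can be used in queries just like a normal table, but a foreign table has no storage in the PostgreSQL server. Whenever it is used, PostgreSQL asks the foreign data wrapper to fetch data from the external source, or transmit data to the external source in the case of update commands.

Accessing remote data may require authenticating to the external data source. This information can be provided by a user mapping, which lets each PostgreSQL role send its own credentials when it uses a foreign table.

For additional information, see CREATE FOREIGN DATA WRAPPER, CREATE SERVER, CREATE USER MAPPING, CREATE FOREIGN TABLE, and IMPORT FOREIGN SCHEMA.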

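5.13. Other Database Objects

Tables are the central objects in a relational database structure, because they hold your data. But they are not the only objects that exist in a database. Many other kinds of objects can be created to make the use and management of the data more efficient or convenient. They are not discussed in this chapter, but we give you a list here so that you are aware of what is possible:

Views
Functions and operators
Data types and domains
Triggers and rewrite rules

Detailed information on these topics appears in Part V.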

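5.14. Dependency Tracking

When you create complex database structures involving many tables with foreign key constraints, views, triggers, functions, etc. you implicitly create a net of dependencies between the objects. For instance, a table with a foreign key constraint depends on the table it references.

To ensure the integrity of the entire database structure, PostgreSQL makes sure that you cannot drop objects that other objects still depend on. For example, attempting to drop the products table considered earlier in this chapter, with the orders table depending on it, would result in an error message.

The error message contains a useful hint: if you do not want to bother deleting all the dependent objects individually, you can run DROP TABLE products CASCADE;

Then all the dependent objects will be removed, as will any objects that depend on them, recursively. In this case, it does not remove the orders table, it only removes the foreign key constraint, because nothing else depends on that constraint. (If you want to check what DROP ... CASCADE will do, run DROP without CASCADE and read the DETAIL output.)

Almost all DROP commands in PostgreSQL support specifying CASCADE. Of course, the nature of the possible dependencies varies with the type of the object. You can also write RESTRICT instead of CASCADE to get the default behavior, which is to prevent dropping objects that any other objects depend on.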
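The dependency behavior can be reproduced with the following sketch. The column layouts of products and orders are assumptions chosen for illustration; only the foreign key between them matters here.

```sql
-- Hypothetical tables: orders depends on products through a foreign key.
CREATE TABLE products (
    product_no  integer PRIMARY KEY,
    name        text,
    price       numeric
);
CREATE TABLE orders (
    order_id    integer PRIMARY KEY,
    product_no  integer REFERENCES products (product_no),
    quantity    integer
);

-- Fails: the foreign key constraint on orders depends on products.
-- The DETAIL part of the error lists the dependent objects.
DROP TABLE products;

-- Succeeds: drops products and the dependent foreign key constraint,
-- but leaves the orders table itself in place.
DROP TABLE products CASCADE;
```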

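6. Data Manipulation

The previous chapter discussed how to create tables and other structures to hold your data. Now it is time to fill the tables with data. This chapter covers how to insert, update, and delete table data. The chapter after this will finally explain how to extract your long-lost data from the database.

6.1. Inserting Data

When a table is created, it contains no data. The first thing to do before a database can be of much use is to insert data. Data is conceptually inserted one row at a time. Of course you can also insert more than one row, but there is no way to insert less than one row. Even if you know only some column values, a complete row must be created.

To create a new row, use the INSERT command. The command requires the table name and column values. For example, consider the products table from Chapter 5, with the columns product_no (integer), name (text), and price (numeric).

An example command to insert a row would be INSERT INTO products VALUES (1, 'Cheese', 9.99);

The data values are listed in the order in which the columns appear in the table, separated by commas. Usually, the data values will be literals (constants), but scalar expressions are also allowed.

The above syntax has the drawback that you need to know the order of the columns in the table. To avoid this you can also list the columns explicitly. For example, INSERT INTO products (product_no, name, price) VALUES (1, 'Cheese', 9.99); and INSERT INTO products (name, price, product_no) VALUES ('Cheese', 9.99, 1); both have the same effect as the command above.

Many users consider it good practice to always list the column names.

If you don't have values for all the columns, you can omit some of them. In that case, the columns will be filled with their default values. For example, you can write either INSERT INTO products (product_no, name) VALUES (1, 'Cheese'); or INSERT INTO products VALUES (1, 'Cheese');

The second form is a PostgreSQL extension. It fills the columns from the left with as many values as are given, and the rest will be defaulted.

For clarity, you can also request default values explicitly, for individual columns (by writing DEFAULT in place of a value) or for the entire row (with INSERT INTO products DEFAULT VALUES).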
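The INSERT variants described in this section can be sketched together. The products table is created here so the block can run in an empty database; the product numbers differ only to tell the rows apart.

```sql
CREATE TABLE products (
    product_no  integer,
    name        text,
    price       numeric
);

-- Values in table column order.
INSERT INTO products VALUES (1, 'Cheese', 9.99);

-- Columns listed explicitly, in any order.
INSERT INTO products (product_no, name, price) VALUES (2, 'Bread', 1.99);
INSERT INTO products (name, price, product_no) VALUES ('Milk', 2.99, 3);

-- Omitted columns receive their default values (null here).
INSERT INTO products (product_no, name) VALUES (4, 'Butter');
INSERT INTO products VALUES (5, 'Eggs');        -- PostgreSQL extension

-- Defaults requested explicitly, per column or for the whole row.
INSERT INTO products (product_no, name, price) VALUES (6, 'Jam', DEFAULT);
INSERT INTO products DEFAULT VALUES;

-- Several rows in one command.
INSERT INTO products (product_no, name, price) VALUES
    (7, 'Tea', 3.49),
    (8, 'Coffee', 5.99);
```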

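6.2. Updating Data

The modification of data that is already in the database is referred to as updating. You can update individual rows, all the rows in a table, or a subset of all rows. Each column can be updated separately; the other columns are not affected.

To update existing rows, use the UPDATE command. This requires three pieces of information:

The name of the table and column to update
The new value of the column
Which row(s) to update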
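The three pieces of information map onto the parts of an UPDATE command as in this sketch, which assumes a products table with a numeric price column.

```sql
-- Table: products; column and new value: price = 10; rows: those priced at 5.
UPDATE products SET price = 10 WHERE price = 5;

-- The new value can be an expression over the existing row.
-- Without WHERE, every row in the table is updated.
UPDATE products SET price = price * 1.10;
```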

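6.3. Deleting Data

So far we have explained how to add data to tables and how to change data. What remains is to discuss how to remove data that is no longer needed. Just as adding data is only possible in whole rows, you can only remove entire rows from a table. In the previous section we explained that SQL does not provide a way to directly address individual rows. Therefore, removing rows can only be done by specifying conditions that the rows to be removed have to match. If you have a primary key in the table then you can specify the exact row. But you can also remove groups of rows matching a condition, or you can remove all rows in the table at once.

You use the DELETE command to remove rows; the syntax is very similar to the UPDATE command. For instance, to remove all rows from the products table that have a price of 10, use: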

DELETE FROM products WHERE price = 10;

If you simply write:

DELETE FROM products;

then all rows in the table will be deleted! Caveat programmer.

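6.4. Returning Data from Modified Rows

Sometimes it is useful to obtain data from modified rows while they are being manipulated. The INSERT, UPDATE, and DELETE commands all have an optional RETURNING clause that supports this. Use of RETURNING avoids performing an extra database query to collect the data, and is especially valuable when it would otherwise be difficult to identify the modified rows reliably.

The allowed contents of a RETURNING clause are the same as a SELECT command's output list (see Section 7.3). It can contain column names of the command's target table, or value expressions using those columns. A common shorthand is RETURNING *, which selects all columns of the target table in order.

In an INSERT, the data available to RETURNING is the row as it was inserted. This is not so useful in trivial inserts, since it would just repeat the data provided by the client. But it can be very handy when relying on computed default values. For example, when using a serial column to provide unique identifiers, RETURNING can return the ID assigned to a new row: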

CREATE TABLE users (firstname text, lastname text, id serial primary key);

INSERT INTO users (firstname, lastname) VALUES ('Joe', 'Cool') RETURNING id;

The RETURNING clause is also very useful with INSERT ... SELECT.

In an UPDATE, the data available to RETURNING is the new content of the modified row. For example:

UPDATE products SET price = price * 1.10
  WHERE price <= 99.99
  RETURNING name, price AS new_price;

In a DELETE, the data available to RETURNING is the content of the deleted row. For example:

DELETE FROM products
  WHERE obsoletion_date = 'today'
  RETURNING *;

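If there are triggers (Chapter 38) on the target table, the data available to RETURNING is the row as modified by the triggers. Thus, inspecting columns computed by triggers is another common use-case for RETURNING.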

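7. Queries

The previous chapters explained how to create tables, how to fill them with data, and how to manipulate that data. Now we finally discuss how to retrieve the data from the database.

7.1. Overview

The process of retrieving or the command to retrieve data from a database is called a query. In SQL the SELECT command is used to specify queries. The general syntax of the SELECT command is: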

[WITH with_queries] SELECT select_list FROM table_expression [sort_specification]

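The following sections describe the details of the select list, the table expression, and the sort specification. WITH queries are treated last since they are an advanced feature.

A simple kind of query has the form: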

SELECT * FROM table1;

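Assuming that there is a table called table1, this command would retrieve all rows and all user-defined columns from table1. (The method of retrieval depends on the client application. For example, the psql program will display an ASCII-art table on the screen, while client libraries will offer functions to extract individual values from the query result.) The select list specification * means all columns that the table expression happens to provide. A select list can also select a subset of the available columns or make calculations using the columns. For example, if table1 has columns named a, b, and c (and perhaps others) you can make the following query: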

SELECT a, b + c FROM table1;

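(assuming that b and c are of a numerical data type). See Section 7.3 for more details.

FROM table1 is a simple kind of table expression: it reads just one table. In general, table expressions can be complex constructs of base tables, joins, and subqueries. But you can also omit the table expression entirely and use the SELECT command as a calculator: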

SELECT 3 * 4;

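This is more useful if the expressions in the select list return varying results. For example, you could call a function this way: SELECT random();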

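7.3. Select Lists

As shown in the previous section, the table expression in the SELECT command constructs an intermediate virtual table by possibly combining tables, views, eliminating rows, grouping, etc. This table is finally passed on to processing by the select list. The select list determines which columns of the intermediate table are actually output.

7.3.1. Select-List Items

The simplest kind of select list is *, which emits all columns that the table expression produces. Otherwise, a select list is a comma-separated list of value expressions. For instance, it could be a list of column names, as in SELECT a, b, c FROM ...

The column names a, b, and c are either the actual names of the columns of tables referenced in the FROM clause, or the aliases given to them in the FROM clause. The name space available in the select list is the same as in the WHERE clause, unless grouping is used, in which case it is the same as in the HAVING clause.

If more than one table has a column of the same name, the table name must also be given, as in SELECT tbl1.a, tbl2.a, tbl1.b FROM ...

When working with multiple tables, it can also be useful to ask for all the columns of a particular table, as in SELECT tbl1.*, tbl2.a FROM ...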
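The select-list forms above can be tried with the following sketch. The tables tbl1 and tbl2 and their contents are hypothetical and are created only for the illustration.

```sql
-- Hypothetical tables that share a column name (a).
CREATE TABLE tbl1 (a int, b int, c int);
CREATE TABLE tbl2 (a int);
INSERT INTO tbl1 VALUES (1, 2, 3);
INSERT INTO tbl2 VALUES (10);

-- A plain list of column names.
SELECT a, b, c FROM tbl1;

-- Column a exists in both tables, so it must be qualified.
SELECT tbl1.a, tbl2.a, tbl1.b FROM tbl1, tbl2;

-- All columns of one table plus one column of the other.
SELECT tbl1.*, tbl2.a FROM tbl1, tbl2;
```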

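7.4. Combining Queries

The results of two queries can be combined using the set operations union, intersection, and difference. The syntax is query1 UNION [ALL] query2, query1 INTERSECT [ALL] query2, or query1 EXCEPT [ALL] query2.

query1 and query2 are queries that can use any of the features discussed up to this point. Set operations can also be nested and chained. For example, query1 UNION query2 UNION query3 is executed as (query1 UNION query2) UNION query3.

UNION effectively appends the result of query2 to the result of query1 (although there is no guarantee that this is the order in which the rows are actually returned). Furthermore, it eliminates duplicate rows from its result, in the same way as DISTINCT, unless UNION ALL is used.

INTERSECT returns all rows that are both in the result of query1 and in the result of query2. Duplicate rows are eliminated unless INTERSECT ALL is used.

EXCEPT returns all rows that are in the result of query1 but not in the result of query2. (This is sometimes called the difference between two queries.) Again, duplicates are eliminated unless EXCEPT ALL is used.

In order to calculate the union, intersection, or difference of two queries, the two queries must be "union compatible", which means that they return the same number of columns and the corresponding columns have compatible data types, as described in Section 10.5.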
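The set operations can be illustrated without creating any tables. In this sketch two inline VALUES lists stand in for query1 and query2; the aliases q1 and q2 and the numbers are arbitrary choices.

```sql
-- UNION: rows 1, 2, 3, 4 (duplicates removed, order unspecified).
SELECT x FROM (VALUES (1), (2), (2), (3)) AS q1(x)
UNION
SELECT x FROM (VALUES (3), (4)) AS q2(x);

-- UNION ALL: all six input rows, duplicates kept.
SELECT x FROM (VALUES (1), (2), (2), (3)) AS q1(x)
UNION ALL
SELECT x FROM (VALUES (3), (4)) AS q2(x);

-- INTERSECT: only 3 appears in both inputs.
SELECT x FROM (VALUES (1), (2), (2), (3)) AS q1(x)
INTERSECT
SELECT x FROM (VALUES (3), (4)) AS q2(x);

-- EXCEPT: rows 1 and 2 are in the first input but not in the second.
SELECT x FROM (VALUES (1), (2), (2), (3)) AS q1(x)
EXCEPT
SELECT x FROM (VALUES (3), (4)) AS q2(x);
```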

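7.5. Sorting Rows

After a query has produced an output table (after the select list has been processed) it can optionally be sorted. If sorting is not chosen, the rows will be returned in an unspecified order. The actual order in that case will depend on the scan and join plan types and the order on disk, but it must not be relied on. A particular output ordering can only be guaranteed if the sort step is explicitly chosen.

The ORDER BY clause specifies the sort order: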

SELECT select_list
    FROM table_expression
    ORDER BY sort_expression1 [ASC | DESC] [NULLS { FIRST | LAST }]
             [, sort_expression2 [ASC | DESC] [NULLS { FIRST | LAST }] ...]

The sort expression(s) can be any expression that would be valid in the query's select list. An example is:

SELECT a, b FROM table1 ORDER BY a + b, c;

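When more than one expression is specified, the later values are used to sort rows that are equal according to the earlier values. Each expression can be followed by an optional ASC or DESC keyword to set the sort direction to ascending or descending. ASC order is the default. Ascending order puts smaller values first, where "smaller" is defined in terms of the < operator. Similarly, descending order is determined with the > operator.

The NULLS FIRST and NULLS LAST options can be used to determine whether nulls appear before or after non-null values in the sort ordering. By default, null values sort as if larger than any non-null value; that is, NULLS FIRST is the default for DESC order, and NULLS LAST otherwise.

Note that the ordering options are considered independently for each sort column. For example ORDER BY x, y DESC means ORDER BY x ASC, y DESC, which is not the same as ORDER BY x DESC, y DESC.

A sort_expression can also be the column label or number of an output column, as in: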

SELECT a + b AS sum, c FROM table1 ORDER BY sum;
SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;

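both of which sort by the first output column. Note that an output column name has to stand alone, that is, it cannot be used in an expression. For example, this is not correct:

SELECT a + b AS sum, c FROM table1 ORDER BY sum + c;          -- wrong

This restriction is made to reduce ambiguity. There is still ambiguity if an ORDER BY item is a simple name that could match either an output column name or a column from the table expression. The output column is used in such cases. This would only cause confusion if you use AS to rename an output column to match some other table column's name.

ORDER BY can be applied to the result of a UNION, INTERSECT, or EXCEPT combination, but in this case it is only permitted to sort by output column names or numbers, not by expressions.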

7.6. LIMIT and OFFSET

LIMIT and OFFSET allow you to retrieve just a portion of the rows that are generated by the rest of the query:

SELECT select_list
    FROM table_expression
    [ ORDER BY ... ]
    [ LIMIT { number | ALL } ] [ OFFSET number]

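If a limit count is given, no more than that many rows will be returned (but possibly fewer, if the query itself yields fewer rows). LIMIT ALL is the same as omitting the LIMIT clause, as is LIMIT with a NULL argument.

OFFSET says to skip that many rows before beginning to return rows. OFFSET 0 is the same as omitting the OFFSET clause, as is OFFSET with a NULL argument.

If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting to count the LIMIT rows that are returned.

When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows. You might be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? The ordering is unknown, unless you specified ORDER BY.

The query optimizer takes LIMIT into account when generating query plans, so you are very likely to get different plans (yielding different row orders) depending on what you give for LIMIT and OFFSET. Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.

The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient.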
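The interaction of ORDER BY, LIMIT, and OFFSET can be sketched with generated data. generate_series stands in for a real table here, and the alias g(n) is an arbitrary choice.

```sql
-- The first ten rows in a well-defined order: 1 through 10.
SELECT n
    FROM generate_series(1, 100) AS g(n)
    ORDER BY n
    LIMIT 10;

-- Skip ten rows, then return the next ten: 11 through 20.
SELECT n
    FROM generate_series(1, 100) AS g(n)
    ORDER BY n
    LIMIT 10 OFFSET 10;

-- OFFSET alone: everything after the first 95 rows, i.e. 96 through 100.
SELECT n
    FROM generate_series(1, 100) AS g(n)
    ORDER BY n
    OFFSET 95;
```

Because n is unique, ORDER BY n fixes the row order completely, so each page is the same on every run.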

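7.7. VALUES Lists

VALUES provides a way to generate a "constant table" that can be used in a query without having to actually create and populate a table on-disk. The syntax is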

VALUES ( expression [, ...] ) [, ...]

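Each parenthesized list of expressions generates a row in the table. The lists must all have the same number of elements (i.e., the number of columns in the table), and corresponding entries in each list must have compatible data types. The actual data type assigned to each column of the result is determined using the same rules as for UNION (see Section 10.5).

As an example: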

VALUES (1, 'one'), (2, 'two'), (3, 'three');

will return a table of two columns and three rows. It is effectively equivalent to:

SELECT 1 AS column1, 'one' AS column2
UNION ALL
SELECT 2, 'two'
UNION ALL
SELECT 3, 'three';

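By default, PostgreSQL assigns the names column1, column2, etc. to the columns of a VALUES table. The column names are not specified by the SQL standard and different database systems do it differently, so it is usually better to override the default names with a table alias list, as in SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) AS t (num, letter);

Syntactically, VALUES followed by expression lists is treated as equivalent to SELECT select_list FROM table_expression and can appear anywhere a SELECT can. For example, you can use it as part of a UNION, or attach a sort specification (ORDER BY, LIMIT, and/or OFFSET) to it. VALUES is most commonly used as the data source in an INSERT command, and next most commonly as a subquery.

For more information see VALUES.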

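8.2. Monetary Types

The money type stores a currency amount with a fixed fractional precision; see Table 8.3. The fractional precision is determined by the database's lc_monetary setting. The range shown in the table assumes there are two fractional digits. Input is accepted in a variety of formats, including integer and floating-point literals, as well as typical currency formatting, such as '$1,000.00'. Output is generally in the latter form but depends on the locale.

Table 8.3. Monetary Types

Name  | Storage Size | Description     | Range
money | 8 bytes      | currency amount | -92233720368547758.08 to +92233720368547758.07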

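8.6. Boolean Type

PostgreSQL provides the standard SQL type boolean; see Table 8-19. The boolean type can have several states: "true", "false", and a third state, "unknown", which is represented by the SQL null value.

Table 8-19. Boolean Data Type

Name    | Storage Size | Description
boolean | 1 byte       | state of true or false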

8.10. Bit String Types

Bit strings are strings of 1's and 0's. They can be used to store or visualize bit masks. There are two SQL bit types: bit(n) and bit varying(n), where n is a positive integer.

bit type data must match the length n exactly; it is an error to attempt to store shorter or longer bit strings. bit varying data is of variable length up to the maximum length n; longer strings will be rejected. Writing bit

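8.12. UUID Type

The data type uuid stores Universally Unique Identifiers (UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards. (Some systems refer to this data type as a globally unique identifier, or GUID, instead.) This identifier is a 128-bit quantity that is generated by an algorithm chosen to make it very unlikely that the same identifier will be generated by anyone else in the known universe using the same algorithm. Therefore, for distributed systems, these identifiers provide a better uniqueness guarantee than sequence generators, which are only unique within a single database.

A UUID is written as a sequence of lower-case hexadecimal digits, in several groups separated by hyphens, specifically a group of 8 digits followed by three groups of 4 digits followed by a group of 12 digits, for a total of 32 digits representing the 128 bits. An example of a UUID in this standard form is: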

a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11

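PostgreSQL also accepts the following alternative forms for input: use of upper-case digits, the standard format surrounded by braces, omitting some or all hyphens, adding a hyphen after any group of four digits. Examples are: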

A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11
{a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11}
a0eebc999c0b4ef8bb6d6bb9bd380a11
a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}

Output is always in the standard form.

See Section 9.14 for how to generate a UUID in PostgreSQL.
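The input forms and UUID generation can be tried with the sketch below. It assumes PostgreSQL 13 or later, where gen_random_uuid() is built in; the events table is a hypothetical example.

```sql
-- Alternative input forms are normalized to the standard output form.
SELECT 'A0EEBC99-9C0B-4EF8-BB6D-6BB9BD380A11'::uuid;
SELECT '{a0eebc99-9c0b4ef8-bb6d6bb9-bd380a11}'::uuid;
-- Both return a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11

-- Generate a random (version 4) UUID.
SELECT gen_random_uuid();

-- A common use: a uuid primary key filled in by default.
CREATE TABLE events (
    id    uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    note  text
);
INSERT INTO events (note) VALUES ('first event') RETURNING id;
```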

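8.18. Domain Types

A domain is a user-defined data type that is based on another underlying type. Optionally, it can have constraints that restrict its valid values to a subset of what the underlying type would allow. Otherwise it behaves like the underlying type; for example, any operator or function that can be applied to the underlying type will work on the domain type. The underlying type can be any built-in or user-defined base type, enum type, array type, composite type, range type, or another domain.

For example, we could create a domain over integers that accepts only positive integers, name it posint, and use it as the type of the id column of a table mytable.

When an operator or function of the underlying type is applied to a domain value, the domain is automatically down-cast to the underlying type. Thus, for example, the result of mytable.id - 1 is considered to be of type integer not posint. We could write (mytable.id - 1)::posint to cast the result back to posint, causing the domain's constraints to be rechecked. In this case, that would result in an error if the expression had been applied to an id value of 1. Assigning a value of the underlying type to a field or variable of the domain type is allowed without writing an explicit cast, but the domain's constraints will be checked.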
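The posint domain and the mytable table referred to above could be set up as in this sketch, which also exercises the down-cast and re-check behavior just described.

```sql
-- A domain over integer that accepts only positive values.
CREATE DOMAIN posint AS integer CHECK (VALUE > 0);

CREATE TABLE mytable (id posint);

INSERT INTO mytable VALUES (1);    -- works
INSERT INTO mytable VALUES (-1);   -- fails: violates the domain's CHECK

-- id - 1 is computed as a plain integer, so 0 is a fine result here.
SELECT id - 1 FROM mytable;

-- Casting back to posint rechecks the constraint; 0 is rejected,
-- so this fails for the row whose id is 1.
SELECT (id - 1)::posint FROM mytable;
```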

For additional information see CREATE DOMAIN.

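8.20. pg_lsn Type

The pg_lsn data type can be used to store LSN (Log Sequence Number) data, which is a pointer to a location in the WAL. This type is a representation of XLogRecPtr and an internal system type of PostgreSQL.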

Internally, an LSN is a 64-bit integer, representing a byte position in the write-ahead log stream. It is printed as two hexadecimal numbers of up to 8 digits each, separated by a slash; for example, 16/B374D848. The pg_lsn type supports the standard comparison operators, like = and >. Two LSNs can be subtracted using the - operator; the result is the number of bytes separating those write-ahead log locations.

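9. Functions and Operators

PostgreSQL provides a large number of functions and operators for the built-in data types. Users can also define their own functions and operators, as described in Part V. The psql commands \df and \do can be used to list all available functions and operators, respectively.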

The notation used throughout this chapter to describe the argument and result data types of a function or operator is like this:

repeat ( text, integer ) → text

which says that the function repeat takes one text and one integer argument and returns a result of type text. The right arrow is also used to indicate the result of an example, thus:

repeat('Pg', 4) → PgPgPgPg

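If you are concerned about portability then note that most of the functions and operators described in this chapter, with the exception of the most trivial arithmetic and comparison operators and some explicitly marked functions, are not specified by the SQL standard. Some of this extended functionality is present in other SQL database management systems, and in many cases this functionality is compatible and consistent between the various implementations. This chapter is not exhaustive; additional functions appear in relevant sections of the manual.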

9.10. Enum Support Functions

For enum types (described in Section 8.7), there are several functions that allow cleaner programming without hard-coding particular values of an enum type. These are listed in Table 9.32. The examples assume an enum type created as:

CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple');

Table 9.32. Enum Support Functions

Function            | Description                                    | Example                   | Example Result
enum_first(anyenum) | Returns the first value of the input enum type | enum_first(null::rainbow) | red

Notice that except for the two-argument form of enum_range, these functions disregard the specific value passed to them; they care only about its declared data type. Either null or a specific value of the type can be passed, with the same result. It is more common to apply these functions to a table column or function argument than to a hardwired type name as suggested by the examples.
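The enum support functions can be tried as in the sketch below. It assumes the rainbow type from the text has already been created.

```sql
-- Only the declared type of the argument matters, so null works.
SELECT enum_first(null::rainbow);              -- red
SELECT enum_last(null::rainbow);               -- purple

-- All values of the type, in declaration order.
SELECT enum_range(null::rainbow);              -- {red,orange,yellow,green,blue,purple}

-- The two-argument form does use its values: the range between them.
SELECT enum_range('orange'::rainbow, 'green'::rainbow);   -- {orange,yellow,green}
```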

8.16. Composite Types

A composite type represents the structure of a row or record; it is essentially just a list of field names and their data types. PostgreSQL allows composite types to be used in many of the same ways that simple types can be used. For example, a column of a table can be declared to be of a composite type.

8.16.1. Declaration of Composite Types

Here are two simple examples of defining composite types:

The syntax is comparable to CREATE TABLE, except that only field names and types can be specified; no constraints (such as NOT NULL) can presently be included. Note that the AS keyword is essential; without it, the system will think a different kind of CREATE TYPE command is meant, and you will get odd syntax errors.

Having defined the types, we can use them to create tables:

or functions:

Whenever you create a table, a composite type is also automatically created, with the same name as the table, to represent the table's row type. For example, had we said:

then the same inventory_item composite type shown above would come into being as a byproduct, and could be used just as above. Note however an important restriction of the current implementation: since no constraints are associated with a composite type, the constraints shown in the table definition do not apply to values of the composite type outside the table. (To work around this, create a domain over the composite type, and apply the desired constraints as CHECK constraints of the domain.)
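The declarations discussed in this subsection can be sketched as follows. The names complex, inventory_item, on_hand, and price_extension, as well as the sample row, are illustrative choices; only inventory_item and on_hand are named in the surrounding text.

```sql
-- Two composite types: only field names and types, no constraints.
CREATE TYPE complex AS (
    r  double precision,
    i  double precision
);

CREATE TYPE inventory_item AS (
    name         text,
    supplier_id  integer,
    price        numeric
);

-- A table with a composite-type column.
CREATE TABLE on_hand (
    item   inventory_item,
    count  integer
);

INSERT INTO on_hand VALUES (ROW('fuzzy dice', 42, 1.99), 1000);

-- A function taking a composite-type argument.
CREATE FUNCTION price_extension(inventory_item, integer) RETURNS numeric
AS 'SELECT $1.price * $2' LANGUAGE SQL;

SELECT price_extension(item, 10) FROM on_hand;
```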

8.16.2. Constructing Composite Values

To write a composite value as a literal constant, enclose the field values within parentheses and separate them by commas. You can put double quotes around any field value, and must do so if it contains commas or parentheses. (More details appear below.) Thus, the general format of a composite constant is the following:

An example is:

which would be a valid value of the inventory_item type defined above. To make a field be NULL, write no characters at all in its position in the list. For example, this constant specifies a NULL third field:

If you want an empty string rather than NULL, write double quotes:

Here the first field is a non-NULL empty string, the third is NULL.
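The general form and the three constants described above can be sketched as follows. The field values are illustrative, and the casts assume an inventory_item type with the fields (name text, supplier_id integer, price numeric):

```sql
-- General form: '( val1 , val2 , ... )'

SELECT '("fuzzy dice",42,1.99)'::inventory_item;  -- all three fields set
SELECT '("fuzzy dice",42,)'::inventory_item;      -- third field is NULL
SELECT '("",42,)'::inventory_item;                -- empty string first, NULL third
```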

(These constants are actually only a special case of the generic type constants discussed in . The constant is initially treated as a string and passed to the composite-type input conversion routine. An explicit type specification might be necessary to tell which type to convert the constant to.)

The ROW expression syntax can also be used to construct composite values. In most cases this is considerably simpler to use than the string-literal syntax since you don't have to worry about multiple layers of quoting. We already used this method above:

The ROW keyword is actually optional as long as you have more than one field in the expression, so these can be simplified to:
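Both spellings might look like this; the field values are illustrative assumptions:

```sql
-- With the ROW key word:
SELECT ROW('fuzzy dice', 42, 1.99);
SELECT ROW('', 42, NULL);

-- ROW omitted, which is allowed because there is more than one field:
SELECT ('fuzzy dice', 42, 1.99);
SELECT ('', 42, NULL);
```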

The ROW expression syntax is discussed in more detail in .

8.16.3. Accessing Composite Types

To access a field of a composite column, one writes a dot and the field name, much like selecting a field from a table name. In fact, it's so much like selecting from a table name that you often have to use parentheses to keep from confusing the parser. For example, you might try to select some subfields from our on_hand example table with something like:

This will not work since the name item is taken to be a table name, not a column name of on_hand, per SQL syntax rules. You must write it like this:

or if you need to use the table name as well (for instance in a multitable query), like this:

Now the parenthesized object is correctly interpreted as a reference to the item column, and then the subfield can be selected from it.
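The three forms discussed above, sketched under the assumption that on_hand has a column item of type inventory_item with fields name and price (the field names and the 9.99 threshold are illustrative):

```sql
-- Assumes on_hand(item inventory_item, count integer).

-- Fails: "item" is taken to be a table name.
-- SELECT item.name FROM on_hand WHERE item.price > 9.99;

-- Works: the parentheses mark "item" as a column reference.
SELECT (item).name FROM on_hand WHERE (item).price > 9.99;

-- Works with the table name included, e.g. in a multitable query.
SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price > 9.99;
```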

Similar syntactic issues apply whenever you select a field from a composite value. For instance, to select just one field from the result of a function that returns a composite value, you'd need to write something like:

Without the extra parentheses, this will generate a syntax error.

The special field name * means “all fields”, as further explained in .

8.16.4. Modifying Composite Types

Here are some examples of the proper syntax for inserting and updating composite columns. First, inserting or updating a whole column:

The first example omits ROW, the second uses it; we could have done it either way.

We can update an individual subfield of a composite column:

Notice here that we don't need to (and indeed cannot) put parentheses around the column name appearing just after SET, but we do need parentheses when referencing the same column in the expression to the right of the equal sign.

And we can specify subfields as targets for INSERT, too:

Had we not supplied values for all the subfields of the column, the remaining subfields would have been filled with null values.
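A sketch of the statements described in this subsection. The table mytab, its column complex_col, and a complex type with fields r and i are all hypothetical names used only for illustration:

```sql
-- Assumes: CREATE TYPE complex AS (r double precision, i double precision);
--          CREATE TABLE mytab (complex_col complex);   -- hypothetical table

-- Whole-column insert (ROW omitted) and whole-column update (ROW written out).
INSERT INTO mytab (complex_col) VALUES ((1.1, 2.2));
UPDATE mytab SET complex_col = ROW(1.1, 2.2);

-- Update one subfield: no parentheses after SET, parentheses on the right.
UPDATE mytab SET complex_col.r = (complex_col).r + 1;

-- Subfields as INSERT targets; any unlisted subfield would be set to null.
INSERT INTO mytab (complex_col.r, complex_col.i) VALUES (1.1, 2.2);
```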

8.16.5. Using Composite Types in Queries

There are various special syntax rules and behaviors associated with composite types in queries. These rules provide useful shortcuts, but can be confusing if you don't know the logic behind them.

In PostgreSQL, a reference to a table name (or alias) in a query is effectively a reference to the composite value of the table's current row. For example, if we had a table inventory_item as shown above, we could write:

This query produces a single composite-valued column, so we might get output like:

Note however that simple names are matched to column names before table names, so this example works only because there is no column named c in the query's tables.

The ordinary qualified-column-name syntax table_name.column_name can be understood as applying to the composite value of the table's current row. (For efficiency reasons, it's not actually implemented that way.)

When we write

then, according to the SQL standard, we should get the contents of the table expanded into separate columns:

as if the query were
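The queries discussed so far in this subsection can be sketched as follows, assuming a table inventory_item whose columns are name, supplier_id, and price (the column names are assumptions):

```sql
-- One composite-valued column per row:
SELECT c FROM inventory_item c;

-- Expanded into separate columns:
SELECT c.* FROM inventory_item c;

-- The expansion behaves as if the query had been written:
SELECT c.name, c.supplier_id, c.price FROM inventory_item c;
```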

PostgreSQL will apply this expansion behavior to any composite-valued expression, although as shown above, you need to write parentheses around the value that .* is applied to whenever it's not a simple table name. For example, if myfunc() is a function returning a composite type with columns a, b, and c, then these two queries have the same result:

Tip

PostgreSQL handles column expansion by actually transforming the first form into the second. So, in this example, myfunc() would get invoked three times per row with either syntax. If it's an expensive function you may wish to avoid that, which you can do with a query like:

Placing the function in a LATERAL FROM item keeps it from being invoked more than once per row. m.* is still expanded into m.a, m.b, m.c, but now those variables are just references to the output of the FROM item. (The LATERAL keyword is optional here, but we show it to clarify that the function is getting x from some_table.)

The composite_value.* syntax results in column expansion of this kind when it appears at the top level of a SELECT output list, a RETURNING list in INSERT/UPDATE/DELETE, a VALUES clause, or a row constructor. In all other contexts (including when nested inside one of those constructs), attaching .* to a composite value does not change the value, since it means “all columns” and so the same composite value is produced again. For example, if somefunc() accepts a composite-valued argument, these queries are the same:

In both cases, the current row of inventory_item is passed to the function as a single composite-valued argument. Even though .* does nothing in such cases, using it is good style, since it makes clear that a composite value is intended. In particular, the parser will consider c in c.* to refer to a table name or alias, not to a column name, so that there is no ambiguity; whereas without .*, it is not clear whether c means a table name or a column name, and in fact the column-name interpretation will be preferred if there is a column named c.

Another example demonstrating these concepts is that all these queries mean the same thing:

All of these ORDER BY clauses specify the row's composite value, resulting in sorting the rows according to the rules described in . However, if inventory_item contained a column named c, the first case would be different from the others, as it would mean to sort by that column only. Given the column names previously shown, these queries are also equivalent to those above:

(The last case uses a row constructor with the key word ROW omitted.)

Another special syntactical behavior associated with composite values is that we can use functional notation for extracting a field of a composite value. The simple way to explain this is that the notations field(table) and table.field are interchangeable. For example, these queries are equivalent:

Moreover, if we have a function that accepts a single argument of a composite type, we can call it with either notation. These queries are all equivalent:

This equivalence between functional notation and field notation makes it possible to use functions on composite types to implement “computed fields”. An application using the last query above wouldn't need to be directly aware that somefunc isn't a real column of the table.
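A sketch of the two notations. It assumes a table inventory_item with columns supplier_id and price, and a function somefunc that accepts one argument of that row type; the column names and the threshold are illustrative:

```sql
-- Field notation and functional notation are interchangeable:
SELECT c.supplier_id FROM inventory_item c WHERE c.price > 1000;
SELECT supplier_id(c) FROM inventory_item c WHERE price(c) > 1000;

-- Likewise for a function taking a single composite argument:
SELECT somefunc(c) FROM inventory_item c;
SELECT somefunc(c.*) FROM inventory_item c;
SELECT c.somefunc FROM inventory_item c;
```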

Tip

Because of this behavior, it's unwise to give a function that takes a single composite-type argument the same name as any of the fields of that composite type. If there is ambiguity, the field-name interpretation will be chosen if field-name syntax is used, while the function will be chosen if function-call syntax is used. However, PostgreSQL versions before 11 always chose the field-name interpretation, unless the syntax of the call required it to be a function call. One way to force the function interpretation in older versions is to schema-qualify the function name, that is, write schema.func(compositevalue).

8.16.6. Composite Type Input and Output Syntax

The external text representation of a composite value consists of items that are interpreted according to the I/O conversion rules for the individual field types, plus decoration that indicates the composite structure. The decoration consists of parentheses (( and )) around the whole value, plus commas (,) between adjacent items. Whitespace outside the parentheses is ignored, but within the parentheses it is considered part of the field value, and might or might not be significant depending on the input conversion rules for the field data type. For example, in:

the whitespace will be ignored if the field type is integer, but not if it is text.

As shown previously, when writing a composite value you can write double quotes around any individual field value. You must do so if the field value would otherwise confuse the composite-value parser. In particular, fields containing parentheses, commas, double quotes, or backslashes must be double-quoted. To put a double quote or backslash in a quoted composite field value, precede it with a backslash. (Also, a pair of double quotes within a double-quoted field value is taken to represent a double quote character, analogously to the rules for single quotes in SQL literal strings.) Alternatively, you can avoid quoting and use backslash-escaping to protect all data characters that would otherwise be taken as composite syntax.

A completely empty field value (no characters at all between the commas or parentheses) represents a NULL. To write a value that is an empty string rather than NULL, write "".

The composite output routine will put double quotes around field values if they are empty strings or contain parentheses, commas, double quotes, backslashes, or white space. (Doing so for white space is not essential, but aids legibility.) Double quotes and backslashes embedded in field values will be doubled.

Note

Remember that what you write in an SQL command will first be interpreted as a string literal, and then as a composite. This doubles the number of backslashes you need (assuming escape string syntax is used). For example, to insert a text field containing a double quote and a backslash in a composite value, you'd need to write:

The string-literal processor removes one level of backslashes, so that what arrives at the composite-value parser looks like ("\"\\"). In turn, the string fed to the text data type's input routine becomes "\. (If we were working with a data type whose input routine also treated backslashes specially, bytea for example, we might need as many as eight backslashes in the command to get one backslash into the stored composite field.) Dollar quoting (see ) can be used to avoid the need to double backslashes.

Tip

The ROW constructor syntax is usually easier to work with than the composite-literal syntax when writing composite values in SQL commands. In ROW, individual field values are written the same way they would be written when not members of a composite.

8.15. Arrays

PostgreSQL allows columns of a table to be defined as variable-length multidimensional arrays. Arrays of any built-in or user-defined base type, enum type, composite type, range type, or domain can be created.

8.15.1. Declaration of Array Types

To illustrate the use of array types, we create this table:
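Based on the column names and types spelled out in the next paragraph, the table definition would look roughly like this:

```sql
CREATE TABLE sal_emp (
    name            text,
    pay_by_quarter  integer[],   -- one-dimensional: salary by quarter
    schedule        text[][]     -- two-dimensional: weekly schedule
);
```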

As shown, an array data type is named by appending square brackets ([]) to the data type name of the array elements. The above command will create a table named sal_emp with a column of type text (name), a one-dimensional array of type integer (pay_by_quarter), which represents the employee's salary by quarter, and a two-dimensional array of text (schedule), which represents the employee's weekly schedule.

The syntax for CREATE TABLE allows the exact size of arrays to be specified, for example:

However, the current implementation ignores any supplied array size limits, i.e., the behavior is the same as for arrays of unspecified length.

The current implementation does not enforce the declared number of dimensions either. Arrays of a particular element type are all considered to be of the same type, regardless of size or number of dimensions. So, declaring the array size or number of dimensions in CREATE TABLE is simply documentation; it does not affect run-time behavior.

An alternative syntax, which conforms to the SQL standard by using the keyword ARRAY, can be used for one-dimensional arrays. pay_by_quarter could have been defined as:

Or, if no array size is to be specified:

As before, however, PostgreSQL does not enforce the size restriction in any case.

8.15.2. Array Value Input

To write an array value as a literal constant, enclose the element values within curly braces and separate them by commas. (If you know C, this is not unlike the C syntax for initializing structures.) You can put double quotes around any element value, and must do so if it contains commas or curly braces. (More details appear below.) Thus, the general format of an array constant is the following:

where delim is the delimiter character for the type, as recorded in its pg_type entry. Among the standard data types provided in the PostgreSQL distribution, all use a comma (,), except for type box which uses a semicolon (;). Each val is either a constant of the array element type, or a subarray. An example of an array constant is:

This constant is a two-dimensional, 3-by-3 array consisting of three subarrays of integers.
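A sketch of the general form and of the 3-by-3 constant just described. The literal values 1 through 9 are an assumption chosen to match the description:

```sql
-- General form: '{ val1 delim val2 delim ... }'

SELECT '{{1,2,3},{4,5,6},{7,8,9}}'::integer[];

-- Three subarrays of three integers each:
SELECT array_dims('{{1,2,3},{4,5,6},{7,8,9}}'::integer[]);  -- [1:3][1:3]
```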

To set an element of an array constant to NULL, write NULL for the element value. (Any upper- or lower-case variant of NULL will do.) If you want an actual string value “NULL”, you must put double quotes around it.

(These kinds of array constants are actually only a special case of the generic type constants discussed in . The constant is initially treated as a string and passed to the array input conversion routine. An explicit type specification might be necessary.)

Now we can show some INSERT statements:

The result of the previous two inserts looks like this:

Multidimensional arrays must have matching extents for each dimension. A mismatch causes an error, for example:

The ARRAY constructor syntax can also be used:

Notice that the array elements are ordinary SQL constants or expressions; for instance, string literals are single quoted, instead of double quoted as they would be in an array literal. The ARRAY constructor syntax is discussed in more detail in .
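The statements discussed in this subsection might look like the following. They assume the sal_emp table described in Section 8.15.1; the names and values inserted are illustrative:

```sql
-- Assumes: sal_emp(name text, pay_by_quarter integer[], schedule text[][]).

-- Array literal syntax: elements inside the string use double quotes.
INSERT INTO sal_emp
    VALUES ('Bill',
    '{10000, 10000, 10000, 10000}',
    '{{"meeting", "lunch"}, {"training", "presentation"}}');

-- ARRAY constructor syntax: elements are ordinary single-quoted SQL constants.
INSERT INTO sal_emp
    VALUES ('Carol',
    ARRAY[20000, 25000, 25000, 25000],
    ARRAY[['breakfast', 'consulting'], ['meeting', 'lunch']]);

-- Mismatched sub-array extents are rejected with an error:
-- INSERT INTO sal_emp
--     VALUES ('Bill', '{10000}', '{{"meeting", "lunch"}, {"meeting"}}');
```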

8.15.3. Accessing Arrays

Now, we can run some queries on the table. First, we show how to access a single element of an array. This query retrieves the names of the employees whose pay changed in the second quarter:

The array subscript numbers are written within square brackets. By default PostgreSQL uses a one-based numbering convention for arrays, that is, an array of n elements starts with array[1] and ends with array[n].

This query retrieves the third quarter pay of all employees:

We can also access arbitrary rectangular slices of an array, or subarrays. An array slice is denoted by writing lower-bound:upper-bound for one or more array dimensions. For example, this query retrieves the first item on Bill's schedule for the first two days of the week:

If any dimension is written as a slice, i.e., contains a colon, then all dimensions are treated as slices. Any dimension that has only a single number (no colon) is treated as being from 1 to the number specified. For example, [2] is treated as [1:2], as in this example:

To avoid confusion with the non-slice case, it's best to use slice syntax for all dimensions, e.g., [1:2][1:1], not [2][1:1].

It is possible to omit the lower-bound and/or upper-bound of a slice specifier; the missing bound is replaced by the lower or upper limit of the array's subscripts. For example:
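The queries described in this subsection can be sketched as follows, assuming the sal_emp table contains a row for an employee named Bill:

```sql
-- Employees whose pay changed in the second quarter:
SELECT name FROM sal_emp WHERE pay_by_quarter[1] <> pay_by_quarter[2];

-- Third-quarter pay of all employees:
SELECT pay_by_quarter[3] FROM sal_emp;

-- First item on Bill's schedule for the first two days of the week:
SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';

-- [2] is read as [1:2] because the other dimension is written as a slice:
SELECT schedule[1:2][2] FROM sal_emp WHERE name = 'Bill';

-- Omitted bounds default to the array's own subscript limits:
SELECT schedule[:2][2:] FROM sal_emp WHERE name = 'Bill';
SELECT schedule[:][1:1] FROM sal_emp WHERE name = 'Bill';
```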

An array subscript expression will return null if either the array itself or any of the subscript expressions are null. Also, null is returned if a subscript is outside the array bounds (this case does not raise an error). For example, if schedule currently has the dimensions [1:3][1:2] then referencing schedule[3][3] yields NULL. Similarly, an array reference with the wrong number of subscripts yields a null rather than an error.

An array slice expression likewise yields null if the array itself or any of the subscript expressions are null. However, in other cases such as selecting an array slice that is completely outside the current array bounds, a slice expression yields an empty (zero-dimensional) array instead of null. (This does not match non-slice behavior and is done for historical reasons.) If the requested slice partially overlaps the array bounds, then it is silently reduced to just the overlapping region instead of returning null.

The current dimensions of any array value can be retrieved with the array_dims function:

array_dims produces a text result, which is convenient for people to read but perhaps inconvenient for programs. Dimensions can also be retrieved with array_upper and array_lower, which return the upper and lower bound of a specified array dimension, respectively:

array_length will return the length of a specified array dimension:

cardinality returns the total number of elements in an array across all dimensions. It is effectively the number of rows a call to unnest would yield:
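A self-contained sketch of these functions on a 2-by-2 array constant (the element values are arbitrary):

```sql
SELECT array_dims(ARRAY[['a','b'],['c','d']]);       -- [1:2][1:2]
SELECT array_lower(ARRAY[['a','b'],['c','d']], 1);   -- 1
SELECT array_upper(ARRAY[['a','b'],['c','d']], 1);   -- 2
SELECT array_length(ARRAY[['a','b'],['c','d']], 1);  -- 2
SELECT cardinality(ARRAY[['a','b'],['c','d']]);      -- 4
```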

8.15.4. Modifying Arrays

An array value can be replaced completely:

or using the ARRAY expression syntax:

An array can also be updated at a single element:

or updated in a slice:
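A sketch of the four kinds of update mentioned above, assuming sal_emp has rows for employees named Bill and Carol (names and amounts are illustrative):

```sql
-- Replace the whole array, with a literal or with the ARRAY constructor:
UPDATE sal_emp SET pay_by_quarter = '{25000,25000,27000,27000}'
    WHERE name = 'Carol';
UPDATE sal_emp SET pay_by_quarter = ARRAY[25000,25000,27000,27000]
    WHERE name = 'Carol';

-- Update a single element:
UPDATE sal_emp SET pay_by_quarter[4] = 15000
    WHERE name = 'Bill';

-- Update a slice:
UPDATE sal_emp SET pay_by_quarter[1:2] = '{27000,27000}'
    WHERE name = 'Carol';
```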

The slice syntaxes with omitted lower-bound and/or upper-bound can be used too, but only when updating an array value that is not NULL or zero-dimensional (otherwise, there is no existing subscript limit to substitute).

A stored array value can be enlarged by assigning to elements not already present. Any positions between those previously present and the newly assigned elements will be filled with nulls. For example, if array myarray currently has 4 elements, it will have six elements after an update that assigns to myarray[6]; myarray[5] will contain null. Currently, enlargement in this fashion is only allowed for one-dimensional arrays, not multidimensional arrays.

Subscripted assignment allows creation of arrays that do not use one-based subscripts. For example one might assign to myarray[-2:7] to create an array with subscript values from -2 to 7.

New array values can also be constructed using the concatenation operator, ||:

The concatenation operator allows a single element to be pushed onto the beginning or end of a one-dimensional array. It also accepts two N-dimensional arrays, or an N-dimensional and an N+1-dimensional array.

When a single element is pushed onto either the beginning or end of a one-dimensional array, the result is an array with the same lower bound subscript as the array operand. For example:

When two arrays with an equal number of dimensions are concatenated, the result retains the lower bound subscript of the left-hand operand's outer dimension. The result is an array comprising every element of the left-hand operand followed by every element of the right-hand operand. For example:

When an N-dimensional array is pushed onto the beginning or end of an N+1-dimensional array, the result is analogous to the element-array case above. Each N-dimensional sub-array is essentially an element of the N+1-dimensional array's outer dimension. For example:
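The cases described above can be tried with array constants alone; the values are arbitrary:

```sql
SELECT ARRAY[1,2] || ARRAY[3,4];            -- {1,2,3,4}
SELECT ARRAY[5,6] || ARRAY[[1,2],[3,4]];    -- {{5,6},{1,2},{3,4}}

-- Element pushed onto an array: the array operand's lower bound is kept.
SELECT array_dims(1 || '[0:1]={2,3}'::int[]);   -- [0:2]
SELECT array_dims(ARRAY[1,2] || 3);             -- [1:3]

-- Equal dimensions: the left-hand operand's outer lower bound is kept.
SELECT array_dims(ARRAY[1,2] || ARRAY[3,4,5]);                      -- [1:5]
SELECT array_dims(ARRAY[[1,2],[3,4]] || ARRAY[[5,6],[7,8],[9,0]]);  -- [1:5][1:2]

-- N-dimensional array pushed onto an (N+1)-dimensional array.
SELECT array_dims(ARRAY[1,2] || ARRAY[[3,4],[5,6]]);                -- [1:3][1:2]
```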

An array can also be constructed by using the functions array_prepend, array_append, or array_cat. The first two only support one-dimensional arrays, but array_cat supports multidimensional arrays. Some examples:

In simple cases, the concatenation operator discussed above is preferred over direct use of these functions. However, because the concatenation operator is overloaded to serve all three cases, there are situations where use of one of the functions is helpful to avoid ambiguity. For example consider:

In the examples above, the parser sees an integer array on one side of the concatenation operator, and a constant of undetermined type on the other. The heuristic it uses to resolve the constant's type is to assume it's of the same type as the operator's other input — in this case, integer array. So the concatenation operator is presumed to represent array_cat, not array_append. When that's the wrong choice, it could be fixed by casting the constant to the array's element type; but explicit use of array_append might be a preferable solution.

8.15.5. Searching in Arrays

To search for a value in an array, each value must be checked. This can be done manually, if you know the size of the array. For example:

However, this quickly becomes tedious for large arrays, and is not helpful if the size of the array is unknown. An alternative method is described in . The above query could be replaced by:

In addition, you can find rows where the array has all values equal to 10000 with:
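The manual search and its ANY and ALL replacements might be written like this, assuming the sal_emp table with a four-element pay_by_quarter column:

```sql
-- Manual check, workable only when the array size is known:
SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
                            pay_by_quarter[2] = 10000 OR
                            pay_by_quarter[3] = 10000 OR
                            pay_by_quarter[4] = 10000;

-- The same search with ANY:
SELECT * FROM sal_emp WHERE 10000 = ANY (pay_by_quarter);

-- Rows where every element equals 10000:
SELECT * FROM sal_emp WHERE 10000 = ALL (pay_by_quarter);
```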

Alternatively, the generate_subscripts function can be used. For example:

This function is described in .

You can also search an array using the && operator, which checks whether the left operand overlaps with the right operand. For instance:

This and other array operators are further described in . It can be accelerated by an appropriate index, as described in .

You can also search for specific values in an array using the array_position and array_positions functions. The former returns the subscript of the first occurrence of a value in an array; the latter returns an array with the subscripts of all occurrences of the value in the array. For example:
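A sketch of the overlap operator and the two position functions. The first query assumes the sal_emp table; the other two are self-contained:

```sql
-- Does pay_by_quarter share at least one element with the right operand?
SELECT * FROM sal_emp WHERE pay_by_quarter && ARRAY[10000];

-- Subscript of the first occurrence:
SELECT array_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon');  -- 2

-- Subscripts of all occurrences:
SELECT array_positions(ARRAY[1, 4, 3, 1, 3, 4, 2, 1], 1);  -- {1,4,8}
```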

Tip

Arrays are not sets; searching for specific array elements can be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale better for a large number of elements.

8.15.6. Array Input and Output Syntax

The external text representation of an array value consists of items that are interpreted according to the I/O conversion rules for the array's element type, plus decoration that indicates the array structure. The decoration consists of curly braces ({ and }) around the array value plus delimiter characters between adjacent items. The delimiter character is usually a comma (,) but can be something else: it is determined by the typdelim setting for the array's element type. Among the standard data types provided in the PostgreSQL distribution, all use a comma, except for type box, which uses a semicolon (;). In a multidimensional array, each dimension (row, plane, cube, etc.) gets its own level of curly braces, and delimiters must be written between adjacent curly-braced entities of the same level.

The array output routine will put double quotes around element values if they are empty strings, contain curly braces, delimiter characters, double quotes, backslashes, or white space, or match the word NULL. Double quotes and backslashes embedded in element values will be backslash-escaped. For numeric data types it is safe to assume that double quotes will never appear, but for textual data types one should be prepared to cope with either the presence or absence of quotes.

By default, the lower bound index value of an array's dimensions is set to one. To represent arrays with other lower bounds, the array subscript ranges can be specified explicitly before writing the array contents. This decoration consists of square brackets ([]) around each array dimension's lower and upper bounds, with a colon (:) delimiter character in between. The array dimension decoration is followed by an equal sign (=). For example:
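A self-contained sketch of such a decorated constant; the bounds and element values are arbitrary:

```sql
-- Dimension 1 runs 1..1, dimension 2 runs -2..-1, dimension 3 runs 3..5.
SELECT f1[1][-2][3] AS e1, f1[1][-1][5] AS e2
  FROM (SELECT '[1:1][-2:-1][3:5]={{{1,2,3},{4,5,6}}}'::int[] AS f1) AS ss;
-- e1 = 1, e2 = 6
```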

The array output routine will include explicit dimensions in its result only when there are one or more lower bounds different from one.

If the value written for an element is NULL (in any case variant), the element is taken to be NULL. The presence of any quotes or backslashes disables this and allows the literal string value “NULL” to be entered. Also, for backward compatibility with pre-8.2 versions of PostgreSQL, the configuration parameter can be turned off to suppress recognition of NULL as a NULL.

As shown previously, when writing an array value you can use double quotes around any individual array element. You must do so if the element value would otherwise confuse the array-value parser. For example, elements containing curly braces, commas (or the data type's delimiter character), double quotes, backslashes, or leading or trailing whitespace must be double-quoted. Empty strings and strings matching the word NULL must be quoted, too. To put a double quote or backslash in a quoted array element value, precede it with a backslash. Alternatively, you can avoid quotes and use backslash-escaping to protect all data characters that would otherwise be taken as array syntax.

You can add whitespace before a left brace or after a right brace. You can also add whitespace before or after any individual item string. In all of these cases the whitespace will be ignored. However, whitespace within double-quoted elements, or surrounded on both sides by non-whitespace characters of an element, is not ignored.

Tip

The ARRAY constructor syntax (see ) is often easier to work with than the array-literal syntax when writing array values in SQL commands. In ARRAY, individual element values are written the same way they would be written when not members of an array.

4.1. Lexical Structure

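SQL input consists of a sequence of commands. A command is composed of a sequence of tokens, terminated by a semicolon (;). The end of the input stream also terminates a command. Which tokens are valid depends on the syntax of the particular command.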

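A token can be a key word, an identifier, a quoted identifier, a literal (or constant), or a special character symbol. Tokens are normally separated by whitespace (space, tab, newline), but need not be if there is no ambiguity (which is generally only the case if a special character is adjacent to some other token type).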

For example, the following is (syntactically) valid SQL input:
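A sketch of such input, using the MY_TABLE identifier and the SELECT, UPDATE, and INSERT commands named later in this section. The column A and the inserted values are assumptions:

```sql
-- Assumes MY_TABLE has an integer column A followed by a text column.
SELECT * FROM MY_TABLE;
UPDATE MY_TABLE SET A = 5;
INSERT INTO MY_TABLE VALUES (3, 'hi there');
```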

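This is a sequence of three commands, one per line (although this is not required; more than one command can be on a line, and commands can usefully be split across lines).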

Additionally, comments can occur in SQL input. They are not tokens; they are effectively equivalent to whitespace.

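The SQL syntax is not very consistent regarding what tokens identify commands and which are operands or parameters. The first few tokens are generally the command name, so in the above example we would usually speak of a “SELECT”, an “UPDATE”, and an “INSERT” command. But the UPDATE command, for instance, always requires a SET token to appear in a certain position, and this particular variation of INSERT also requires a VALUES in order to be complete. The precise syntax rules for each command are described in Part VI.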

4.1.1. Identifiers and Key Words

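Tokens such as SELECT, UPDATE, or VALUES in the example above are key words, that is, words that have a fixed meaning in the SQL language. The token MY_TABLE is an identifier. Identifiers name tables, columns, or other database objects, depending on the command they are used in; they are therefore sometimes simply called “names”. Key words and identifiers have the same lexical structure, meaning that one cannot know whether a token is an identifier or a key word without knowing the language. A complete list of key words can be found in Appendix C.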

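SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters, Chinese characters included) or an underscore (_). Subsequent characters can be letters, underscores, digits (0-9), or dollar signs ($). Note that dollar signs are not allowed in identifiers according to the letter of the SQL standard, so their use might render applications less portable. The SQL standard will not define a key word that contains digits or starts or ends with an underscore, so identifiers of this form are safe against possible conflict with future extensions of the standard.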

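The system uses no more than NAMEDATALEN-1 bytes of an identifier; longer names can be written in commands, but they will be truncated. By default, NAMEDATALEN is 64, so the maximum identifier length is 63 bytes. If this limit is problematic, it can be raised by changing the NAMEDATALEN constant in src/include/pg_config_manual.h and recompiling.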

Key words and unquoted identifiers are case-insensitive. Therefore:

can equivalently be written as:

A convention often used is to write key words in upper case and names in lower case, e.g.:

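There is a second kind of identifier: the delimited identifier or quoted identifier. It is formed by enclosing an arbitrary sequence of characters in double quotes ("). A delimited identifier is always an identifier, never a key word. So "select" could be used to refer to a column or table named “select”, whereas an unquoted select would be taken as a key word and would therefore provoke a parse error when used where a table or column name is expected. The example can be written with quoted identifiers like this: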
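The spellings discussed above, sketched against the MY_TABLE example (the column A and the value 5 are assumptions):

```sql
-- These three statements are equivalent; only the letter case differs.
UPDATE MY_TABLE SET A = 5;
uPDaTE my_TabLE SeT a = 5;
UPDATE my_table SET a = 5;   -- common convention: key words upper, names lower

-- The same statement written with quoted (delimited) identifiers:
UPDATE "my_table" SET "a" = 5;
```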

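Quoted identifiers can contain any character, except the character with code zero. (To include a double quote, write two double quotes.) This allows constructing table or column names that would otherwise not be possible, such as ones containing spaces or ampersands. The length limitation still applies.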

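A variant of quoted identifiers allows including escaped Unicode characters identified by their code points. This variant starts with U& (upper or lower case U followed by ampersand) immediately before the opening double quote, without any spaces in between, for example U&"foo". (Note that this creates an ambiguity with the operator &. Use spaces around the operator to avoid this problem.) Inside the quotes, Unicode characters can be specified in escaped form by writing a backslash followed by the four-digit hexadecimal code point number, or alternatively a backslash followed by a plus sign followed by a six-digit hexadecimal code point number. For example, the identifier "data" could be written as: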

The following less trivial example writes the Russian word “slon” (elephant) in Cyrillic letters:

If a different escape character than backslash is desired, it can be specified using the UESCAPE clause after the closing double quote, for example:
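The three forms described above can be sketched as column aliases. This assumes a UTF8 server encoding, and the choice of ! as the alternative escape character is arbitrary:

```sql
SELECT 1 AS U&"d\0061t\+000061";               -- column is named data
SELECT 1 AS U&"\0441\043B\043E\043D";          -- column is named слон
SELECT 1 AS U&"d!0061t!+000061" UESCAPE '!';   -- data again, with ! as the escape
```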

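The escape character can be any single character other than a hexadecimal digit, the plus sign, a single quote, a double quote, or a whitespace character. Note that the escape character is written in single quotes, not double quotes.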

To include the escape character in the identifier literally, write it twice.

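The Unicode escape syntax works only when the server encoding is UTF8. When other server encodings are used, only code points in the ASCII range (up to \007F) can be specified. Both the 4-digit and the 6-digit form can be used to specify UTF-16 surrogate pairs to compose characters with code points larger than U+FFFF, although the availability of the 6-digit form technically makes this unnecessary. (Surrogate pairs are not stored directly, but are encoded as UTF-8 before being stored.)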

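Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and from each other. (The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case. Thus, foo should be equivalent to "FOO", not "foo", according to the standard. If you want to write portable applications, you are advised to always quote a particular name or never quote it.)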

4.1.2. Constants

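There are three kinds of implicitly-typed constants in PostgreSQL: strings, bit strings, and numbers. Constants can also be specified with explicit types, which can enable more accurate representation and more efficient handling by the system. These alternatives are discussed in the following subsections.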

4.1.2.1. String Constants

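A string constant in SQL is an arbitrary sequence of characters bounded by single quotes ('), for example 'This is a string'. To include a single-quote character within a string constant, write two adjacent single quotes, e.g., 'Dianne''s horse'. Note that this is not the same as a double-quote character (").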

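Two string constants that are only separated by whitespace with at least one newline are concatenated and effectively treated as if the string had been written as one constant. For example: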

is equivalent to:

whereas:

is not valid syntax. (This slightly bizarre behavior is specified by SQL; PostgreSQL is simply following the standard.)
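A sketch of the three cases; the strings foo and bar are placeholder values:

```sql
-- Separated by a newline: treated as the single constant 'foobar'.
SELECT 'foo'
'bar';

-- Equivalent:
SELECT 'foobar';

-- Not valid: the constants are separated by spaces only, with no newline.
-- SELECT 'foo'      'bar';
```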

4.1.2.2. String Constants with C-Style Escapes

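PostgreSQL also accepts “escape” string constants, which are an extension to the SQL standard. An escape string constant is specified by writing the letter E (upper or lower case) just before the opening single quote, e.g., E'foo'. (When continuing an escape string constant across lines, write E only before the first opening quote.) Within an escape string, a backslash character (\) begins a C-like backslash escape sequence, in which the combination of backslash and following character(s) represents a special byte value, as shown in Table 4.1.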

Table 4.1. Backslash Escape Sequences

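Any other character following a backslash is taken literally. Thus, to include a backslash character, write two backslashes (\\). Also, a single quote can be included in an escape string by writing \', in addition to the normal way of ''.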
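A few escape strings, as a sketch. The \n and \t sequences are the usual C-style newline and tab escapes; the sample text is arbitrary:

```sql
SELECT E'first line\nsecond line';   -- \n is a newline
SELECT E'col1\tcol2';                -- \t is a tab
SELECT E'C:\\temp';                  -- two backslashes yield one
SELECT E'Dianne\'s horse';           -- \' yields a single quote
SELECT E'Dianne''s horse';           -- so does the usual doubled quote
```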

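It is your responsibility that the byte sequences you create, especially when using the octal or hexadecimal escapes, compose valid characters in the server character set encoding. When the server encoding is UTF-8, the Unicode escapes or the alternative Unicode escape syntax explained in Section 4.1.2.3 should be used instead. (The alternative would be doing the UTF-8 encoding by hand and writing out the bytes, which would be very cumbersome.)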

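The Unicode escape syntax works fully only when the server encoding is UTF8. When other server encodings are used, only code points in the ASCII range (up to \u007F) can be specified. Both the 4-digit and the 8-digit form can be used to specify UTF-16 surrogate pairs to compose characters with code points larger than U+FFFF, although the availability of the 8-digit form technically makes this unnecessary. (When surrogate pairs are used and the server encoding is UTF8, they are first combined into a single code point that is then encoded in UTF-8.)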

Caution

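If the configuration parameter that controls standard-conforming string handling is off, PostgreSQL recognizes backslash escapes in both regular and escape string constants. As of PostgreSQL 9.1, however, its default is on, meaning that backslash escapes are recognized only in escape string constants. This behavior is more standards-compliant, but might break applications that rely on the historical behavior, where backslash escapes were always recognized. As a workaround, you can set this parameter to off, but it is better to migrate away from using backslash escapes. If you need to use a backslash escape to represent a special character, write the string constant with an E.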

Besides this parameter, two further configuration parameters also govern the treatment of backslashes in string constants.

The character with the code zero cannot be in a string constant.

4.1.2.3. String Constants with Unicode Escapes

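PostgreSQL also supports another type of escape syntax for strings that allows specifying arbitrary Unicode characters by code point. A Unicode escape string constant starts with U& (upper or lower case letter U followed by ampersand) immediately before the opening quote, without any spaces in between, for example U&'foo'. (Note that this creates an ambiguity with the operator &. Use spaces around the operator to avoid this problem.) Inside the quotes, Unicode characters can be specified in escaped form by writing a backslash followed by the four-digit hexadecimal code point number, or alternatively a backslash followed by a plus sign followed by a six-digit hexadecimal code point number. For example, the string 'data' could be written as: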

The following less trivial example writes the Russian word “slon” (elephant) in Cyrillic letters:

If a different escape character than backslash is desired, it can be specified using the UESCAPE clause after the string, for example:
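The three forms described above, sketched as queries. This assumes a UTF8 server encoding and standard-conforming strings; the ! escape character is an arbitrary choice:

```sql
SELECT U&'d\0061t\+000061';               -- data
SELECT U&'\0441\043B\043E\043D';          -- слон
SELECT U&'d!0061t!+000061' UESCAPE '!';   -- data, with ! as the escape character
```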

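The escape character can be any single character other than a hexadecimal digit, the plus sign, a single quote, a double quote, or a whitespace character.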

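However, the Unicode escape syntax for string constants works only when the relevant configuration parameter is turned on. This is because the syntax could otherwise confuse clients that parse SQL statements, to the point of opening the door to SQL injection and similar security issues. If the parameter is set to off, this syntax is rejected with an error message.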

To include the escape character in the string literally, write it twice.

4.1.2.4. Dollar-Quoted String Constants

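While the standard syntax for specifying string constants is usually convenient, it can be difficult to read when the desired string contains many single quotes or backslashes, since each of those must be doubled. To allow more readable queries in such situations, PostgreSQL provides another way, called “dollar quoting”, to write string constants. A dollar-quoted string constant consists of a dollar sign ($), an optional “tag” of zero or more characters, another dollar sign, an arbitrary sequence of characters that makes up the string content, a dollar sign, the same tag that began this dollar quote, and a dollar sign. For example, here are two different ways to specify the string “Dianne's horse” using dollar quoting: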
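A sketch of both spellings; the tag name SomeTag is an arbitrary choice:

```sql
SELECT $$Dianne's horse$$;                  -- empty tag
SELECT $SomeTag$Dianne's horse$SomeTag$;    -- explicit tag
```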

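Notice that inside the dollar-quoted string, single quotes can be used without needing to be escaped. Indeed, no characters inside a dollar-quoted string are ever escaped: the string content is always written literally. Backslashes are not special, and neither are dollar signs, unless they are part of a sequence matching the opening tag.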

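It is possible to nest dollar-quoted string constants by choosing different tags at each nesting level. This is most commonly used in writing function definitions. For example: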
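A sketch of such nesting inside a complete function definition. The function name has_special_char and its signature are hypothetical; the $function$ and $q$ tags follow the discussion in this subsection:

```sql
-- Outer quoting uses the tag "function", the inner constant uses "q".
CREATE FUNCTION has_special_char(text) RETURNS boolean AS $function$
BEGIN
    RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
END;
$function$ LANGUAGE plpgsql;

SELECT has_special_char(E'tab\there');
```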

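Here, the sequence $q$[\t\r\n\v\\]$q$ represents a dollar-quoted literal string [\t\r\n\v\\], which is recognized when the function body is executed by PostgreSQL. But since the sequence does not match the outer dollar-quoting delimiter $function$, it is just some more characters within the constant so far as the outer string is concerned.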

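The tag, if any, of a dollar-quoted string follows the same rules as an unquoted identifier, except that it cannot contain a dollar sign. Tags are case sensitive, so $tag$String content$tag$ is correct, but $TAG$String content$tag$ is not.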

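A dollar-quoted string that follows a key word or identifier must be separated from it by whitespace; otherwise the dollar-quoting delimiter would be taken as part of the preceding identifier.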

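Dollar quoting is not part of the SQL standard, but it is often a more convenient way to write complicated string literals than the standard-compliant single-quote syntax. It is particularly useful when representing string constants inside other constants, as is often needed in function definitions. With single-quote syntax, each backslash in the above example would have to be written as four backslashes, which would be reduced to two backslashes in parsing the original string constant, and then to one when the inner string constant is re-parsed during function execution.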

4.1.2.5. Bit-String Constants

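Bit-string constants look like regular string constants with a B (upper or lower case) immediately before the opening quote (no intervening whitespace), e.g., B'1001'. The only characters allowed within bit-string constants are 0 and 1.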

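Alternatively, bit-string constants can be specified in hexadecimal notation, using a leading X (upper or lower case), e.g., X'1FF'. This notation is equivalent to a bit-string constant with four binary digits for each hexadecimal digit.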

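Both forms of bit-string constant can be continued across lines in the same way as regular string constants. Dollar quoting cannot be used in a bit-string constant.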

4.1.2.6. Numeric Constants

Numeric constants are accepted in these general forms:

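where digits is one or more decimal digits (0 through 9). At least one digit must be before or after the decimal point, if one is used. At least one digit must follow the exponent marker (e), if one is present. There cannot be any spaces or other characters embedded in the constant. Note that any leading plus or minus sign is not actually considered part of the constant; it is an operator applied to the constant.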

These are some examples of valid numeric constants:

42 3.5 4. .001 5e2 1.925e-3

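A numeric constant that contains neither a decimal point nor an exponent is initially presumed to be type integer if its value fits in type integer (32 bits); otherwise it is presumed to be type bigint if its value fits in type bigint (64 bits); otherwise it is taken to be type numeric. Constants that contain decimal points and/or exponents are always initially presumed to be type numeric.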
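The initial type assignment can be observed with pg_typeof. A small sketch; the column aliases are arbitrary:

```sql
SELECT pg_typeof(42)         AS small_int,   -- integer
       pg_typeof(3000000000) AS large_int,   -- bigint (does not fit in 32 bits)
       pg_typeof(3.5)        AS with_point,  -- numeric
       pg_typeof(5e2)        AS with_exp;    -- numeric
```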

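The initially assigned data type of a numeric constant is just a starting point for the type resolution algorithms. In most cases the constant will be automatically coerced to the most appropriate type depending on context. When necessary, you can force a numeric value to be interpreted as a specific data type by casting it. For example, you can force a numeric value to be treated as type real (float4) by writing: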

These are actually just special cases of the general casting notations discussed next.

4.1.2.7. Constants of Other Types

A constant of an arbitrary type can be entered using any one of the following notations:

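The string constant's text is passed to the input conversion routine for the type called type. The result is a constant of the indicated type. The explicit type cast can be omitted if there is no ambiguity as to the type the constant must be (for example, when it is assigned directly to a table column), in which case it is automatically coerced.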

The string constant can be written using either regular SQL notation or dollar quoting.

It is also possible to specify a type coercion using a function-like syntax:

but not all type names can be used in this way.

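The ::, CAST(), and function-call syntaxes can also be used to specify run-time type conversions of arbitrary expressions. To avoid syntactic ambiguity, the type 'string' syntax can only be used to specify the type of a simple literal constant. Another restriction on the type 'string' syntax is that it does not work for array types; use :: or CAST() to specify the type of an array constant.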
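The notations discussed in this subsection, sketched with the real type and the value 1.23 (both chosen for illustration):

```sql
-- type 'string', 'string'::type, and CAST ( 'string' AS type )
SELECT REAL '1.23';
SELECT '1.23'::REAL;
SELECT CAST('1.23' AS REAL);

-- Function-like syntax, typename ( 'string' ); not every type name works here.
SELECT float8('1.23');

-- Array constants need :: or CAST(), not the type 'string' form.
SELECT '{1,2,3}'::integer[];
```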

4.1.3. 運算子（Operators）

An operator name is a sequence of up to NAMEDATALEN-1 (63 by default) characters from the following list:

+ - * / < > = ~ ! @ # % ^ & | ` ?

還有一些運算子的限制：

「--」和「/*」都不能出現在運算子裡，因為它們表示註解的開始。
多字元的運算子不能以 + 或 - 結尾，除非名稱裡也包含了下列字元：
~ ! @ # % ^ & | ` ?

舉個例子，@- 可以是合法的運算子，但 *- 就不合法。這個限制是讓 PostgreSQL 解譯 SQL 語法時，可以不需要在不同的標記間使用空白分隔。

當使用非 SQL 標準的運算子時，你通常需要在相隣的運算子間使用空白以免混淆。舉例來說，如果你已經定義了一個左側單元運算子 @，你就不能使用 X*@Y，必須寫成 X* @Y，以確保 PostgreSQL 可以識別為兩個運算子，而不是一個。

4.1.4. 特殊字元

有一些字元並不是字母型態，而具有特殊意義，但並非運算子。詳細的說明請參閱相對應的語法說明。本節僅簡要描述這些特殊字元的使用情境。

錢字號（$）其後接著數字的話，用來表示函數宣告或預備指令的參數編號。其他的用法還有識別項的一部份，或是錢字引號常數。
小括號（( )）一般用來強調表示式並且優先運算。還有某些情況用於表示某些 SQL 指令的部份的必要性。
中括號（[ ]）用於組成陣列的各個元素。詳情請參閱有關於陣列的內容。
逗號（,）用於一般語法上的結構需要，來分隔列表中的單元。

4.1.5. 註解（Comments）

註解是以連續兩個破折號開頭，一直到行結尾的字串。例如：

另外，C 語言的註解語法也可以使用：

這樣的註解，以「/*」開頭，一直持續到對應的「*/」出現才結束。這樣區塊式的註解可以巢狀使用，所以你可以一次註解掉一堆包含註解的指令。這點是 SQL 的標準，和 C 語言的使用不太一樣的地方。
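Both comment styles, including a nested block comment, can be sketched as follows:

```sql
-- This is a standard SQL comment running to the end of the line
SELECT 1 /* an inline block comment */ + 1;

/* A block comment
 * spanning several lines,
 * with nesting: /* nested block comment */
 */
SELECT 2;
```

Because block comments nest, the outer comment ends only at the last */, so a region that already contains block comments can be commented out safely.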

註解會在進一步的語法分析前被消去，也可以方便地以空白字元替代。

4.1.6. 運算優先權（Operator Precedence）

Table 4.2 列出在 PostgreSQL 中，運算子的運算優先權及運算次序。大多數的運算子都是相同的運算優先權，並且是左側運算。這些優先權與次序是撰寫在解譯器的程式當中的。

你有時候需要加上括號，當遇到二元運算子與一元運算子一起出現時。舉個例子：

會被解譯為：

因為解譯器並不知道實際的情況，所以它可能會搞錯。「!」是一個後置運算子，並非中置運算子。在這個例子中，要以想要的方式進行運算的話，你必須要改寫為：

這是為了延展性而需要付出的代價。

Table 4.2. Operator Precedence (highest to lowest)

Operator/Element

Associativity

Description

注意，使用與內建運算子同名的自訂運算子，運算優先權的規則也會以原規則適用，如同上面的樣子。舉例來說，如果你定義了一個「+」的運算子，用於自訂的資料型態，那麼它就會和內建的「+」擁有相同的運算優先權，而與你的運算內容無關。

When a schema-qualified operator name is used in the OPERATOR syntax, as for example in:
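A minimal sketch of the OPERATOR syntax follows. The second pair of queries is an added illustration, based on the rule that an OPERATOR construct takes the precedence of "any other operator", which is lower than that of + and -:

```sql
-- Schema-qualified reference to the built-in addition operator
SELECT 3 OPERATOR(pg_catalog.+) 4;      -- 7

-- Built-in precedence: multiplication binds before addition
SELECT 2 * 3 + 4;                       -- 10

-- OPERATOR(...) binds more loosely than +, so this parses as 2 * (3 + 4)
SELECT 2 OPERATOR(pg_catalog.*) 3 + 4;  -- 14
```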

the OPERATOR construct is taken to have the default precedence shown in Table 4.2 for "any other operator". This is true no matter which specific operator appears inside OPERATOR().

注意

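Versions of PostgreSQL before 9.5 used slightly different operator precedence rules. In particular, <= >= and <> used to be treated as generic operators; IS tests used to have higher priority; and NOT BETWEEN and related constructs acted inconsistently, being taken in some cases as having the precedence of NOT rather than BETWEEN. These rules were changed for better compliance with the SQL standard and to reduce confusion from inconsistent treatment of logically equivalent constructs. In most cases, these changes result in no behavioral change, or perhaps in "no such operator" failures that can be resolved by adding parentheses. However, there are corner cases in which a query might change behavior without any parsing error being reported. If you are concerned that these changes have silently broken something, you can test your application with the corresponding configuration parameter turned on and check whether any warnings are logged.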

8.14. JSON 型別

JSON 資料型別用於儲存 RFC 7159 中所規範的 JSON（JavaScript Object Notation）資料。此類資料也可以儲存為 text，但是 JSON 資料型別的優點是可以根據 JSON 規則強制讓每個儲存的值必須是有效的值。對於這些資料型別中儲存的資料，還提供了各種特定於 JSON 的函數和運算子。另請參閱第 9.16 節。

PostgreSQL offers two types for storing JSON data: json and jsonb. To implement efficient query mechanisms for these data types, PostgreSQL also provides the jsonpath data type described in Section 8.14.7.

json 和 jsonb 資料型別接受幾乎相同的內容集合作為輸入。實際主要的差別是效率。json 資料型別儲存與輸入字串完全相同的內容，處理函數必須在每次執行時重新解析；jsonb 資料型別則以分解後的二進位格式儲存，由於增加了轉換成本，因此資料輸入的速度稍慢，但由於後續不需要解析，因此處理速度明顯加快。jsonb 還支援索引處理，這是一個很大的優勢。

因為 json 型別儲存與輸入字串完全相同的內容，所以它將保留標記之間語義上無關的空白以及 JSON 物件中鍵的順序。另外，如果 JSON 內容物件包含相同的鍵不只一次，則所有鍵/值對都會保留。（處理函數會將最後一個值視為可用的值。）相比之下，jsonb 不會保留空白，不會保留物件中鍵的順序，也不會保留物件中重複的鍵。如果在輸入中指定了重複的鍵，則僅保留最後一個值。

通常，大多數應用程序應該將 JSON 資料儲存為 jsonb，除非有非常特殊的需求，例如關於物件中鍵的順序有一些傳統上的假設。

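Because PostgreSQL allows only one character set encoding per database, the JSON types cannot conform rigidly to the JSON specification unless the database encoding is UTF8. Attempts to directly include characters that cannot be represented in the database encoding will fail; conversely, characters that can be represented in the database encoding but not in UTF8 will be allowed.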

RFC 7159 允許 JSON 字串包含 \uXXXX 所表示的 Unicode 轉譯序列。在 json 型別的輸入函數中，無論資料庫編碼如何，都允許 Unicode 轉譯，並且僅檢查語法正確性（即，四個十六進位數字跟在 \u 之後）。但是，jsonb 的輸入函數更嚴格：除非資料庫編碼為 UTF8，否則它不允許非 ASCII 字元（U+007F 以上的字元）使用 Unicode 轉譯。jsonb 型別也拒絕 \u0000（因為無法在 PostgreSQL 的 text 型別中表現），並且堅持認為使用 Unicode surrogate pair 對來指定 Unicode Basic Multilingual Plane 之外的字元都是正確的。有效的 Unicode 轉譯會轉換為等效的 ASCII 或 UTF8 字元進行儲存；這包括將 surrogate pair 折疊為單個字元。

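Many of the JSON processing functions described in Section 9.16 convert Unicode escapes to regular characters, and will therefore throw the same types of errors just described even if their input is of type json rather than jsonb. The fact that the json input function does not make these checks may be considered a historical artifact, although it does allow for simple storage (without processing) of JSON Unicode escapes in a non-UTF8 database encoding. In general, it is best to avoid mixing Unicode escapes in JSON with a non-UTF8 database encoding, if possible.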

將字串 JSON 輸入轉換為 jsonb 時，RFC 7159 描述的原始型別將會有效地對應到內建的 PostgreSQL 型別，如 Table 8.23 所示。因此，對於構成有效 jsonb 資料的內容存在一些較小的附加約束條件，這些約束條件既不適用於 json 型別，也不適用於抽象上 JSON，這對應於基礎資料型別可以表示的內容限制。值得注意的是，jsonb 會拒絕 PostgreSQL 數字資料型別範圍之外的數字，而 json 不會。RFC 7159 允許此類實作定義限制。但是，實際上，在其他實作中更容易出現此類問題，因為通常將 JSON 的數字基本型別表示為 IEEE 754 雙精確度浮點數（RFC 7159 明確預期了這一點且允許）。當使用 JSON 作為此類系統的交換格式時，應考慮與 PostgreSQL 最初儲存的資料相比較，可能會有失去數字精確度的風險。

相反，如下表中所示，JSON 基本型別的輸入格式有一些微小的限制，但並不適用於其相應的 PostgreSQL 資料型別。

Table 8.23. JSON Primitive Types and Corresponding PostgreSQL Types

JSON primitive type

PostgreSQL type

Notes

8.14.1. JSON 輸入與輸出語法

JSON 資料型別的輸入/輸出語法被規範在 RFC 7159 之中。

以下是所有有效的 json（或 jsonb）表示式：
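The listing for this sentence seems to be missing. The following sketch shows expressions of each kind that are valid for both json and jsonb:

```sql
-- Simple scalar/primitive value: a number, quoted string, true, false, or null
SELECT '5'::json;

-- Array of zero or more elements (elements need not be of the same type)
SELECT '[1, 2, "foo", null]'::json;

-- Object containing pairs of keys and values; keys must always be quoted strings
SELECT '{"bar": "baz", "balance": 7.77, "active": false}'::json;

-- Arrays and objects can be nested arbitrarily
SELECT '{"foo": [true, "bar"], "tags": {"a": 1, "b": null}}'::json;
```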

如前所述，當輸入 JSON 內容然後在不進行任何其他處理的情況下進行輸出時，json 輸出與輸入相同的內容，而 jsonb 則不會保留與語義無關的細節，像是空格。例如，請注意此處的差別：

值得注意的一個語義無關的細節是，在 jsonb 中，數字將根據基本數字型別的行為進行輸出。實際上，這意味著使用 E 記號輸入的數字將不會以原輸出形式輸出，例如：

但是，jsonb 將保留小數尾巴的數字零，如在本範例中所示，即使它們在語義上無意義（例如，相等運算），也是如此。
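The differences between json and jsonb output described above can be sketched as follows. The results in the comments follow from the rules stated in this section:

```sql
-- json reproduces the input text exactly, including spacing and key order
SELECT '{"bar": "baz", "balance": 7.77, "active":false}'::json;
-- {"bar": "baz", "balance": 7.77, "active":false}

-- jsonb normalizes whitespace and does not preserve key order
SELECT '{"bar": "baz", "balance": 7.77, "active":false}'::jsonb;
-- {"bar": "baz", "active": false, "balance": 7.77}

-- E notation is not preserved by jsonb, but the trailing zero is
SELECT '{"reading": 1.230e-5}'::json, '{"reading": 1.230e-5}'::jsonb;
-- {"reading": 1.230e-5} | {"reading": 0.00001230}
```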

有關可用於建構和處理 JSON 內容的內建函數和運算子的列表，請參閱。

8.14.2. 設計 JSON 文件結構

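Representing data as JSON can be considerably more flexible than the traditional relational data model, and that flexibility is compelling in environments where requirements are fluid. It is quite possible for both approaches to co-exist and complement each other within the same application. However, even for applications where maximal flexibility is desired, it is still recommended that JSON documents have a somewhat fixed structure. The structure is typically unenforced (though enforcing some business rules declaratively is possible), but a predictable structure makes it easier to write queries that usefully summarize a set of "documents" (datums) in a table.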

JSON 資料儲存在資料表中時，與其他任何資料型別一樣，要遵循相同的一致性控制事項。儘管儲存大型文件是可行的，但請記住，任何更新都會取得整筆資料的 row-level lock。考慮將 JSON 文件限制在可管理的大小以內，以減少更新交易事務之間的鎖定競爭。理想情況下，每個 JSON 文件都應代表一個完整交易單位資料(atomic datum)，業務規則規定不能將該完整交易單位資料進一步細分為可以獨立更新的較小單位資料。

8.14.3. `jsonb` Containment and Existence

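Testing containment is an important capability of jsonb. There is no equivalent set of facilities for the json type. Containment tests whether one jsonb document has another one contained within it. These examples return true except as noted: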
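The examples this sentence refers to appear to be missing. A sketch using the containment operator @>:

```sql
-- Simple scalar/primitive values contain only the identical value
SELECT '"foo"'::jsonb @> '"foo"'::jsonb;

-- The array on the right is contained within the one on the left
SELECT '[1, 2, 3]'::jsonb @> '[1, 3]'::jsonb;

-- Order of array elements is not significant
SELECT '[1, 2, 3]'::jsonb @> '[3, 1]'::jsonb;

-- Duplicate array elements do not matter either
SELECT '[1, 2, 3]'::jsonb @> '[1, 2, 2]'::jsonb;

-- The single-pair object on the right is contained within the object on the left
SELECT '{"product": "PostgreSQL", "version": 9.4, "jsonb": true}'::jsonb
       @> '{"version": 9.4}'::jsonb;

-- The right-hand array is not contained, although a similar array is nested inside
SELECT '[1, 2, [1, 3]]'::jsonb @> '[1, 3]'::jsonb;    -- yields false

-- With an extra layer of nesting, it is contained
SELECT '[1, 2, [1, 3]]'::jsonb @> '[[1, 3]]'::jsonb;

-- Similarly, containment is not reported here
SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"bar": "baz"}'::jsonb;  -- yields false

-- A top-level key with an empty object is contained
SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"foo": {}}'::jsonb;
```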

一般原則是，包含物件必須在結構和資料內容上與包含的物件相吻合，可能是在從包含的物件中丟棄了一些不吻合的陣列元素或物件鍵/值配對之後。但是請記住，進行包含性檢查時，陣列元素的順序並不重要，並且重複陣列元素僅有一個元素會被視為有效。

作為結構必須吻合的一般原則的特殊例外，陣列可以包含單一基本值：

jsonb 還具有一個 existence 運算子，它是包含性的變體：它測試字串（作為 text 值）是否作為物件鍵或陣列元素出現在 jsonb 值的頂層。這些範例回傳 true，除非另有說明：
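A sketch of the existence operator ?, with the results that follow from the top-level-only rule:

```sql
-- String exists as an array element
SELECT '["foo", "bar", "baz"]'::jsonb ? 'bar';

-- String exists as an object key
SELECT '{"foo": "bar"}'::jsonb ? 'foo';

-- Object values are not considered
SELECT '{"foo": "bar"}'::jsonb ? 'bar';            -- yields false

-- As with containment, existence must match at the top level
SELECT '{"foo": {"bar": "baz"}}'::jsonb ? 'bar';   -- yields false

-- A string is considered to exist if it matches a primitive JSON string
SELECT '"foo"'::jsonb ? 'foo';
```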

當涉及許多鍵或元素時，JSON 物件比陣列更適合用於測試是否包含或存在，因為與陣列不同，JSON 物件在內部進行了最佳化以進行搜尋，因此不需要線性搜尋。

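Because JSON containment is nested, an appropriate query can skip explicit selection of sub-objects. As an example, suppose that we have a doc column containing objects at the top level, with most objects containing tags fields that hold arrays of sub-objects. This query finds entries in which sub-objects containing both "term":"paris" and "term":"food" appear, while ignoring any such keys outside the tags array: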

例如，另一個方式可以完成同一件事

但是這種方法靈活性較差，而且效率通常也較低。

另一方面，JSON 存在性運算子不是巢狀的：它只會在 JSON 內容的最上層查詢指定的鍵或陣列元素。

The various containment and existence operators, along with all other JSON operators and functions, are documented in Section 9.16.

8.14.4. `jsonb` Indexing

GIN 索引可用於有效搜尋大量的 jsonb 文件（datums）中出現的鍵或鍵/值配對。有兩種 GIN “operator classes”，提供了不同的效能和靈活性權衡。

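The default GIN operator class for jsonb supports queries with the top-level key-exists operators ?, ?& and ?|, and with the path/value-exists operator @>. (For details of the semantics that these operators implement, see the JSON operators reference.) An example of creating an index with this operator class is: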

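The non-default GIN operator class jsonb_path_ops does not support the key-exists operators; it supports indexing the @> operator and, as noted later in this section, the jsonpath operators @@ and @?. An example of creating an index with this operator class is: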

想像一個資料表的範例，該資料表儲存了從第三方 Web 服務檢索到的 JSON 文件以及已文件化的結構定義。典型的文件是：

我們將這些文件儲存在名為 api 的資料表中，名為 jdoc 的 jsonb 欄位中。如果在此欄位上建立了 GIN 索引，則如下查詢可以使用到該索引：

但是，索引不能用於以下查詢，儘管運算子 ? 是可索引的，但它不會直接套用於索引欄位 jdoc：

儘管如此，透過適當使用表示式索引，上述查詢仍可以使用索引。如果在“tags”鍵中查詢特定項目很常見，則定義這樣的索引可能是值得的：

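Now, the WHERE clause jdoc -> 'tags' ? 'qui' will be recognized as an application of the indexable operator ? to the indexed expression jdoc -> 'tags'. (More information on expression indexes can be found in the chapter on indexes.)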
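The index definitions and queries discussed in this section can be sketched together. The table api and column jdoc come from the text; the index names and the sample key "company" are illustrative assumptions:

```sql
-- Table holding the retrieved documents
CREATE TABLE api (jdoc jsonb);

-- Default operator class (jsonb_ops)
CREATE INDEX idxgin ON api USING GIN (jdoc);

-- Non-default operator class
CREATE INDEX idxginp ON api USING GIN (jdoc jsonb_path_ops);

-- Can use a GIN index on jdoc: @> is applied directly to the indexed column
SELECT jdoc->'guid', jdoc->'name' FROM api
WHERE jdoc @> '{"company": "Magnafone"}';

-- Cannot use a plain index on jdoc: ? is applied to jdoc -> 'tags', not to jdoc
SELECT jdoc->'guid', jdoc->'name' FROM api
WHERE jdoc -> 'tags' ? 'qui';

-- Expression index that makes the previous query indexable
CREATE INDEX idxgintags ON api USING GIN ((jdoc -> 'tags'));
```

Whether the planner actually chooses an index depends on table size and statistics; EXPLAIN shows the chosen plan.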

In addition, GIN indexes support the @@ and @? operators, which perform jsonpath matching.

The GIN index extracts clauses of the following form from a jsonpath: accessors_chain = const. The accessors chain may consist of .key, [*], and [index] accessors. The jsonb_ops operator class additionally supports the .* and .** accessors.

查詢的另一種方法是利用 containment，例如：

jdoc 欄位上的簡單 GIN 索引可以支援此查詢。但是請注意，這樣的索引將在 jdoc 欄位中儲存每個鍵和值的副本，而上一範例的表示式索引僅儲存在 tag 鍵下所找到的資料。儘管簡單索引方法更加靈活（因為它支援對任何鍵的查詢），但目標表示式索引可能比簡單索引更小且搜尋速度更快。

儘管 jsonb_path_ops 運算子類僅支援使用 @>，@@ 和 @? 運算子的查詢，它比預設的運算子類 jsonb_ops 具有明顯的效能優勢。對於相同資料集，jsonb_path_ops 索引通常也比 jsonb_ops 索引小得多，針對搜尋的專用性更好，尤其是當查詢包含頻繁出現在資料中的鍵時。因此，搜尋性質的操作通常比預設運算子類具有更好的效能。

The technical difference between a jsonb_ops and a jsonb_path_ops GIN index is that the former creates independent index items for each key and value in the data, while the latter creates index items only for each value in the data. Basically, each jsonb_path_ops index item is a hash of the value and the key(s) leading to it; for example to index {"foo": {"bar": "baz"}}, a single index item would be created incorporating all three of foo, bar, and baz into the hash value. Thus a containment query looking for this structure would result in an extremely specific index search; but there is no way at all to find out whether foo appears as a key. On the other hand, a jsonb_ops index would create three index items representing foo, bar

A disadvantage of the jsonb_path_ops approach is that it produces no index entries for JSON structures not containing any values, such as {"a": {}}. If a search for documents containing such a structure is requested, it will require a full-index scan, which is quite slow. jsonb_path_ops is therefore ill-suited for applications that often perform such searches.

jsonb also supports btree and hash indexes. These are usually useful only if it's important to check equality of complete JSON documents. The btree ordering for jsonb datums is seldom of great interest, but for completeness it is:

Objects with equal numbers of pairs are compared in the order:

Note that object keys are compared in their storage order; in particular, since shorter keys are stored before longer keys, this can lead to results that might be unintuitive, such as:

Similarly, arrays with equal numbers of elements are compared in the order:

Primitive JSON values are compared using the same comparison rules as for the underlying PostgreSQL data type. Strings are compared using the default database collation.

8.14.5. `jsonb` Subscripting

The jsonb data type supports array-style subscripting expressions to extract and modify elements. Nested values can be indicated by chaining subscripting expressions, following the same rules as the path argument in the jsonb_set function. If a jsonb value is an array, numeric subscripts start at zero, and negative integers count backwards from the last element of the array. Slice expressions are not supported. The result of a subscripting expression is always of the jsonb data type.

UPDATE statements may use subscripting in the SET clause to modify jsonb values. Subscript paths must be traversable for all affected values insofar as they exist. For instance, the path val['a']['b']['c'] can be traversed all the way to c if every val, val['a'], and val['a']['b'] is an object. If any val['a'] or val['a']['b'] is not defined, it will be created as an empty object and filled as necessary. However, if any val itself or one of the intermediary values is defined as a non-object such as a string, number, or jsonb

An example of subscripting syntax:

jsonb assignment via subscripting handles a few edge cases differently from jsonb_set. When a source jsonb value is NULL, assignment via subscripting will proceed as if it was an empty JSON value of the type (object or array) implied by the subscript key:

If an index is specified for an array containing too few elements, NULL elements will be appended until the index is reachable and the value can be set.

A jsonb value will accept assignments to nonexistent subscript paths as long as the last existing element to be traversed is an object or array, as implied by the corresponding subscript (the element indicated by the last subscript in the path is not traversed and may be anything). Nested array and object structures will be created, and in the former case null-padded, as specified by the subscript path until the assigned value can be placed.
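The subscripting rules above, including the NULL-source and array-padding edge cases, can be sketched as follows. The table subscript_demo and its columns are hypothetical names used only for illustration:

```sql
-- Extract values by chained subscripts; results are always of type jsonb
SELECT ('{"a": {"b": {"c": 1}}}'::jsonb)['a']['b']['c'];  -- 1
SELECT ('[1, "2", null]'::jsonb)[1];                      -- "2"

-- Hypothetical table for the assignment cases
CREATE TEMP TABLE subscript_demo (id int, val jsonb);
INSERT INTO subscript_demo VALUES (1, NULL), (2, '[0]'), (3, '{}');

-- NULL source with a text key: treated as an empty object, becomes {"a": 1}
UPDATE subscript_demo SET val['a'] = '1' WHERE id = 1;

-- Array too short for the index: padded with nulls, becomes [0, null, 2]
UPDATE subscript_demo SET val[2] = '2' WHERE id = 2;

-- Missing intermediate objects are created, becomes {"a": {"b": {"c": 1}}}
UPDATE subscript_demo SET val['a']['b']['c'] = '1' WHERE id = 3;

SELECT id, val FROM subscript_demo ORDER BY id;
```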

8.14.6. 對應轉換

可以使用其他延伸功能來實作針對不同程序語言的 jsonb 型別轉換。

PL/Perl 的延伸功能名稱為 jsonb_plperl 和 jsonb_plperlu。如果使用它們，則 jsonb 的值將視情況對應轉換為到 Perl 的 array、hash 和 scalar。

PL/Python 的延伸功能名稱為 jsonb_plpython3u。使用的時候，jsonb 值將適當地對應轉換到 Python 的 dictionary，list 和 scalar。

在這些延伸功能中，jsonb_plperl 是「trusted」，也就是說，它可以由對目前資料庫具有 CREATE 權限的非超級使用者自行安裝。其餘的需要超級使用者權限才能安裝。

8.14.7. jsonpath Type

jsonpath 型別實現了 PostgreSQL 中對 SQL/JSON 路徑語法的支援，以有效地查詢 JSON 資料。它提供以二元運算的形式來使用已解析的 SQL/JSON 路徑表示式，此表示式讓路徑引擎從 JSON 資料檢索的項目取出內容，以供 SQL/JSON 查詢函數進一步處理。

SQL / JSON 路徑 predicate 和運算子的語義基本遵循 SQL 標準。同時，為了提供使用 JSON 資料的更自然的方式，SQL/JSON 路徑語法使用了一些 JavaScript 約定：

點（.）用於資料成員存取。
中括號（[ ]）用於陣列存取。
與從 1 開始的一般 SQL 陣列不同，SQL/JSON 陣列是從 0 開始。

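An SQL/JSON path expression is typically written in an SQL query as an SQL character string literal, so it must be enclosed in single quotes, and any single quotes desired within the value must be doubled. Some forms of path expressions require string literals within them. These embedded string literals follow JavaScript/ECMAScript conventions: they must be surrounded by double quotes, and backslash escapes may be used within them to represent characters that are otherwise hard to type. In particular, the way to write a double quote within an embedded string literal is \", and to write a backslash itself, you must write \\. Other special backslash sequences include those recognized in JSON strings: \b, \f, \n, \r, \t, \v for various ASCII control characters, and \uNNNN for a Unicode character identified by its 4-hex-digit code point. The backslash syntax also includes two cases not allowed by JSON: \xNN for a character code written with only two hex digits, and \u{N...} for a character code written with 1 to 6 hex digits.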

A path expression consists of a sequence of path elements, which can be any of the following:

Path literals of JSON primitive types: Unicode text, numeric, true, false, or null.
Path variables listed in Table 8.24.
Accessor operators listed in Table 8.25.
jsonpath

For details on using jsonpath expressions with SQL/JSON query functions, see .

Table 8.24. `jsonpath` Variables

Variable

Description

Table 8.25. `jsonpath` Accessors

Accessor Operator

Description

For this purpose, the term “value” includes array elements, though JSON terminology sometimes considers array elements distinct from values within objects.

7.2. 資料表表示式

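A table expression computes a table. The table expression contains a FROM clause that is optionally followed by WHERE, GROUP BY, and HAVING clauses. Trivial table expressions simply refer to a table on disk, a so-called base table, but more complex expressions can be used to modify or combine base tables in various ways.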

在資料表表示式中選擇性的WHERE、GROUP BY和HAVING子句指定一個逐次變換執行在FROM子句衍生的資料表上的管道。所有的這些轉換都會產生一個虛擬資料表，該資料表提供了被傳遞到選擇串列的資料列，以計算查詢的輸出資料列。

7.2.1. `FROM`子句

The FROM clause derives a table from one or more other tables given in a comma-separated table reference list.

一個資料表參照能是一個表格名稱（也許綱要限定的），或一個衍生出的資料表，例如子查詢，JOIN建構或這些的複雜組合。如果多個資料表參照被列在FROM子句中，這些資料表參照則表將被交叉聯接（cross-joined，即形成其資料列的笛卡爾積；請參見下文。）FROM串列的結果是一個中間的虛擬表，該表可以受到WHERE、GROUP BY和HAVING子句的轉換，並且最終是整個資料表表示式的結果。

當一個資料表參照命名一個表格繼承層次結構的父級資料表，資料表參照不只是產生該表格的列，還會產生其所有後代表格的列，除非關鍵字ONLY在表格名稱之前。然而，該參照僅產生出現在已命名資料表中的欄位—子資料表中添加的任何欄位都將被忽略。

可以在表格名稱之後寫入*來明確指定包含後代表格，而不是在表格名稱之前寫入ONLY。因為搜索後代表格現在始終是默認行為，沒有真正的理由再使用此語法。但是，支持它是為了與舊版本的兼容性。

7.2.1.1. 聯接的資料表

聯接的資料表（joined table）是一個根據特定聯接型別的規則從兩個（真實的或被衍生的）其他資料表衍生的資料表。可以使用 Inner、outer、及cross-join 。聯接資料表的一般語法是

所有型別的聯接可以鏈結或嵌套在一起： T1 and T2 中的一個或兩個都可以被聯接資料表。可以在JOIN子句周圍使用括號來控制聯接順序。在沒有括號的情況下，JOIN子句從左到右嵌套。

聯接型別

Cross join

對於從 T1 and T2 的列的每種可能的組合(即笛卡爾積), 聯接的資料表將包含一個由 T1 所有欄其次是 T2 所有欄組成的列。如果資料表分別有 N 列及 M 列，聯接表將具有 N * M 列。

FROM T1 CROSS JOIN T2 is equivalent to FROM T1 INNER JOIN T2 ON TRUE (see below). It is also equivalent to FROM T1, T2.

注意

當出現兩個以上的表時，後者的等價關係並不完全成立，因為JOIN的綁定比逗號更緊密。例如，FROM T1 CROSS JOIN T2 INNER JOIN T3 ON

Qualified joins

單詞 INNER 及 OUTER在所有形式中都是可選的。INNER 是默認值； LEFT、RIGHT及 FULL 表示外部聯接。

在 ON or USING子句中指定 join condition ，或由單詞NATURAL隱式指定。聯接條件決定兩個來源資料表中的哪些列被視為“匹配”，如下面詳細的說明。

限定聯接（qualified joins）的可能型別為：

INNER JOIN

For each row R1 of T1, the joined table has a row for each row in T2 that satisfies the join condition with R1.

LEFT OUTER JOIN

首先，執行內部聯接。然後，對於T1中每一列與T2中任何列不滿足聯接條件，聯接列在T2的欄中添加空值。因此，對於T1中的每一列聯接表始終至少具有一列。

RIGHT OUTER JOIN

首先，執行內部聯接。然後，對於T2中每一列與T1中任何列不滿足聯接條件，聯接列在T1的欄中添加空值。這是左聯接的反面：對於T2中的每一列結果表將始終有一列。

FULL OUTER JOIN

首先，執行內部聯接。然後，對於T1中每一列與T2中任何列不滿足聯接條件，聯接列在T2的欄中添加空值。另外，對於T2中每一列與T1中任何列不滿足聯接條件，聯接列在T1的欄中添加空值。

ON子句是最通用種類的聯接條件：它採用與WHERE子句中使用的種類相同的Boolean值表示式。如果 ON表示式評估為真值，來自 T1 和 T2 的一對資料列匹配。

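The USING clause is a shorthand for the specific situation where both sides of the join use the same name for the joining columns. It takes a comma-separated list of the shared column names and forms a join condition that includes an equality comparison for each one. For example, joining T1 and T2 with USING (a, b) produces the join condition ON T1.a = T2.a AND T1.b = T2.b.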

此外，JOIN USING的輸出抑制多餘的欄：無需打印兩個匹配的欄，因為它們必須具有相等的值。儘管JOIN ON會產生 T1 的所有欄其次是 T2 的所有欄，JOIN USING為每個列出的欄配對（按照列出的順序）產生一個輸出欄，其次是 T1 的所有剩餘欄，其次是 T2 的所有剩餘欄。

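Finally, NATURAL is a shorthand form of USING: it forms a USING list consisting of all column names that appear in both input tables. As with USING, these columns appear only once in the output table. If there are no common column names, NATURAL JOIN behaves like JOIN ... ON TRUE, producing a cross-product join.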

注意

USING對於在聯接關係中變更欄位是相當安全的因為只有列出的欄位被合併。NATURAL的風險相當可觀，因為任何綱要（schema）變更為任一導致新的匹配欄位名稱出現的關係，也將會導致聯接合併該新的欄位。

綜合以上所述，假設我們有資料表t1:

和資料表t2:

然後對於各種聯接我們得到以下結果：
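The sample tables and result listings are missing at this point. The sketch below assumes small tables t1(num, name) and t2(num, value); the contents are illustrative assumptions chosen so that some rows match and some do not:

```sql
CREATE TEMP TABLE t1 (num int, name text);
CREATE TEMP TABLE t2 (num int, value text);
INSERT INTO t1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO t2 VALUES (1, 'xxx'), (3, 'yyy'), (5, 'zzz');

-- Cartesian product: 3 * 3 = 9 rows
SELECT * FROM t1 CROSS JOIN t2;

-- Only matching rows (num = 1 and num = 3); both num columns are output
SELECT * FROM t1 INNER JOIN t2 ON t1.num = t2.num;

-- Same rows, but the shared column num appears only once
SELECT * FROM t1 INNER JOIN t2 USING (num);
SELECT * FROM t1 NATURAL INNER JOIN t2;

-- All rows of t1; t2 columns are null for num = 2
SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num;

-- All rows of t2; t1 columns are null for num = 5
SELECT * FROM t1 RIGHT JOIN t2 ON t1.num = t2.num;

-- Rows from both sides: 4 rows in total
SELECT * FROM t1 FULL JOIN t2 ON t1.num = t2.num;
```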

以ON指定的聯接條件還可以包含與聯接不直接相關的條件。對於某些查詢這可以證明是有用的但需要小心地深思熟慮。例如：

請注意，將限制放置在WHERE子句中會產生不同的結果：

這是因為限制放在 ON子句會在聯接之前被處理，而限制放在 WHERE子句會在聯接之後被處理。這與內部聯接無關緊要，但對於外部聯接則很重要。
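The difference between placing a restriction in ON and in WHERE can be sketched with inline data. The table names and values are illustrative assumptions:

```sql
-- Restriction in ON: applied while matching, so unmatched t1 rows survive
WITH t1(num, name)  AS (VALUES (1, 'a'), (2, 'b'), (3, 'c')),
     t2(num, value) AS (VALUES (1, 'xxx'), (3, 'yyy'), (5, 'zzz'))
SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';
-- 3 rows: (1, a, 1, xxx), (2, b, null, null), (3, c, null, null)

-- Restriction in WHERE: applied after the join, so null-extended rows are removed
WITH t1(num, name)  AS (VALUES (1, 'a'), (2, 'b'), (3, 'c')),
     t2(num, value) AS (VALUES (1, 'xxx'), (3, 'yyy'), (5, 'zzz'))
SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num WHERE t2.value = 'xxx';
-- 1 row: (1, a, 1, xxx)
```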

7.2.1.2. 資料表和欄位別名

可以為資料表和復雜資料表參照給定一個臨時名稱來用在其餘查詢中參照衍生的資料表。這稱為 資料表別名（table alias） 。

要創建資料表別名，請編寫

或者是

關鍵字AS是選擇性的。 alias 可以是任何標識符。

資料表別名的典型應用是將短標識符分配給長資料表名稱，以保持連接子句的可讀性。例如：

以當前查詢而言，別名成為表參照的新名稱 —不允許在查詢其他位置中使用原始名稱引用該表。因此，這是無效的：

資料表別名主要是為了表示法的方便，但是在將資料表聯接到自身時必須使用它們，例如：

此外，如果表參照是子查詢，則需要別名（詳見。）

括號被用於解決歧義。在以下範例中，第一條語句將別名b分配給my_table的第二個實例，但是第二條語句將別名分配給聯接結果：

資料表別名的另一種形式為資料表欄位以及資料表本身賦予臨時名稱：

如果指定的欄位別名少於實際表中包含的欄位，則不會重命名剩餘的欄位。此語法對於自聯接或子查詢特別有用。

當別名被應用到JOIN子句的輸出時，別名將原始名稱隱藏在JOIN中。例如：

是有效的SQL，但是：

是無效的；資料表別名a在別名c之外並不可見。

7.2.1.3. 子查詢

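Subqueries specifying a derived table must be enclosed in parentheses and must be assigned a table alias name (as in Section 7.2.1.2). For example: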

這個例子相當於FROM table1 AS alias_name。當子查詢涉及分組或彙總時會出現更有趣的無法簡化為普通聯接的情況。

子查詢也可以是VALUES串列：

同樣，需要資料表別名。為VALUES串列的欄位分配別名是選擇性的，但這是一種好的實踐。有關更多訊息，請參見。

7.2.1.4. 資料表函數

資料表函數是產生一組資料列的函數，這些列由基本資料型別（標量（scalar）型別）或複合數資料型別（資料表列）組成。在查詢的 FROM 子句中，它們像資料表、檢視表或子查詢一樣使用。資料表函數返回的欄位以資料表欄位、檢視表或子查詢相同的方式可以包含在SELECT、JOIN或WHERE子句中。

資料表函數也可以使用ROWS FROM語法進行組合，以並行欄位返回結果；在這種情況下結果列的數量是最大的函數結果，較小的結果將填充空值來匹配。

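If the WITH ORDINALITY clause is specified, an additional column of type bigint is added to the function result columns. This column numbers the rows of the function result set, starting from 1. (This is a generalization of the SQL-standard syntax for UNNEST ... WITH ORDINALITY.) By default, the ordinal column is called ordinality, but a different column name can be assigned to it using an AS clause.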

特別的資料表函數UNNEST也許伴隨著任意數量的陣列參數被調用，並且他返回一個對應數量的欄位，就如同分別對每個參數調用UNNEST（）並使用ROWS FROM建構將其組合在一起。

如果沒有指定 table_alias，該函數名稱被用作資料表名稱；在ROWS FROM建構的情況中使用第一個函數的名稱。

如果沒有提供欄位別名，則對於返回一個基礎資料型別的函數，該欄位名稱也與函數名稱相同。對於返回一個複合資料型別的函數，該結果欄位取得該型別個別屬性的名稱。

舉一些範例：
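As a sketch of the table-function features described above, the following queries use only built-in functions; the aliases are arbitrary:

```sql
-- ROWS FROM: results are returned in parallel columns, shorter ones padded with nulls
SELECT * FROM ROWS FROM (generate_series(1, 3), generate_series(10, 11)) AS t(a, b);
-- (1, 10), (2, 11), (3, null)

-- WITH ORDINALITY: adds a bigint column numbering the result rows from 1
SELECT * FROM unnest(ARRAY['x', 'y', 'z']) WITH ORDINALITY AS u(elem, ord);
-- (x, 1), (y, 2), (z, 3)

-- UNNEST with several arrays behaves like ROWS FROM over separate UNNEST calls
SELECT * FROM unnest(ARRAY[1, 2], ARRAY['a', 'b', 'c']) AS u(n, s);
-- (1, a), (2, b), (null, c)
```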

在一些情況中他對定義能根據它們的調用方式返回不同欄位集合的資料表函數很有用。為了要支持這情況，資料表函數可以被宣告為返回偽型別 record。在查詢中使用此種函數時，在查詢本身中必須指定預期的資料列結構，以便讓系統知道如何解析和規劃查詢。這種語法看起來像是：

沒有使用ROWS FROM()語法時，column_definition 串列替換原本能被附加到FROM項目的欄位別名串列；在欄位定義中的名稱充當欄位別名。當使用ROWS FROM()語法時，column_definition 串列能被分別附加到每個成員函數；或者如果只有一個成員函數且沒有WITH ORDINALITY子句，能編寫_column_definition_ 串列來代替ROWS FROM()之後的欄位別名串列。

考慮以下範例:

（的一部分）執行遠端查詢。它宣告返回record，因為它可以用於任何種類的查詢。實際的欄位集合必須被指定在調用的查詢以便讓解析器知道，舉例來說，*應該擴展成什麼。

7.2.1.5. LATERAL子查詢

出現在FROM中的子查詢的前面可以有關鍵字LATERAL。這允許它們參照前面FROM項目提供的欄位。（沒有LATERAL的話，每一個子查詢被個別評估所以不能交叉參照任何其他FROM項目。）

出現在FROM中的資料表函數的前面也能有關鍵字LATERAL，但對於函數來說該關鍵字是選擇性的；在任何情況下該函數的參數能包含前面FROM項目提供的欄位參照。

LATERAL項目能出現在FROM串列的頂層，或在JOIN樹之中。在後面的情況下在JOIN右邊的LATERAL也能引用在JOIN左邊的任何項目。

當FROM項目包含LATERAL交叉參照，評估過程如下：對於該FROM項目每一個提供交叉參照後欄位的列，或是多個FROM項目之提供欄位的列集合，將使用該欄位的列或列集合值來評估LATERAL項目。結果資料列照常與運算出它們的資料列聯接。對於欄位來源表的每一列或列集合重複此操作。

LATERAL的一個簡單範例是：

這不是特別有用，因為它與完全常規的結果完全相同

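LATERAL is primarily useful when the cross-referenced column is necessary for computing the rows to be joined. A common application is providing an argument value for a set-returning function. For example, supposing that vertices(polygon) returns the set of vertices of a polygon, we could identify close-together vertices of polygons stored in a table with: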

這個查詢也可以寫成

或者以其他幾種等效公式表示。（如前所述，關鍵字LATERAL在此範例中是不必要的，但為了清楚起見而使用它。）

即使LATERAL子查詢沒有產生資料列，通常特別便利將LEFT JOIN添加到LATERAL子查詢，使得來源資料列將出現在結果中。舉例來說，如果get_product_names()返回製造商生產的產品名稱，但是我們表中的某些製造商目前未生產任何產品，我們可以像這樣找出：
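The original listings for this subsection are missing. The self-contained sketch below does not reproduce them; it uses generate_series in place of the tables and functions named in the text to show the same two patterns:

```sql
-- LATERAL subquery referencing a column of the preceding FROM item
SELECT g.n, s.sq
FROM generate_series(1, 3) AS g(n),
     LATERAL (SELECT g.n * g.n AS sq) AS s;
-- (1, 1), (2, 4), (3, 9)

-- LEFT JOIN ... LATERAL keeps source rows even when the subquery returns nothing
SELECT g.n, s.m
FROM generate_series(1, 3) AS g(n)
LEFT JOIN LATERAL (SELECT m FROM generate_series(1, g.n - 1) AS x(m)) AS s ON true;
-- (1, null), (2, 1), (3, 1), (3, 2)
```

Filtering the second query with WHERE s.m IS NULL would list only the source rows for which the lateral subquery produced nothing, which is the pattern the manufacturers example describes.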

7.2.2. `WHERE`子句

The syntax of the WHERE clause is

其中 search_condition 是任何返回型別boolean值的值表示式（參見。）

在完成FROM子句的處理之後，針對搜尋條件檢查衍生虛擬表的每一列。如果條件的結果為true，則資料列保留在輸出表中，否則（即結果為false或null）被丟棄。搜尋條件通常參照在FROM子句中生成的表中的至少一欄；這不是必須的，但反之WHERE 子句是相當毫無用處的。

注意

內部聯接的聯接條件可以寫入在 WHERE子句中或JOIN 子句中。例如，這些資料表表示式等同於：

以及：

或也甚至：

使用其中哪一個主要是風格問題。FROM

以下是WHERE子句的一些範例：

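fdt is the table derived in the FROM clause. Rows that do not meet the search condition of the WHERE clause are eliminated from fdt. Notice the use of scalar subqueries as value expressions. Just like any other query, the subqueries can employ complex table expressions. Notice also how fdt is referenced in the subqueries. Qualifying c1 as fdt.c1 is only necessary if c1 is also the name of a column in the derived input table of the subquery, but qualifying the column name adds clarity even when it is not needed. This example shows how the column naming scope of an outer query extends into its inner queries.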

7.2.3. `GROUP BY`及 `HAVING`子句

在經過WHERE篩選器後，衍生的輸入表可能會遭受到使用GROUP BY 子句進行分組，而使用HAVING子句進行群組資料列的排除。

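The GROUP BY clause is used to group together those rows in a table that have the same values in all the columns listed. The order in which the columns are listed does not matter. The effect is to combine each set of rows having common values into one group row that represents all rows in the group. This is done to eliminate redundancy in the output and/or to compute aggregates that apply to these groups. For instance: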

在第二個查詢中，我們不能寫成 SELECT * FROM test1 GROUP BY x，因為對於可能與每個群組相關聯的欄位y來說沒有單一值。可以在選擇串列中參照被分組的列，因為它們在每個群組中具有單一值。

通常來說，如果將資料表被分組，則除了彙總表示式之外不能參照沒有在GROUP BY中條列出的欄位。彙總表示式的範例是：

在這裡sum是一個在整個群組之上運算一個單一值的彙總函數。有關彙總函數的更多訊息，請參見。
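The grouping examples referred to above appear to be missing. A sketch using the table name test1 and columns x and y from the text, with sample rows that are assumptions:

```sql
CREATE TEMP TABLE test1 (x text, y int);
INSERT INTO test1 VALUES ('a', 3), ('c', 2), ('b', 5), ('a', 1);

-- One output row per distinct value of x (a, b, c in unspecified order)
SELECT x FROM test1 GROUP BY x;

-- Aggregate computed per group: a -> 4, b -> 5, c -> 2
SELECT x, sum(y) FROM test1 GROUP BY x;

-- Invalid: y is neither grouped nor aggregated, so this raises an error
-- SELECT * FROM test1 GROUP BY x;
```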

Tip

沒有彙總表示式的分組有效地運算一個欄位中的相異值集合。這也可以使用DISTINCT 子句來實現（詳見。）

這是另一個範例，它計算每個產品的總銷售額（而不是所有產品的總銷售）：

在這個範例，欄位product_id、p.name、及p.price必須在GROUP BY子句中是由於它們在查詢選擇串列中被參照（但詳見下文。）欄位s.units沒有需要在GROUP BY串列是由於它只能使用在彙總表示式（sum(...)），其代表一個產品的銷售。對於每個產品，查詢返回關於該產品所有銷售的摘要資料列。

如果產品資料被設置為product_id是主鍵（primary key），然後在上方的範例中它足以經由被product_id 分組，是由於名稱與價格將是在功能上依賴於產品ID，所以對與每個產品ID群組要返回哪些名稱和價格值都沒有模棱兩可。

在嚴格的SQL中， GROUP BY只能經由來源資料表的欄位進行分組但PostgreSQL擴展允許GROUP BY經由選擇串列中的欄位進行分組。允許經由值表示式來取代簡單的欄位名稱進行分組。

如果資料表已經被GROUP BY分組，但只有對某些群組感興趣，能使用HAVING子句，類似WHERE子句，從結果來排除群組。語法如下：

在HAVING子句中的表示式能引用已分組表示式及未分組表示式兩者（其必然涉及彙總函數。）

舉例：

再來一個更真實的範例：

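In the example above, the WHERE clause selects rows by a column that is not grouped (the expression is only true for sales during the last four weeks), while the HAVING clause restricts the output to groups with total gross sales over 5000. Note that the aggregate expressions do not necessarily need to be the same in all parts of the query.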

如果查詢包含彙總函數調用但沒有 GROUP BY子句，分組仍然會發生：結果是單個群組資料列（或者可能沒有資料列，如果經由HAVING排除該單一資料列。）即使沒有任何彙總函數調用或 GROUP BY子句，如果包含HAVING子句則同樣會發生。

7.2.4. `GROUPING SETS`、`CUBE`及 `ROLLUP`

更多比上方描述較複雜的分組操作可以使用 分組集合（grouping sets） 的概念。經由FROM及WHERE子句選擇的資料被每一個特定的分組集合分別地分組，對於每一個群組運算的彙總就如同簡單的GROUP BY子句，而後返回其結果。舉例來說：

每一個GROUPING SETS的子串列可以指定零個或多個欄位或表示式並且以它直接在GROUP BY子句中相同的方式來解釋。一個空的分組集合意味著所有資料列被彙總到單一的群組（即使沒有輸入資料列被呈現也會輸出），如同上方所述對於沒有GROUP BY子句的彙總函數之情況。

分組欄位或表示式的參照對於未出現在這些欄位中的分組集合來說會在結果列中由null值替換。要區分源自哪邊的分組特定輸出列，詳見。
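Grouping sets, including the empty set and the null substitution described above, can be sketched as follows. The table items_sold and its rows are illustrative assumptions:

```sql
CREATE TEMP TABLE items_sold (brand text, size text, sales int);
INSERT INTO items_sold VALUES
    ('Foo', 'L', 10), ('Foo', 'M', 20), ('Bar', 'M', 15), ('Bar', 'L', 5);

-- Three groupings in one pass: per brand, per size, and the grand total
SELECT brand, size, sum(sales)
FROM items_sold
GROUP BY GROUPING SETS ((brand), (size), ());
-- (Foo, null, 30), (Bar, null, 20), (null, L, 15), (null, M, 35), (null, null, 50)
-- Row order is not guaranteed without ORDER BY.

-- ROLLUP (brand, size) is shorthand for
-- GROUPING SETS ((brand, size), (brand), ())
SELECT brand, size, sum(sales)
FROM items_sold
GROUP BY ROLLUP (brand, size);
```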

為了指定兩個分組集合的常見型別提供了一個簡寫表示法。該形式的子句為

代表了給定的表達式串列和該串列的所有前綴，包括空串列；因此它相當於

這通常用於分析階層式資料：例如，部門，分部和公司的總薪資。

另一形式的子句為

表示給定的串列和所有可能的子集合（即power set。）因此

相當於

CUBE或ROLLUP 子句各自的元素也許是各自的表示式，或元素在括號中的子串列。在後一種情況下，為了生成各自的分組集合的意圖，該子串列被視為單個單元。例如：

相當於

以及

相當於

CUBE或ROLLUP 建構能被直接用在GROUP BY子句中，或被嵌套在GROUPING SETS子句內。如果GROUPING SETS子句被嵌套在另一個內，效果與內部子句內的所有元素被直接寫入外部子句中時相同。

如果多個的分組項目被指定在單一GROUP BY子句，分組集合的最終串列會是各自項目的外積。例如：

相當於

注意

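The construct (a, b) is normally recognized in expressions as a row constructor. Within the GROUP BY clause, this does not apply at the top levels of expressions, and (a, b) is parsed as a list of expressions as described above. If for some reason you need a row constructor in a grouping expression, use ROW(a, b).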

7.2.5. 窗函數處理

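If the query contains any window functions, these functions are evaluated after any grouping, aggregation, and HAVING filtering is performed. That is, if the query uses any aggregates, GROUP BY, or HAVING, then the rows seen by the window functions are the group rows instead of the original table rows from FROM/WHERE.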

當使用多個窗函數，擁有在語法上等效於PARTITION BY及ORDER BY子句的所有窗函數在窗口定義中是被保證在資料上的單次傳遞中被評估。因此它們將看到相同的排序次序，即使ORDER BY沒有唯一決定次序。然而不保證具有不同於PARTITION BY或ORDER BY規範的函數之評估。（在這種情況下窗函數評估的傳遞之間通常需要排序步驟，並且不保證該排序會維持它的ORDER BY視為等效的資料列之次序。）

目前，窗函數總是必須要預先排序的資料，因此會依照一個或其他窗函數的PARTITION BY/ORDER BY子句整理查詢輸出。然而，不建議依賴這一點。使用顯式頂層ORDER BY子句如果要確保結果以特定方式排序。

9.9 日期時間函式及運算子

Table 9.31 shows the available functions for date/time value processing, with details appearing in the following subsections. Table 9.30 illustrates the behaviors of the basic arithmetic operators (+, *, etc.). For formatting functions, refer to Section 9.8. You should be familiar with the background information on date/time data types from Section 8.5.

All the functions and operators described below that take time or timestamp inputs actually come in two variants: one that takes time with time zone or timestamp with time zone, and one that takes time without time zone or timestamp without time zone. For brevity, these variants are not shown separately. Also, the + and * operators come in commutative pairs (for example both date + integer and integer + date); we show only one of each such pair.

Table 9.30. Date/Time Operators

Operator

Example

Result

Table 9.31. Date/Time Functions

Function

Return Type

Description

Example

Result

In addition to these functions, the SQL OVERLAPS operator is supported:

This expression yields true when two time periods (defined by their endpoints) overlap, false when they do not overlap. The endpoints can be specified as pairs of dates, times, or time stamps; or as a date, time, or time stamp followed by an interval. When a pair of values is provided, either the start or the end can be written first; OVERLAPS automatically takes the earlier value of the pair as the start. Each time period is considered to represent the half-open interval start <= time < end, unless start and end are equal in which case it represents that single time instant. This means for instance that two time periods with only an endpoint in common do not overlap.
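The half-open interval rule can be checked with a few queries. This is a sketch; the dates are arbitrary:

```sql
-- Two periods given by endpoint pairs that overlap
SELECT (DATE '2001-02-16', DATE '2001-12-21') OVERLAPS
       (DATE '2001-10-30', DATE '2002-10-30');   -- true

-- Start plus interval: 100 days from 2001-02-16 ends before 2001-10-30
SELECT (DATE '2001-02-16', INTERVAL '100 days') OVERLAPS
       (DATE '2001-10-30', DATE '2002-10-30');   -- false

-- Periods sharing only an endpoint do not overlap
SELECT (DATE '2001-10-29', DATE '2001-10-30') OVERLAPS
       (DATE '2001-10-30', DATE '2001-10-31');   -- false

-- A zero-length period is treated as a single instant
SELECT (DATE '2001-10-30', DATE '2001-10-30') OVERLAPS
       (DATE '2001-10-30', DATE '2001-10-31');   -- true
```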

When adding an interval value to (or subtracting an interval value from) a timestamp with time zone value, the days component advances or decrements the date of the timestamp with time zone by the indicated number of days, keeping the time of day the same. Across daylight saving time changes (when the session time zone is set to a time zone that recognizes DST), this means interval '1 day' does not necessarily equal interval '24 hours'. For example, with the session time zone set to America/Denver:

This happens because an hour was skipped due to a change in daylight saving time at 2005-04-03 02:00:00 in time zone America/Denver.
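The '1 day' versus '24 hours' difference described above can be reproduced as follows, assuming the server's time zone database includes America/Denver:

```sql
SET TIME ZONE 'America/Denver';

-- Adding one day keeps the time of day across the DST change
SELECT timestamp with time zone '2005-04-02 12:00:00-07' + interval '1 day';
-- 2005-04-03 12:00:00-06

-- Adding 24 hours advances elapsed time, so the wall-clock time shifts by an hour
SELECT timestamp with time zone '2005-04-02 12:00:00-07' + interval '24 hours';
-- 2005-04-03 13:00:00-06
```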

Note there can be ambiguity in the months field returned by age because different months have different numbers of days. PostgreSQL's approach uses the month from the earlier of the two dates when calculating partial months. For example, age('2004-06-01', '2004-04-30') uses April to yield 1 mon 1 day, while using May would yield 1 mon 2 days because May has 31 days, while April has only 30.

Subtraction of dates and timestamps can also be complex. One conceptually simple way to perform subtraction is to convert each value to a number of seconds using EXTRACT(EPOCH FROM ...), then subtract the results; this produces the number of seconds between the two values. This will adjust for the number of days in each month, timezone changes, and daylight saving time adjustments. Subtraction of date or timestamp values with the “-” operator returns the number of days (24-hours) and hours/minutes/seconds between the values, making the same adjustments. The age function returns years, months, days, and hours/minutes/seconds, performing field-by-field subtraction and then adjusting for negative field values. The following queries illustrate the differences in these approaches. The sample results were produced with timezone = 'US/Eastern'; there is a daylight saving time change between the two dates used:

9.9.1. `EXTRACT`, `date_part`

The extract function retrieves subfields such as year or hour from date/time values. source must be a value expression of type timestamp, time, or interval. (Expressions of type date are cast to timestamp and can therefore be used as well.) field is an identifier or string that selects what field to extract from the source value. The extract function returns values of type double precision. The following are valid field names:

century

The century

The first century starts at 0001-01-01 00:00:00 AD, although they did not know it at the time. This definition applies to all Gregorian calendar countries. There is no century number 0, you go from -1 century to 1 century. If you disagree with this, please write your complaint to: Pope, Cathedral Saint-Peter of Roma, Vatican.

day

For timestamp values, the day (of the month) field (1 - 31) ; for interval values, the number of days

decade

The year field divided by 10

dow

The day of the week as Sunday (0) to Saturday (6)

Note that extract's day of the week numbering differs from that of the to_char(..., 'D') function.

doy

The day of the year (1 - 365/366)

epoch

For timestamp with time zone values, the number of seconds since 1970-01-01 00:00:00 UTC (can be negative); for date and timestamp values, the number of seconds since 1970-01-01 00:00:00 local time; for interval values, the total number of seconds in the interval

You can convert an epoch value back to a time stamp with to_timestamp:

hour

The hour field (0 - 23)

isodow

The day of the week as Monday (1) to Sunday (7)

This is identical to dow except for Sunday. This matches the ISO 8601 day of the week numbering.

isoyear

The ISO 8601 week-numbering year that the date falls in (not applicable to intervals)

Each ISO 8601 week-numbering year begins with the Monday of the week containing the 4th of January, so in early January or late December the ISO year may be different from the Gregorian year. See the week field for more information.

This field is not available in PostgreSQL releases prior to 8.3.

microseconds

The seconds field, including fractional parts, multiplied by 1 000 000; note that this includes full seconds

millennium

The millennium

Years in the 1900s are in the second millennium. The third millennium started January 1, 2001.

milliseconds

The seconds field, including fractional parts, multiplied by 1000. Note that this includes full seconds.

minute

The minutes field (0 - 59)

month

For timestamp values, the number of the month within the year (1 - 12) ; for interval values, the number of months, modulo 12 (0 - 11)

quarter

The quarter of the year (1 - 4) that the date is in

second

The seconds field, including fractional parts (0 - 59)

timezone

The time zone offset from UTC, measured in seconds. Positive values correspond to time zones east of UTC, negative values to zones west of UTC. (Technically, PostgreSQL does not use UTC because leap seconds are not handled.)

timezone_hour

The hour component of the time zone offset

timezone_minute

The minute component of the time zone offset

week

The number of the ISO 8601 week-numbering week of the year. By definition, ISO weeks start on Mondays and the first week of a year contains January 4 of that year. In other words, the first Thursday of a year is in week 1 of that year.

In the ISO week-numbering system, it is possible for early-January dates to be part of the 52nd or 53rd week of the previous year, and for late-December dates to be part of the first week of the next year. For example, 2005-01-01 is part of the 53rd week of year 2004, and 2006-01-01 is part of the 52nd week of year 2005, while 2012-12-31 is part of the first week of 2013. It's recommended to use the isoyear field together with week to get consistent results.
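The dates mentioned above can be checked directly. The sketch pairs week with isoyear, as the text recommends, and shows the calendar year for contrast:

```sql
-- 2006-01-01 belongs to ISO week 52 of ISO year 2005
SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-01') AS iso_year,       -- 2005
       EXTRACT(WEEK    FROM DATE '2006-01-01') AS iso_week,       -- 52
       EXTRACT(YEAR    FROM DATE '2006-01-01') AS calendar_year;  -- 2006

-- 2012-12-31 already belongs to ISO week 1 of ISO year 2013
SELECT EXTRACT(ISOYEAR FROM DATE '2012-12-31') AS iso_year,       -- 2013
       EXTRACT(WEEK    FROM DATE '2012-12-31') AS iso_week,       -- 1
       EXTRACT(YEAR    FROM DATE '2012-12-31') AS calendar_year;  -- 2012
```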

year

The year field. Keep in mind there is no 0 AD, so subtracting BC years from AD years should be done with care.

Note

When the input value is +/-Infinity, extract returns +/-Infinity for monotonically-increasing fields (epoch, julian, year, isoyear, decade, century, and millennium). For other fields, NULL is returned. PostgreSQL versions before 9.6 returned zero for all cases of infinite input.

The extract function is primarily intended for computational processing. For formatting date/time values for display, see Section 9.8.

The date_part function is modeled on the traditional Ingres equivalent to the SQL-standard function extract:

Note that here the field parameter needs to be a string value, not a name. The valid field names for date_part are the same as for extract.

9.9.2. `date_trunc`

The function date_trunc is conceptually similar to the trunc function for numbers.

source is a value expression of type timestamp, timestamp with time zone, or interval. (Values of type date and time are cast automatically to timestamp or interval, respectively.) field selects to which precision to truncate the input value. The return value is likewise of type timestamp, timestamp with time zone, or interval, and it has all fields that are less significant than the selected one set to zero (or one, for day and month).

Valid values for field are:

When the input value is of type timestamp with time zone, the truncation is performed with respect to a particular time zone; for example, truncation to day produces a value that is midnight in that zone. By default, truncation is done with respect to the current TimeZone setting, but the optional time_zone argument can be provided to specify a different time zone. The time zone name can be specified in any of the ways PostgreSQL accepts time zone names.

A time zone cannot be specified when processing timestamp without time zone or interval inputs. These are always taken at face value.

Examples (assuming the local time zone is America/New_York):
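The example listing is missing here. A sketch under the stated assumption that the session time zone is America/New_York:

```sql
SET TIME ZONE 'America/New_York';

SELECT date_trunc('hour', TIMESTAMP '2001-02-16 20:38:40');
-- 2001-02-16 20:00:00

SELECT date_trunc('year', TIMESTAMP '2001-02-16 20:38:40');
-- 2001-01-01 00:00:00

-- Truncation relative to the session time zone
SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40+00');
-- 2001-02-16 00:00:00-05

-- Truncation to midnight in Sydney, displayed in the session time zone
SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40+00', 'Australia/Sydney');
-- 2001-02-16 08:00:00-05

SELECT date_trunc('hour', INTERVAL '3 days 02:47:33');
-- 3 days 02:00:00
```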

9.9.3. `AT TIME ZONE`

The AT TIME ZONE operator converts time stamp without time zone to/from time stamp with time zone, and time values to different time zones. Table 9.32 shows its variants.

Table 9.32. `AT TIME ZONE` Variants

Expression

Return Type

Description

In these expressions, the desired time zone zone can be specified either as a text string (e.g., 'America/Los_Angeles') or as an interval (e.g., INTERVAL '-08:00'). In the text case, a time zone name can be specified in any of the ways PostgreSQL accepts time zone names.

Examples (assuming the local time zone is America/Los_Angeles):
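The three examples discussed in the next paragraph are missing from this copy. A sketch under the stated assumption that the session time zone is America/Los_Angeles:

```sql
SET TIME ZONE 'America/Los_Angeles';

-- 1. Interpret a zone-less value as Denver time; displayed in the session zone
SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver';
-- 2001-02-16 19:38:40-08

-- 2. Shift a zoned value to Denver time; the result has no time zone
SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'America/Denver';
-- 2001-02-16 18:38:40

-- 3. Convert Tokyo time to Chicago time
SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'Asia/Tokyo' AT TIME ZONE 'America/Chicago';
-- 2001-02-16 05:38:40
```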

The first example adds a time zone to a value that lacks it, and displays the value using the current TimeZone setting. The second example shifts the time stamp with time zone value to the specified time zone, and returns the value without a time zone. This allows storage and display of values different from the current TimeZone setting. The third example converts Tokyo time to Chicago time. Converting time values to other time zones uses the currently active time zone rules since no date is supplied.

The function timezone(zone, timestamp) is equivalent to the SQL-conforming construct timestamp AT TIME ZONE zone.

9.9.4. Current Date/Time

PostgreSQL provides a number of functions that return values related to the current date and time. These SQL-standard functions all return values based on the start time of the current transaction:

CURRENT_TIME and CURRENT_TIMESTAMP deliver values with time zone; LOCALTIME and LOCALTIMESTAMP deliver values without time zone.

CURRENT_TIME, CURRENT_TIMESTAMP, LOCALTIME, and LOCALTIMESTAMP can optionally take a precision parameter, which causes the result to be rounded to that many fractional digits in the seconds field. Without a precision parameter, the result is given to the full available precision.

Some examples:

Since these functions return the start time of the current transaction, their values do not change during the transaction. This is considered a feature: the intent is to allow a single transaction to have a consistent notion of the “current” time, so that multiple modifications within the same transaction bear the same time stamp.

Note

Other database systems might advance these values more frequently.

PostgreSQL also provides functions that return the start time of the current statement, as well as the actual current time at the instant the function is called. The complete list of non-SQL-standard time functions is:

transaction_timestamp() is equivalent to CURRENT_TIMESTAMP, but is named to clearly reflect what it returns. statement_timestamp() returns the start time of the current statement (more specifically, the time of receipt of the latest command message from the client). statement_timestamp() and transaction_timestamp() return the same value during the first command of a transaction, but might differ during subsequent commands. clock_timestamp() returns the actual current time, and therefore its value changes even within a single SQL command. timeofday() is a historical PostgreSQL function. Like clock_timestamp(), it returns the actual current time, but as a formatted text string rather than a timestamp with time zone value. now() is a traditional PostgreSQL equivalent to transaction_timestamp().
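The difference between these functions is easiest to see inside an explicit transaction. The following is a sketch only: the column aliases are illustrative names, and the one-second pause (using `pg_sleep`) is an arbitrary choice to make the gap visible.

```sql
BEGIN;

-- transaction_timestamp() is fixed for the whole transaction.
-- statement_timestamp() is taken per statement, so it may already be
-- slightly later than the transaction start here.
SELECT transaction_timestamp() AS txn_start,
       statement_timestamp()   AS stmt_start,
       clock_timestamp()       AS wall_clock;

SELECT pg_sleep(1);

-- txn_start is unchanged; stmt_start and wall_clock have advanced.
-- timeofday() reports the actual current time as text.
SELECT transaction_timestamp() AS txn_start,
       statement_timestamp()   AS stmt_start,
       clock_timestamp()       AS wall_clock,
       timeofday()             AS wall_clock_text;

-- now() and transaction_timestamp() return the same value.
SELECT now() = transaction_timestamp() AS same_value;

COMMIT;
```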

All the date/time data types also accept the special literal value now to specify the current date and time (again, interpreted as the transaction start time). Thus, the following three all return the same result:
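The three forms, in the order the tip below refers to them, can be written as:

```sql
SELECT CURRENT_TIMESTAMP;
SELECT now();
SELECT TIMESTAMP 'now';  -- special literal; see the tip below before using it in a DEFAULT
```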

Tip

You do not want to use the third form when specifying a DEFAULT clause while creating a table. The system will convert now to a timestamp as soon as the constant is parsed, so that when the default value is needed, the time of the table creation would be used! The first two forms will not be evaluated until the default value is used, because they are function calls. Thus they will give the desired behavior of defaulting to the time of row insertion.
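As an illustration of the tip, consider the following sketch. The table and column names are hypothetical and exist only for this example.

```sql
CREATE TABLE audit_log (
    message    text NOT NULL,
    -- Function calls: evaluated each time a row is inserted.
    created_at timestamp with time zone DEFAULT CURRENT_TIMESTAMP,
    touched_at timestamp with time zone DEFAULT now(),
    -- Literal: converted to a fixed timestamp when the constant is parsed,
    -- so every row would receive the table creation time.
    frozen_at  timestamp with time zone DEFAULT TIMESTAMP WITH TIME ZONE 'now'
);

-- Rows inserted later get fresh created_at and touched_at values,
-- while frozen_at keeps repeating the table creation time.
INSERT INTO audit_log (message) VALUES ('first row');
SELECT message, created_at, touched_at, frozen_at FROM audit_log;
```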

9.9.5. Delaying Execution

The following functions are available to delay execution of the server process:

pg_sleep makes the current session's process sleep until the number of seconds given by its `seconds` argument has elapsed. `seconds` is a value of type double precision, so fractional-second delays can be specified. pg_sleep_for is a convenience function for larger sleep times specified as an interval. pg_sleep_until is a convenience function for when a specific wake-up time is desired. For example:
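A minimal sketch of the three calls follows. The durations and the wake-up time are arbitrary illustrative values.

```sql
SELECT pg_sleep(1.5);                     -- double precision seconds
SELECT pg_sleep_for('5 minutes');         -- interval
SELECT pg_sleep_until('tomorrow 03:00');  -- timestamp with time zone
```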

Note

The effective resolution of the sleep interval is platform-specific; 0.01 seconds is a common value. The sleep delay will be at least as long as specified. It might be longer depending on factors such as server load. In particular, pg_sleep_until is not guaranteed to wake up exactly at the specified time, but it will not wake up any earlier.

Warning

Make sure that your session does not hold more locks than necessary when calling pg_sleep or its variants. Otherwise other sessions might have to wait for your sleeping process, slowing down the entire system.

60 if leap seconds are implemented by the operating system