PostgreSQL 正體中文使用手冊
PostgreSQL.TW官方使用手冊小島故事加入社團
11
11
  • 簡介
  • 前言
    • 1. 什麼是PostgreSQL?
    • 2. PostgreSQL沿革
    • 3. 慣例
    • 4. 其他參考資訊
    • 5. 問題回報指南
  • I. 新手教學
    • 1. 入門指南
      • 1.1. 安裝
      • 1.2. 基礎架構
      • 1.3. 建立一個資料庫
      • 1.4. 存取一個資料庫
    • 2. SQL查詢語言
      • 2.1. 簡介
      • 2.2. 概念
      • 2.3. 創建一個新的資料表
      • 2.4. 資料列是資料表的組成單位
      • 2.5. 資料表的查詢
      • 2.6. 交叉查詢
      • 2.7. 彙總查詢
      • 2.8. 更新資料
      • 2.9. 刪除資料
    • 3. 先進功能
      • 3.1. 簡介
      • 3.2. 檢視表(View)
      • 3.3. 外部索引鍵
      • 3.4. 交易安全
      • 3.5. 窗函數
      • 3.6. 繼承
      • 3.7. 結論
  • II. SQL查詢語言
    • 4. SQL語法
      • 4.1. 語法結構
      • 4.2. 參數表示式
      • 4.3. 函數呼叫
    • 5. 定義資料結構
      • 5.1. 認識資料表
      • 5.2. 預設值
      • 5.3. 限制條件
      • 5.4. 系統欄位
      • 5.5. 表格變更
      • 5.6. 權限
      • 5.7. 資料列安全原則
      • 5.8. Schemas
      • 5.9. 繼承
      • 5.10. 分割資料表
      • 5.11. 外部資料
      • 5.12. 其他資料庫物件
      • 5.13. 相依性追蹤
    • 6. 資料處理
      • 6.1. 新增資料
      • 6.2. 更新資料
      • 6.3. 刪除資料
      • 6.4. 修改並回傳資料
    • 7. 資料查詢
      • 7.1. 概觀
      • 7.2. 資料表表示式
      • 7.3. 取得資料列表
      • 7.4. 合併查詢結果
      • 7.5. 資料排序
      • 7.6. 指定資料範圍
      • 7.7. 列舉資料
      • 7.8. 遞迴查詢(Common Table Expressions)
    • 8. 資料型別
      • 8.1. 數字型別
      • 8.2. 貨幣型別
      • 8.3. 字串型別
      • 8.4. 位元組型別(bytea)
      • 8.5. 日期時間型別
      • 8.6. 布林型別
      • 8.7. 列舉型別
      • 8.8. 地理資訊型別
      • 8.9. 網路資訊型別
      • 8.10. 位元字串型別
      • 8.11. 全文檢索型別
      • 8.12. UUID型別
      • 8.13. XML型別
      • 8.14. JSON型別
      • 8.15. 陣列
      • 8.16. 複合型別
      • 8.17. 範圍型別
      • 8.18. 指標型別
      • 8.19. pg_lsn型別
      • 8.20. 概念型別
    • 9. 函式及運算子
      • 9.1. 邏輯運算子
      • 9.2. 比較函式及運算子
      • 9.3. 數學函式及運算子
      • 9.4. 字串函式及運算子
      • 9.5. 位元字串函式及運算子
      • 9.6. 二元字串函式及運算子
      • 9.7. 特徵比對
      • 9.8. 型別轉換函式
      • 9.9 日期時間函式及運算子
      • 9.10. 列舉型別函式
      • 9.11. 地理資訊函式及運算子
      • 9.12. 網路位址函式及運算子
      • 9.13. 文字檢索函式及運算子
      • 9.14. XML函式
      • 9.15. JSON函式及運算子
      • 9.16. 序列函式
      • 9.17. 條件表示式
      • 9.18. 陣列函式及運算子
      • 9.19. 範圍函式及運算子
      • 9.20. 彙總函數
      • 9.21. Window函式
      • 9.22. 子查詢
      • 9.23. 資料列與陣列的比較運算
      • 9.24. 集合回傳函式
      • 9.25. 系統資訊函數
      • 9.26. 系統管理函式
      • 9.27. 觸發函式
      • 9.28. 事件觸發函式
    • 10. 型別轉換
      • 10.1. 概觀
      • 10.2. 運算子
      • 10.3. 函式
      • 10.4. 資料儲存轉換規則
      • 10.5. UNION、CASE 等相關結構
      • 10.6. SELECT輸出規則
    • 11. 索引(Index)
      • 11.1. 簡介
      • 11.2. 索引型別
      • 11.3. 多欄位索引
      • 11.4. 索引與ORDER BY
      • 11.5. 善用多個索引
      • 11.6. 唯一值索引
      • 11.7. 表示式索引
      • 11.8. 部份索引(partial index)
      • 11.9. 運算子物件及家族
      • 11.10. 索引與排序規則
      • 11.11. 索引限定查詢(Index-only scan)
      • 11.12. 檢查索引運用
    • 12. 全文檢索
      • 12.1. 簡介
      • 12.2. 查詢與索引
      • 12.3. 細部控制
      • 12.4. 延伸功能
      • 12.5. 斷詞
      • 12.6. 字典
      • 12.7. 組態範例
      • 12.8. 測試與除錯
      • 12.9. GIN及GiST索引型別
      • 12.10. psql支援
      • 12.11. 功能限制
    • 13. 一致性管理(MVCC)
      • 13.1. 簡介
      • 13.2. 交易隔離
      • 13.3. 鎖定模式
      • 13.4. 在應用端檢視資料一致性
      • 13.5. 特別注意
      • 13.6. 鎖定與索引
    • 14. 效能技巧
      • 14.1. 善用EXPLAIN
      • 14.2. 統計資訊
      • 14.3. 使用確切的JOIN方式
      • 14.4. 快速建立資料庫內容
      • 14.5. 彈性設定
    • 15. 平行查詢
      • 15.1. 如何運作?
      • 15.2. 啓用時機?
      • 15.3. 平行查詢計畫
      • 15.4. 平行查詢的安全性
  • III. 系統管理
    • 16. 用原始碼安裝
      • 16.1. Short Version
      • 16.2. Requirements
      • 16.3. Getting The Source
      • 16.4. 安裝流程
      • 16.5. Post-Installation Setup
      • 16.6. Supported Platforms
      • 16.7. 平台相關的注意事項
    • 17. 用原始碼在 Windows 上安裝
      • 17.1. Building with Visual C++ or the Microsoft Windows SDK
    • 18. 服務配置與維運
      • 18.1. PostgreSQL 使用者帳號
      • 18.2. Creating a Database Cluster
      • 18.3. Starting the Database Server
      • 18.4. 核心資源管理
      • 18.5. Shutting Down the Server
      • 18.6. Upgrading a PostgreSQL Cluster
      • 18.7. Preventing Server Spoofing
      • 18.8. Encryption Options
      • 18.9. Secure TCP/IP Connections with SSL
      • 18.10. Secure TCP/IP Connections with SSH Tunnels
      • 18.11. 在 Windows 註冊事件日誌
    • 19. 服務組態設定
      • 19.1. Setting Parameters
      • 19.2. File Locations
      • 19.3. 連線與認證
      • 19.4. 資源配置
      • 19.5. Write Ahead Log
      • 19.6. 複寫(Replication)
      • 19.7. 查詢規畫
      • 19.8. 錯誤回報與日誌記錄
      • 19.9. Run-time Statistics
      • 19.10. 自動資料庫清理
      • 19.11. 用戶端連線預設參數
      • 19.12. 交易鎖定管理
      • 19.13. 版本與平台的相容性
      • 19.14. Error Handling
      • 19.15. 預先配置的參數
      • 19.16. Customized Options
      • 19.17. Developer Options
      • 19.18. Short Options
    • 20. 使用者認證
      • 20.1. 設定檔:pg_hba.conf
      • 20.2. User Name Maps
      • 20.3. Authentication Methods
      • 20.4. Authentication Problems
    • 21. 資料庫角色
      • 21.1. Database Roles
      • 21.2. Role Attributes
      • 21.3. Role Membership
      • 21.4. 移除角色
      • 21.5. Default Roles
      • 21.6. Function Security
    • 22. Managing Databases
      • 22.1. Overview
      • 22.2. Creating a Database
      • 22.3. 樣版資料庫
      • 22.4. Database Configuration
      • 22.5. Destroying a Database
      • 22.6. Tablespaces
    • 23. 語系
      • 23.1. 語系支援
      • 23.2. Collation Support
      • 23.3. 字元集支援
    • 24. 例行性資料庫維護工作
      • 24.1. 例行性資料清理
      • 24.2. 定期重建索引
      • 24.3. Log File Maintenance
    • 25. 備份及還原
      • 25.1. SQL Dump
      • 25.2. File System Level Backup
      • 25.3. Continuous Archiving and Point-in-Time Recovery (PITR)
    • 26. High Availability, Load Balancing, and Replication
      • 26.1. Comparison of Different Solutions
      • 26.2. 日誌轉送備用伺服器 Log-Shipping Standby Servers
      • 26.3. Failover
      • 26.4. Alternative Method for Log Shipping
      • 26.5. Hot Standby
    • 27. Recovery Configuration
      • 27.1. Archive Recovery Settings
      • 27.2. Recovery Target Settings
      • 27.3. Standby Server Settings
    • 28. 監控資料庫活動
      • 28.1. Standard Unix Tools
      • 28.2. 統計資訊收集器
      • 28.3. Viewing Locks
      • 28.4. Progress Reporting
      • 28.5. Dynamic Tracing
    • 29. Monitoring Disk Usage
      • 29.1. Determining Disk Usage
      • 29.2. Disk Full Failure
    • 30. 高可靠度及預寫日誌
      • 30.1. Reliability
      • 30.2. Write-Ahead Logging (WAL)
      • 30.3. Asynchronous Commit
      • 30.4. WAL Configuration
      • 30.5. WAL Internals
    • 31. 邏輯複寫(Logical Replication)
      • 31.1. 發佈(Publication)
      • 31.2. 訂閱(Subscription)
      • 31.3. 衝突處理
      • 31.4. 限制
      • 31.5. 架構
      • 31.6. 監控
      • 31.7. 安全性
      • 31.8. 系統設定
      • 31.9. 快速設定
    • 32. Just-in-Time Compilation (JIT)
      • 32.1. What is JIT compilation?
      • 32.2. When to JIT?
      • 32.3. Configuration
      • 32.4. Extensibility
    • 33. 迴歸測試
      • 33.1. Running the Tests
      • 33.2. Test Evaluation
      • 33.3. Variant Comparison Files
      • 33.4. TAP Tests
      • 33.5. Test Coverage Examination
  • IV. 用戶端介面
    • 34. libpq - C Library
      • 34.1. 資料庫連線控制函數
      • 34.2. 連線狀態函數
      • 34.3. Command Execution Functions
      • 34.4. Asynchronous Command Processing
      • 34.5. Retrieving Query Results Row-By-Row
      • 34.6. Canceling Queries in Progress
      • 34.7. The Fast-Path Interface
      • 34.8. Asynchronous Notification
      • 34.9. Functions Associated with the COPY Command
      • 34.10. Control Functions
      • 34.11. Miscellaneous Functions
      • 34.12. Notice Processing
      • 34.13. Event System
      • 34.14. 環境變數
      • 34.15. 密碼檔
      • 34.16. The Connection Service File
      • 34.17. LDAP Lookup of Connection Parameters
      • 34.18. SSL Support
      • 34.19. Behavior in Threaded Programs
      • 34.20. Building libpq Programs
      • 34.21. Example Programs
    • 35. Large Objects
      • 35.1. Introduction
      • 35.2. Implementation Features
      • 35.3. Client Interfaces
      • 35.4. Server-side Functions
      • 35.5. Example Program
    • 36. ECPG - Embedded SQL in C
      • 36.1. The Concept
      • 36.2. Managing Database Connections
      • 36.3. Running SQL Commands
      • 36.4. Using Host Variables
      • 36.5. Dynamic SQL
      • 36.6. pgtypes Library
      • 36.7. Using Descriptor Areas
      • 36.8. Error Handling
      • 36.9. Preprocessor Directives
      • 36.10. Processing Embedded SQL Programs
      • 36.11. Library Functions
      • 36.12. Large Objects
      • 36.13. C++ Applications
      • 36.14. Embedded SQL Commands
      • 36.15. Informix Compatibility Mode
      • 36.16. Internals
    • 37. The Information Schema
      • 37.1. The Schema
      • 37.2. Data Types
      • 37.3. information_schema_catalog_name
      • 37.4. administrable_role_authorizations
      • 37.5. applicable_roles
      • 37.6. attributes
      • 37.7. character_sets
      • 37.8. check_constraint_routine_usage
      • 37.9. check_constraints
      • 37.10. collations
      • 37.11. collation_character_set_applicability
      • 37.12. column_domain_usage
      • 37.13. column_options
      • 37.14. column_privileges
      • 37.15. column_udt_usage
      • 37.16. columns
      • 37.17. constraint_column_usage
      • 37.18. constraint_table_usage
      • 37.19. data_type_privileges
      • 37.20. domain_constraints
      • 37.21. domain_udt_usage
      • 37.22. domains
      • 37.23. element_types
      • 37.24. enabled_roles
      • 37.25. foreign_data_wrapper_options
      • 37.26. foreign_data_wrappers
      • 37.27. foreign_server_options
      • 37.28. foreign_servers
      • 37.29. foreign_table_options
      • 37.30. foreign_tables
      • 37.31. key_column_usage
      • 37.32. parameters
      • 37.33. referential_constraints
      • 37.34. role_column_grants
      • 37.35. role_routine_grants
      • 37.36. role_table_grants
      • 37.37. role_udt_grants
      • 37.38. role_usage_grants
      • 37.39. routine_privileges
      • 37.40. routines
      • 37.41. schemata
      • 37.42. sequences
      • 37.43. sql_features
      • 37.44. sql_implementation_info
      • 37.45. sql_languages
      • 37.46. sql_packages
      • 37.47. sql_parts
      • 37.48. sql_sizing
      • 37.49. sql_sizing_profiles
      • 37.50. table_constraints
      • 37.51. table_privileges
      • 37.52. tables
      • 37.53. transforms
      • 37.54. triggered_update_columns
      • 37.55. triggers
      • 37.56. udt_privileges
      • 37.57. usage_privileges
      • 37.58. user_defined_types
      • 37.59. user_mapping_options
      • 37.60. user_mappings
      • 37.61. view_column_usage
      • 37.62. view_routine_usage
      • 37.63. view_table_usage
      • 37.64. views
  • V. 資料庫程式設計
    • 38. SQL 延伸功能
      • 38.1. How Extensibility Works
      • 38.2. The PostgreSQL Type System
      • 38.3. 使用者自訂函數
      • 38.4. User-defined Procedures
      • 38.5. Query Language (SQL) Functions
      • 38.6. Function Overloading
      • 38.7. 函數易變性類別
      • 38.8. Procedural Language Functions
      • 38.9. Internal Functions
      • 38.10. C-Language Functions
      • 38.11. User-defined Aggregates
      • 38.12. User-defined Types
      • 38.13. User-defined Operators
      • 38.14. Operator Optimization Information
      • 38.15. Interfacing Extensions To Indexes
      • 38.16. Packaging Related Objects into an Extension
      • 38.17. Extension Building Infrastructure
    • 39. Triggers
    • 40. Event Triggers
    • 41. 規則系統
      • 41.1. The Query Tree
      • 41.2. Views and the Rule System
      • 41.3. Materialized Views
      • 41.4. Rules on INSERT, UPDATE, and DELETE
      • 41.5. 規則及權限
      • 41.6. Rules and Command Status
      • 41.7. Rules Versus Triggers
    • 42. Procedural Languages(程序語言)
      • 42.1. Installing Procedural Languages
    • 43. PL/pgSQL - SQL Procedural Language
      • 43.5. 基本語法
    • 44. PL/Tcl - Tcl Procedural Language
    • 45. PL/Perl - Perl Procedural Language
    • 46. PL/Python - Python Procedural Language
    • 47. Server Programming Interface
    • 48. Background Worker Processes
    • 49. Logical Decoding
    • 50. Replication Progress Tracking
  • VI. 參考資訊
    • I. SQL 指令
      • ALTER DATABASE
      • ALTER DEFAULT PRIVILEGES
      • ALTER EXTENSION
      • ALTER FUNCTION
      • ALTER INDEX
      • ALTER LANGUAGE
      • ALTER MATERIALIZED VIEW
      • ALTER POLICY
      • ALTER PUBLICATION
      • ALTER ROLE
      • ALTER RULE
      • ALTER SCHEMA
      • ALTER SEQUENCE
      • ALTER STATISTICS
      • ALTER SUBSCRIPTION
      • ALTER TABLE
      • ALTER TABLESPACE
      • ALTER TRIGGER
      • ALTER TYPE
      • ALTER VIEW
      • ANALYZE
      • CLUSTER
      • COMMENT
      • COPY
      • CREATE CAST
      • CREATE DATABASE
      • CREATE EXTENSION
      • CREATE FOREIGN TABLE
      • CREATE FOREIGN DATA WRAPPER
      • CREATE FUNCTION
      • CREATE INDEX
      • CREATE LANGUAGE
      • CREATE MATERIALIZED VIEW
      • CREATE DOMAIN
      • CREATE POLICY
      • CREATE PROCEDURE
      • CREATE PUBLICATION
      • CREATE ROLE
      • CREATE RULE
      • CREATE SCHEMA
      • CREATE SEQUENCE
      • CREATE SERVER
      • CREATE STATISTICS
      • CREATE SUBSCRIPTION
      • CREATE TABLE
      • CREATE TABLE AS
      • CREATE TABLESPACE
      • CREATE TRANSFORM
      • CREATE TRIGGER
      • CREATE TYPE
      • CREATE USER
      • CREATE USER MAPPING
      • CREATE VIEW
      • DELETE
      • DO
      • DROP DATABASE
      • DROP EXTENSION
      • DROP FUNCTION
      • DROP INDEX
      • DROP LANGUAGE
      • DROP MATERIALIZED VIEW
      • DROP OWNED
      • DROP POLICY
      • DROP ROLE
      • DROP RULE
      • DROP SCHEMA
      • DROP SEQUENCE
      • DROP STATISTICS
      • DROP SUBSCRIPTION
      • DROP TABLE
      • DROP TABLESPACE
      • DROP TRANSFORM
      • DROP TRIGGER
      • DROP TYPE
      • DROP USER
      • DROP VIEW
      • EXECUTE
      • EXPLAIN
      • GRANT
      • IMPORT FOREIGN SCHEMA
      • INSERT
      • LISTEN
      • LOAD
      • NOTIFY
      • PREPARE TRANSACTION
      • REASSIGN OWNED
      • REFRESH MATERIALIZED VIEW
      • REINDEX
      • RESET
      • REVOKE
      • SELECT
      • SELECT INTO
      • SET
      • SET CONSTRAINTS
      • SET ROLE
      • SET SESSION AUTHORIZATION
      • SET TRANSACTION
      • SHOW
      • TRUNCATE
      • UNLISTEN
      • UPDATE
      • VACUUM
      • VALUES
    • II. PostgreSQL 用戶端工具
      • createdb
      • createuser
      • dropdb
      • dropuser
      • pgbench
      • pg_dump
      • psql
      • vacuumdb
    • III. PostgreSQL 伺服器應用程式
      • pg_test_timing
      • postgres
  • VII. 資料庫進階
    • 52. 系統目錄
      • 52.3. pg_am
      • 52.7. pg_attribute
      • 52.8. pg_authid
      • 52.9. pg_auth_members
      • 52.11 pg_class
      • 52.12. pg_collation
      • 52.13. pg_constraint
      • 52.15 pg_database
      • 52.26 pg_index
      • 52.29. pg_language
      • 52.32. pg_namespace
      • 52.33. pg_opclass
      • 52.38. pg_policy
      • 52.39. pg_proc
      • 52.44. pg_rewrite
      • 52.50. pg_statistic
      • 52.51. pg_statistic_ext
      • 52.54. pg_tablespace
      • 52.56. pg_trigger
      • 52.62. pg_type
      • 52.79. pg_replication_origin_status
      • 52.81 pg_roles
      • 52.85. pg_settings
      • 52.87. pg_stats
    • 53. Frontend/Backend Protocol
      • 53.1. Overview
      • 53.2. Message Flow
      • 53.3. SASL Authentication
      • 53.4. Streaming Replication Protocol
      • 53.5. Logical Streaming Replication Protocol
      • 53.6. Message Data Types
      • 53.7. Message Formats
      • 53.8. Error and Notice Message Fields
      • 53.9. Logical Replication Message Formats
      • 53.10. Summary of Changes since Protocol 2.0
    • 54. PostgreSQL 程式撰寫慣例
      • 54.1. Formatting
      • 54.2. Reporting Errors Within the Server
      • 54.3. Error Message Style Guide
      • 54.4. Miscellaneous Coding Conventions
    • 56. Writing A Procedural Language Handler
    • 64. GiST Indexes
      • 64.1. Introduction
      • 64.2. Built-in Operator Classes
      • 64.3. Extensibility
      • 64.4. Implementation
      • 64.5. Examples
    • 65. SP-GiST Indexes
      • 65.1. Introduction
      • 65.2. Built-in Operator Classes
      • 65.3. Extensibility
      • 65.4. Implementation
      • 65.5. Examples
    • 66. GIN 索引
      • 66.1. 簡介
      • 66.2. 內建運算子類
      • 66.3. Extensibility
      • 66.4. Implementation
      • 66.5. GIN Tips and Tricks
      • 66.6. Limitations
      • 66.7. Examples
    • 67. BRIN Indexes
      • 67.1. Introduction
      • 67.2. Built-in Operator Classes
      • 67.3. Extensibility
    • 68. 資料庫實體儲存格式
      • 68.2. TOAST
      • 68.4 可視性映射表(Visibility Map)
    • 70. How the Planner Uses Statistics
      • 70.2. Multivariate Statistics Examples
  • VIII. 附錄
    • A. PostgreSQL錯誤代碼
    • B. 日期時間格式支援
      • B.1. 日期時間解譯流程
      • B.2. 日期時間慣用字
      • B.3. 日期時間設定檔
      • B.4. 日期時間的沿革
    • C. SQL 關鍵字
    • D. SQL 相容性
    • E. 版本資訊
    • F. 延伸支援模組
      • F.4. auto_explain
      • F.11. dblink
        • dblink
      • F.33. pg_visibility
    • G. Additional Supplied Programs
      • G.1. Client Applications
        • oid2name
        • vacuumlo
      • G.2. Server Applications
        • pg_standby
    • H. 外部專案
      • H.1. 用戶端介面
      • H.2. Administration Tools
      • H.3. Procedural Languages
      • H.4. Extensions
    • I. The Source Code Repository
      • I.1. Getting The Source via Git
    • J. 文件取得
    • K. 縮寫字
  • 參考書目
Powered by GitBook
On this page
  • 14.4.1. Disable Autocommit
  • 14.4.2. UseCOPY
  • 14.4.3. Remove Indexes
  • 14.4.4. Remove Foreign Key Constraints
  • 14.4.5. Increasemaintenance_work_mem
  • 14.4.6. Increasemax_wal_size
  • 14.4.7. Disable WAL Archival and Streaming Replication
  • 14.4.8. RunANALYZEAfterwards
  • 14.4.9. Some Notes Aboutpg_dump

Was this helpful?

Edit on Git
Export as PDF
  1. II. SQL查詢語言
  2. 14. 效能技巧

14.4. 快速建立資料庫內容

Previous14.3. 使用確切的JOIN方式Next14.5. 彈性設定

Last updated 7 years ago

Was this helpful?

One might need to insert a large amount of data when first populating a database. This section contains some suggestions on how to make this process as efficient as possible.

14.4.1. Disable Autocommit

When using multipleINSERTs, turn off autocommit and just do one commit at the end. (In plain SQL, this means issuingBEGINat the start andCOMMITat the end. Some client libraries might do this behind your back, in which case you need to make sure the library does it when you want it done.) If you allow each insertion to be committed separately,PostgreSQLis doing a lot of work for each row that is added. An additional benefit of doing all insertions in one transaction is that if the insertion of one row were to fail then the insertion of all rows inserted up to that point would be rolled back, so you won't be stuck with partially loaded data.

14.4.2. UseCOPY

Useto load all the rows in one command, instead of using a series ofINSERTcommands. TheCOPYcommand is optimized for loading large numbers of rows; it is less flexible thanINSERT, but incurs significantly less overhead for large data loads. SinceCOPYis a single command, there is no need to disable autocommit if you use this method to populate a table.

If you cannot useCOPY, it might help to useto create a preparedINSERTstatement, and then useEXECUTEas many times as required. This avoids some of the overhead of repeatedly parsing and planningINSERT. Different interfaces provide this facility in different ways; look for“prepared statements”in the interface documentation.

Note that loading a large number of rows usingCOPYis almost always faster than usingINSERT, even ifPREPAREis used and multiple insertions are batched into a single transaction.

14.4.3. Remove Indexes

If you are loading a freshly created table, the fastest method is to create the table, bulk load the table's data usingCOPY, then create any indexes needed for the table. Creating an index on pre-existing data is quicker than updating it incrementally as each row is loaded.

If you are adding large amounts of data to an existing table, it might be a win to drop the indexes, load the table, and then recreate the indexes. Of course, the database performance for other users might suffer during the time the indexes are missing. One should also think twice before dropping a unique index, since the error checking afforded by the unique constraint will be lost while the index is missing.

14.4.4. Remove Foreign Key Constraints

Just as with indexes, a foreign key constraint can be checked“in bulk”more efficiently than row-by-row. So it might be useful to drop foreign key constraints, load data, and re-create the constraints. Again, there is a trade-off between data load speed and loss of error checking while the constraint is missing.

What's more, when you load data into a table with existing foreign key constraints, each new row requires an entry in the server's list of pending trigger events (since it is the firing of a trigger that checks the row's foreign key constraint). Loading many millions of rows can cause the trigger event queue to overflow available memory, leading to intolerable swapping or even outright failure of the command. Therefore it may benecessary, not just desirable, to drop and re-apply foreign keys when loading large amounts of data. If temporarily removing the constraint isn't acceptable, the only other recourse may be to split up the load operation into smaller transactions.

14.4.5. Increasemaintenance_work_mem

14.4.6. Increasemax_wal_size

14.4.7. Disable WAL Archival and Streaming Replication

Aside from avoiding the time for the archiver or WAL sender to process the WAL data, doing this will actually make certain commands faster, because they are designed not to write WAL at all ifwal_levelisminimal. (They can guarantee crash safety more cheaply by doing anfsyncat the end than by writing WAL.) This applies to the following commands:

  • CREATE TABLE AS SELECT

  • CREATE INDEX(and variants such asALTER TABLE ADD PRIMARY KEY)

  • ALTER TABLE SET TABLESPACE

  • CLUSTER

  • COPY FROM, when the target table has been created or truncated earlier in the same transaction

14.4.8. RunANALYZEAfterwards

14.4.9. Some Notes Aboutpg_dump

Dump scripts generated bypg_dumpautomatically apply several, but not all, of the above guidelines. To reload apg_dumpdump as quickly as possible, you need to do a few extra things manually. (Note that these points apply while_restoring_a dump, not while_creating_it. The same points apply whether loading a text dump withpsqlor usingpg_restoreto load from apg_dumparchive file.)

By default,pg_dumpusesCOPY, and when it is generating a complete schema-and-data dump, it is careful to load data before creating indexes and foreign keys. So in this case several guidelines are handled automatically. What is left for you to do is to:

  • Set appropriate (i.e., larger than normal) values formaintenance_work_memandmax_wal_size.

  • If using WAL archiving or streaming replication, consider disabling them during the restore. To do that, setarchive_modetooff,wal_leveltominimal, andmax_wal_sendersto zero before loading the dump. Afterwards, set them back to the right values and take a fresh base backup.

  • Experiment with the parallel dump and restore modes of bothpg_dumpandpg_restoreand find the optimal number of concurrent jobs to use. Dumping and restoring in parallel by means of the-joption should give you a significantly higher performance over the serial mode.

  • Consider whether the whole dump should be restored as a single transaction. To do that, pass the-1or--single-transactioncommand-line option topsqlorpg_restore. When using this mode, even the smallest of errors will rollback the entire restore, possibly discarding many hours of processing. Depending on how interrelated the data is, that might seem preferable to manual cleanup, or not.COPYcommands will run fastest if you use a single transaction and have WAL archiving turned off.

  • If multiple CPUs are available in the database server, consider usingpg_restore's--jobsoption. This allows concurrent data loading and index creation.

  • RunANALYZEafterwards.

COPYis fastest when used within the same transaction as an earlierCREATE TABLEorTRUNCATEcommand. In such cases no WAL needs to be written, because in case of an error, the files containing the newly loaded data will be removed anyway. However, this consideration only applies whenisminimalas all commands must write WAL otherwise.

Temporarily increasing theconfiguration variable when loading large amounts of data can lead to improved performance. This will help to speed upCREATE INDEXcommands andALTER TABLE ADD FOREIGN KEYcommands. It won't do much forCOPYitself, so this advice is only useful when you are using one or both of the above techniques.

Temporarily increasing theconfiguration variable can also make large data loads faster. This is because loading a large amount of data intoPostgreSQLwill cause checkpoints to occur more often than the normal checkpoint frequency (specified by thecheckpoint_timeoutconfiguration variable). Whenever a checkpoint occurs, all dirty pages must be flushed to disk. By increasingmax_wal_sizetemporarily during bulk data loads, the number of checkpoints that are required can be reduced.

When loading large amounts of data into an installation that uses WAL archiving or streaming replication, it might be faster to take a new base backup after the load has completed than to process a large amount of incremental WAL data. To prevent incremental WAL logging while loading, disable archiving and streaming replication, by settingtominimal,tooff, andto zero. But note that changing these settings requires a server restart.

Whenever you have significantly altered the distribution of data within a table, runningis strongly recommended. This includes bulk loading large amounts of data into the table. RunningANALYZE(orVACUUM ANALYZE) ensures that the planner has up-to-date statistics about the table. With no statistics or obsolete statistics, the planner might make poor decisions during query planning, leading to poor performance on any tables with inaccurate or nonexistent statistics. Note that if the autovacuum daemon is enabled, it might runANALYZEautomatically; seeandfor more information.

A data-only dump will still useCOPY, but it does not drop or recreate indexes, and it does not normally touch foreign keys.So when loading a data-only dump, it is up to you to drop and recreate indexes and foreign keys if you wish to use those techniques. It's still useful to increasemax_wal_sizewhile loading the data, but don't bother increasingmaintenance_work_mem; rather, you'd do that while manually recreating indexes and foreign keys afterwards. And don't forget toANALYZEwhen you're done; seeandfor more information.

You can get the effect of disabling foreign keys by using the--disable-triggersoption — but realize that that eliminates, rather than just postpones, foreign key validation, and so it is possible to insert bad data if you use it.

14.4.1. Disable Autocommit
14.4.2. UseCOPY
14.4.3. Remove Indexes
14.4.4. Remove Foreign Key Constraints
14.4.5. Increasemaintenance_work_mem
14.4.6. Increasemax_wal_size
14.4.7. Disable WAL Archival and Streaming Replication
14.4.8. RunANALYZEAfterwards
14.4.9. Some Notes Aboutpg_dump
COPY
PREPARE
wal_level
maintenance_work_mem
max_wal_size
wal_level
archive_mode
max_wal_senders
ANALYZE
Section 24.1.3
Section 24.1.6
[10]
Section 24.1.3
Section 24.1.6
[10]