There are several settings which can cause the query planner not to generate a parallel query plan under any circumstances. In order for any parallel query plans whatsoever to be generated, the following settings must be configured as indicated.
max_parallel_workers_per_gather must be set to a value which is greater than zero. This is a special case of the more general principle that no more workers should be used than the number configured via max_parallel_workers_per_gather.
dynamic_shared_memory_type must be set to a value other than none. Parallel query requires dynamic shared memory in order to pass data between cooperating processes.
In addition, the system must not be running in single-user mode. Since the entire database system runs as a single process in this situation, no background workers will be available.
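As a quick check, the settings above can be inspected and adjusted from a session. This is a minimal sketch; the value 2 is only an illustrative choice, not a recommendation.

```sql
-- Both settings must allow parallelism before any parallel plan is considered.
SHOW max_parallel_workers_per_gather;   -- must be greater than zero
SHOW dynamic_shared_memory_type;        -- must not be 'none'

-- max_parallel_workers_per_gather can be changed per session;
-- 2 is an arbitrary illustrative value.
SET max_parallel_workers_per_gather = 2;
```

Note that dynamic_shared_memory_type can only be set at server start, so changing it means editing the server configuration and restarting.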
Even when it is in general possible for parallel query plans to be generated, the planner will not generate them for a given query if any of the following are true:
The query writes any data or locks any database rows. If a query contains a data-modifying operation either at the top level or within a CTE, no parallel plans for that query will be generated. This is a limitation of the current implementation which could be lifted in a future release.
The query might be suspended during execution. In any situation in which the system thinks that partial or incremental execution might occur, no parallel plan is generated. For example, a cursor created using DECLARE CURSOR will never use a parallel plan. Similarly, a PL/pgSQL loop of the form FOR x IN query LOOP .. END LOOP will never use a parallel plan, because the parallel query system is unable to verify that the code in the loop is safe to execute while parallel query is active.
The query uses any function marked PARALLEL UNSAFE. Most system-defined functions are PARALLEL SAFE, but user-defined functions are marked PARALLEL UNSAFE by default. See the discussion in Section 15.4.
The query is running inside of another query that is already parallel. For example, if a function called by a parallel query issues an SQL query itself, that query will never use a parallel plan. This is a limitation of the current implementation, but it may not be desirable to remove this limitation, since it could result in a single query using a very large number of processes.
The transaction isolation level is serializable. This is a limitation of the current implementation.
Even when a parallel query plan is generated for a particular query, there are several circumstances under which it will be impossible to execute that plan in parallel at execution time. If this occurs, the leader will execute the portion of the plan below the Gather node entirely by itself, almost as if the Gather node were not present. This will happen if any of the following conditions are met:
No background workers can be obtained because of the limitation that the total number of background workers cannot exceed max_worker_processes.
No background workers can be obtained because of the limitation that the total number of background workers launched for purposes of parallel query cannot exceed max_parallel_workers.
The client sends an Execute message with a non-zero fetch count. See the discussion of the extended query protocol. Since libpq currently provides no way to send such a message, this can only occur when using a client that does not rely on libpq. If this is a frequent occurrence, it may be a good idea to set max_parallel_workers_per_gather to zero in sessions where it is likely, so as to avoid generating query plans that may be suboptimal when run serially.
A prepared statement is executed using a CREATE TABLE .. AS EXECUTE .. statement. This construct converts what otherwise would have been a read-only operation into a read-write operation, making it ineligible for parallel query.
The transaction isolation level is serializable. This situation does not normally arise, because parallel query plans are not generated when the transaction isolation level is serializable. However, it can happen if the transaction isolation level is changed to serializable after the plan is generated and before it is executed.
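One way to see whether workers were actually obtained is to compare the planned and launched worker counts that EXPLAIN ANALYZE reports under the Gather node. The sketch below assumes a table named pgbench_accounts (as created by pgbench); the table and the filter are illustrative only, and the counts appear only if a parallel plan is chosen.

```sql
-- Runs the query and, if the plan contains a Gather node, reports both
-- "Workers Planned" and "Workers Launched" beneath it. A launched count
-- lower than the planned count means some workers could not be obtained.
EXPLAIN (ANALYZE)
SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';

-- For sessions that are likely to hit the conditions above, parallel
-- plans can be turned off so the planner optimizes for serial execution.
SET max_parallel_workers_per_gather = 0;
```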
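PostgreSQL can devise query plans which can leverage multiple CPUs in order to answer queries faster. This feature is known as parallel query. Many queries cannot benefit from parallel query, either due to limitations of the current implementation or because there is no imaginable query plan which is any faster than the serial query plan. However, for queries that can benefit, the speedup from parallel query is often very significant. Many queries can run more than twice as fast when using parallel query, and some queries can run four times faster or even more. Queries that touch a large amount of data but return only a few rows to the user will typically benefit most. This chapter explains some details of how parallel query works and in which situations it can be used, so that users who wish to make use of it can understand what to expect.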
15.4.1. Parallel Labeling for Functions and Aggregates
The planner classifies operations involved in a query as either parallel safe, parallel restricted, or parallel unsafe. A parallel safe operation is one which does not conflict with the use of parallel query. A parallel restricted operation is one which cannot be performed in a parallel worker, but which can be performed in the leader while parallel query is in use. Therefore, parallel restricted operations can never occur below a Gather node, but can occur elsewhere in a plan which contains a Gather node. A parallel unsafe operation is one which cannot be performed while parallel query is in use, not even in the leader. When a query contains anything which is parallel unsafe, parallel query is completely disabled for that query.
The following operations are always parallel restricted.
Scans of common table expressions (CTEs).
Scans of temporary tables.
Scans of foreign tables, unless the foreign data wrapper has an IsForeignScanParallelSafe API which indicates otherwise.
Access to an InitPlan or SubPlan.
The planner cannot automatically determine whether a user-defined function or aggregate is parallel safe, parallel restricted, or parallel unsafe, because this would require predicting every operation which the function could possibly perform. In general, this is equivalent to the Halting Problem and therefore impossible. Even for simple functions where it could conceivably be done, we do not try, since this would be expensive and error-prone. Instead, all user-defined functions are assumed to be parallel unsafe unless otherwise marked. When using CREATE FUNCTION or ALTER FUNCTION, markings can be set by specifying PARALLEL SAFE, PARALLEL RESTRICTED, or PARALLEL UNSAFE as appropriate. When using CREATE AGGREGATE, the PARALLEL option can be specified with SAFE, RESTRICTED, or UNSAFE as the corresponding value.
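The markings described above can be sketched as follows. The function add_one is hypothetical and exists only for illustration.

```sql
-- add_one is a hypothetical function used only for illustration.
-- It reads nothing and changes nothing, so PARALLEL SAFE is appropriate.
CREATE FUNCTION add_one(integer) RETURNS integer
    AS 'SELECT $1 + 1'
    LANGUAGE SQL
    IMMUTABLE
    PARALLEL SAFE;

-- The marking can be changed later without redefining the function.
ALTER FUNCTION add_one(integer) PARALLEL RESTRICTED;
```

Without an explicit PARALLEL clause, the function would be treated as PARALLEL UNSAFE.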
Functions and aggregates must be marked PARALLEL UNSAFE if they write to the database, access sequences, change the transaction state even temporarily (e.g. a PL/pgSQL function which establishes an EXCEPTION block to catch errors), or make persistent changes to settings. Similarly, functions must be marked PARALLEL RESTRICTED if they access temporary tables, client connection state, cursors, prepared statements, or miscellaneous backend-local state which the system cannot synchronize across workers. For example, setseed and random are parallel restricted for this last reason.
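The marking currently recorded for a function can be read from the pg_proc system catalog. A minimal sketch, using the two functions named above:

```sql
-- proparallel is 's' (safe), 'r' (restricted), or 'u' (unsafe).
SELECT proname, proparallel
FROM pg_proc
WHERE proname IN ('setseed', 'random');
```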
In general, if a function is labeled as being safe when it is restricted or unsafe, or if it is labeled as being restricted when it is in fact unsafe, it may throw errors or produce wrong answers when used in a parallel query. C-language functions could in theory exhibit totally undefined behavior if mislabeled, since there is no way for the system to protect itself against arbitrary C code, but in most likely cases the result will be no worse than for any other function. If in doubt, it is probably best to label functions as UNSAFE.
If a function executed within a parallel worker acquires locks which are not held by the leader, for example by querying a table not referenced in the query, those locks will be released at worker exit, not end of transaction. If you write a function which does this, and this behavior difference is important to you, mark such functions as PARALLEL RESTRICTED to ensure that they execute only in the leader.
Note that the query planner does not consider deferring the evaluation of parallel-restricted functions or aggregates involved in the query in order to obtain a superior plan. So, for example, if a WHERE clause applied to a particular table is parallel restricted, the query planner will not consider placing the scan of that table below a Gather node. In some cases, it would be possible (and perhaps even efficient) to include the scan of that table in the parallel portion of the query and defer the evaluation of the WHERE clause so that it happens above the Gather node. However, the planner does not do this.
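When the optimizer determines that parallel query is the fastest execution strategy for a particular query, it will create a query plan which includes a Gather or Gather Merge node. Here is a simple example: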
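A query of the following form illustrates this. It is a sketch that assumes the pgbench_accounts table created by pgbench; the plan shape shown in the comments is what would be expected if the planner picks a parallel sequential scan, and actual output depends on table size and settings.

```sql
-- pgbench_accounts is assumed to exist (pgbench creates it) and to be
-- large enough for the planner to prefer a parallel plan.
EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';

-- Plan shape when a parallel sequential scan is chosen (cost and row
-- estimates omitted because they vary by installation; <n> is the
-- number of workers the planner decided on):
--   Gather
--     Workers Planned: <n>
--     ->  Parallel Seq Scan on pgbench_accounts
--           Filter: (filler ~~ '%x%'::text)
```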
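In all cases, the Gather or Gather Merge node will have exactly one child plan, which is the portion of the plan that will be executed in parallel. If the Gather or Gather Merge node is at the very top of the plan tree, then the entire query will execute in parallel. If it is somewhere else in the plan tree, then only the portion of the plan below it will run in parallel. In the example above, the query accesses only one table, so there is only one plan node other than the Gather node itself; since that plan node is a child of the Gather node, it will run in parallel.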
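Using EXPLAIN, you can see the number of workers chosen by the planner. When the Gather node is reached during query execution, the process which is implementing the user's session will request a number of background worker processes equal to the number of workers chosen by the planner. The number of background workers that the planner will consider using is limited to at most max_parallel_workers_per_gather. The total number of background workers that can exist at any one time is limited by both max_worker_processes and max_parallel_workers. Therefore, it is possible for a parallel query to run with fewer workers than planned, or even with no workers at all. The optimal plan may depend on the number of workers that are available, so this can result in poor query performance. If this occurs frequently, consider increasing max_worker_processes and max_parallel_workers so that more workers can be run simultaneously, or alternatively reducing max_parallel_workers_per_gather so that the planner requests fewer workers.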
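The three limits mentioned above can be viewed together through the pg_settings view. A minimal sketch:

```sql
-- Show the three limits that together determine how many workers a
-- parallel query can request and obtain.
SELECT name, setting, context
FROM pg_settings
WHERE name IN ('max_parallel_workers_per_gather',
               'max_parallel_workers',
               'max_worker_processes');
```

The context column indicates how each setting can be changed; a value of postmaster means the setting takes effect only after a server restart.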
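Every background worker process which is successfully started for a given parallel query will execute the parallel portion of the plan. The leader will also execute that portion of the plan, but it has an additional responsibility: it must also read all of the tuples generated by the workers. When the parallel portion of the plan generates only a small number of tuples, the leader will often behave very much like an additional worker, speeding up query execution. Conversely, when the parallel portion of the plan generates a large number of tuples, the leader may be almost entirely occupied with reading the tuples generated by the workers and performing any further processing steps which are required by plan nodes above the level of the Gather node or Gather Merge node. In such cases, the leader will do very little of the work of executing the parallel portion of the plan.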
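When the node at the top of the parallel portion of the plan is Gather Merge rather than Gather, it indicates that each process executing the parallel portion of the plan is producing tuples in sorted order, and that the leader is performing an order-preserving merge. In contrast, Gather reads tuples from the workers in whatever order is convenient, destroying any sort order that may have existed.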
Because each worker executes the parallel portion of the plan to completion, it is not possible to simply take an ordinary query plan and run it using multiple workers. Each worker would produce a full copy of the output result set, so the query would not run any faster than normal but would produce incorrect results. Instead, the parallel portion of the plan must be what is known internally to the query optimizer as a partial plan; that is, it must be constructed so that each process which executes the plan will generate only a subset of the output rows in such a way that each required output row is guaranteed to be generated by exactly one of the cooperating processes.
The following types of parallel-aware table scans are currently supported.
In a parallel sequential scan, the table's blocks will be divided among the cooperating processes. Blocks are handed out one at a time, so that access to the table remains sequential.
In a parallel bitmap heap scan, one process is chosen as the leader. That process performs a scan of one or more indexes and builds a bitmap indicating which table blocks need to be visited. These blocks are then divided among the cooperating processes as in a parallel sequential scan. In other words, the heap scan is performed in parallel, but the underlying index scan is not.
In a parallel index scan or parallel index-only scan, the cooperating processes take turns reading data from the index. Currently, parallel index scans are supported only for btree indexes. Each process will claim a single index block and will scan and return all tuples referenced by that block; other processes can at the same time be returning tuples from a different index block. The results of a parallel btree scan are returned in sorted order within each worker process.
Only the scan types listed above may be used for a scan on the driving table within a parallel plan. Other scan types, such as parallel scans of non-btree indexes, may be supported in the future.
Just as in a non-parallel plan, the driving table may be joined to one or more other tables using a nested loop, hash join, or merge join. The inner side of the join may be any kind of non-parallel plan that is otherwise supported by the planner provided that it is safe to run within a parallel worker. For example, if a nested loop join is chosen, the inner plan may be an index scan which looks up a value taken from the outer side of the join.
Each worker will execute the inner side of the join in full. This is typically not a problem for nested loops, but may be inefficient for cases involving hash or merge joins. For example, for a hash join, this restriction means that an identical hash table is built in each worker process, which works fine for joins against small tables but may not be efficient when the inner table is large. For a merge join, it might mean that each worker performs a separate sort of the inner relation, which could be slow. Of course, in cases where a parallel plan of this type would be inefficient, the query planner will normally choose some other plan (possibly one which does not use parallelism) instead.
PostgreSQL supports parallel aggregation by aggregating in two stages. First, each process participating in the parallel portion of the query performs an aggregation step, producing a partial result for each group of which that process is aware. This is reflected in the plan as a Partial Aggregate node. Second, the partial results are transferred to the leader via the Gather node. Finally, the leader re-aggregates the results across all workers in order to produce the final result. This is reflected in the plan as a Finalize Aggregate node.
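The two stages are visible in EXPLAIN output. The sketch below assumes the pgbench_accounts table created by pgbench; the plan in the comments is one possible shape, shown without cost figures, and the planner may pick a different scan type or a serial plan.

```sql
EXPLAIN SELECT count(*) FROM pgbench_accounts;

-- One possible plan shape when parallel aggregation is chosen
-- (<n> is the number of workers the planner decided on):
--   Finalize Aggregate
--     ->  Gather
--           Workers Planned: <n>
--           ->  Partial Aggregate
--                 ->  Parallel Seq Scan on pgbench_accounts
```

Everything below the Gather node runs in each participating process; only the Finalize Aggregate step runs in the leader alone.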
Because the Finalize Aggregate node runs on the leader process, queries which produce a relatively large number of groups in comparison to the number of input rows will appear less favorable to the query planner. For example, in the worst-case scenario the number of groups seen by the Finalize Aggregate node could be as many as the number of input rows which were seen by all worker processes in the Partial Aggregate stage. For such cases, there is clearly going to be no performance benefit to using parallel aggregation. The query planner takes this into account during the planning process and is unlikely to choose parallel aggregate in this scenario.
Parallel aggregation is not supported in all situations. Each aggregate must be safe for parallelism and must have a combine function. If the aggregate has a transition state of type internal, it must have serialization and deserialization functions. See CREATE AGGREGATE for more details. Parallel aggregation is not supported if any aggregate function call contains a DISTINCT or ORDER BY clause, and is also not supported for ordered set aggregates or when the query involves GROUPING SETS. It can only be used when all joins involved in the query are also part of the parallel portion of the plan.
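As a minimal sketch of an aggregate that meets these requirements, the following defines a hypothetical aggregate my_sum over bigint. The name is illustrative only; int8pl is the built-in function that implements bigint addition.

```sql
-- my_sum is a hypothetical aggregate used only for illustration.
-- int8pl serves as both the transition function and the combine
-- function, because merging two partial sums is itself an addition.
-- The transition state is a plain bigint rather than type internal,
-- so no serialization or deserialization functions are needed.
CREATE AGGREGATE my_sum(bigint) (
    SFUNC = int8pl,
    STYPE = bigint,
    COMBINEFUNC = int8pl,
    PARALLEL = SAFE
);
```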
If a query that is expected to do so does not produce a parallel plan, you can try reducing parallel_setup_cost or parallel_tuple_cost. Of course, this plan may turn out to be slower than the serial plan which the planner preferred, but this will not always be the case. If you don't get a parallel plan even with very small values of these settings (e.g. after setting them both to zero), there may be some reason why the query planner is unable to generate a parallel plan for your query. See Section 15.2 and Section 15.4 for information on why this may be the case.
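This experiment can be carried out within a single session. The sketch below assumes the pgbench_accounts table created by pgbench; substitute the query under investigation.

```sql
-- Make parallel plans look as cheap as possible to the planner.
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;

EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';

-- Restore the defaults for the rest of the session.
RESET parallel_setup_cost;
RESET parallel_tuple_cost;
```

If the plan still contains no Gather or Gather Merge node with both costs at zero, the cause is more likely one of the restrictions described earlier than the cost estimates.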
When executing a parallel plan, you can use EXPLAIN (ANALYZE, VERBOSE) to display per-worker statistics for each plan node. This may be useful in determining whether the work is being evenly distributed between all plan nodes and more generally in understanding the performance characteristics of the plan.