F.43. tablefunc
tablefunc 模組內含了回傳資料表(即多筆資料列)的各種函數。這些函數本身很有用,也可以用作設計回傳多筆資料列的 C 函數的範例。
F.43.1. Functions Provided
Table F.30 列出了 tablefunc 模組所提供的函數。
Table F.30. tablefunc
Functions
tablefunc
FunctionsF.43.1.1. Normal_rand
normal_rand
produces a set of normally distributed random values (Gaussian distribution).
numvals
is the number of values to be returned from the function. mean
is the mean of the normal distribution of values and stddev
is the standard deviation of the normal distribution of values.
For example, this call requests 1000 values with a mean of 5 and a standard deviation of 3:
F.43.1.2. Crosstab(Text)
The crosstab
function is used to produce “pivot” displays, wherein data is listed across the page rather than down. For example, we might have data like
which we wish to display like
The crosstab
function takes a text parameter that is a SQL query producing raw data formatted in the first way, and produces a table formatted in the second way.
The sql
parameter is a SQL statement that produces the source set of data. This statement must return one row_name
column, one category
column, and one value
column. N
is an obsolete parameter, ignored if supplied (formerly this had to match the number of output value columns, but now that is determined by the calling query).
For example, the provided query might produce a set something like:
The crosstab
function is declared to return setof record
, so the actual names and types of the output columns must be defined in the FROM
clause of the calling SELECT
statement, for example:
This example produces a set something like:
The FROM
clause must define the output as one row_name
column (of the same data type as the first result column of the SQL query) followed by N value
columns (all of the same data type as the third result column of the SQL query). You can set up as many output value columns as you wish. The names of the output columns are up to you.
The crosstab
function produces one output row for each consecutive group of input rows with the same row_name
value. It fills the output value
columns, left to right, with the value
fields from these rows. If there are fewer rows in a group than there are output value
columns, the extra output columns are filled with nulls; if there are more rows, the extra input rows are skipped.
In practice the SQL query should always specify ORDER BY 1,2
to ensure that the input rows are properly ordered, that is, values with the same row_name
are brought together and correctly ordered within the row. Notice that crosstab
itself does not pay any attention to the second column of the query result; it's just there to be ordered by, to control the order in which the third-column values appear across the page.
Here is a complete example:
You can avoid always having to write out a FROM
clause to define the output columns, by setting up a custom crosstab function that has the desired output row type wired into its definition. This is described in the next section. Another possibility is to embed the required FROM
clause in a view definition.
另請參閱 psql 中的 \crosstabview 指令,該指令提供的功能類似於 crosstab()。
F.43.1.3. CrosstabN(Text)
The crosstab
N
functions are examples of how to set up custom wrappers for the general crosstab
function, so that you need not write out column names and types in the calling SELECT
query. The tablefunc
module includes crosstab2
, crosstab3
, and crosstab4
, whose output row types are defined as
Thus, these functions can be used directly when the input query produces row_name
and value
columns of type text
, and you want 2, 3, or 4 output values columns. In all other ways they behave exactly as described above for the general crosstab
function.
For instance, the example given in the previous section would also work as
These functions are provided mostly for illustration purposes. You can create your own return types and functions based on the underlying crosstab()
function. There are two ways to do it:
Create a composite type describing the desired output columns, similar to the examples in
contrib/tablefunc/tablefunc--1.0.sql
. Then define a unique function name accepting onetext
parameter and returningsetof your_type_name
, but linking to the same underlyingcrosstab
C function. For example, if your source data produces row names that aretext
, and values that arefloat8
, and you want 5 value columns:Use
OUT
parameters to define the return type implicitly. The same example could also be done this way:
F.43.1.4. Crosstab(Text, Text)
The main limitation of the single-parameter form of crosstab
is that it treats all values in a group alike, inserting each value into the first available column. If you want the value columns to correspond to specific categories of data, and some groups might not have data for some of the categories, that doesn't work well. The two-parameter form of crosstab
handles this case by providing an explicit list of the categories corresponding to the output columns.
source_sql
is a SQL statement that produces the source set of data. This statement must return one row_name
column, one category
column, and one value
column. It may also have one or more “extra” columns. The row_name
column must be first. The category
and value
columns must be the last two columns, in that order. Any columns between row_name
and category
are treated as “extra”. The “extra” columns are expected to be the same for all rows with the same row_name
value.
For example, source_sql
might produce a set something like:
category_sql
is a SQL statement that produces the set of categories. This statement must return only one column. It must produce at least one row, or an error will be generated. Also, it must not produce duplicate values, or an error will be generated. category_sql
might be something like:
The crosstab
function is declared to return setof record
, so the actual names and types of the output columns must be defined in the FROM
clause of the calling SELECT
statement, for example:
This will produce a result something like:
The FROM
clause must define the proper number of output columns of the proper data types. If there are N
columns in the source_sql
query's result, the first N
-2 of them must match up with the first N
-2 output columns. The remaining output columns must have the type of the last column of the source_sql
query's result, and there must be exactly as many of them as there are rows in the category_sql
query's result.
The crosstab
function produces one output row for each consecutive group of input rows with the same row_name
value. The output row_name
column, plus any “extra” columns, are copied from the first row of the group. The output value
columns are filled with the value
fields from rows having matching category
values. If a row's category
does not match any output of the category_sql
query, its value
is ignored. Output columns whose matching category is not present in any input row of the group are filled with nulls.
In practice the source_sql
query should always specify ORDER BY 1
to ensure that values with the same row_name
are brought together. However, ordering of the categories within a group is not important. Also, it is essential to be sure that the order of the category_sql
query's output matches the specified output column order.
Here are two complete examples:
You can create predefined functions to avoid having to write out the result column names and types in each query. See the examples in the previous section. The underlying C function for this form of crosstab
is named crosstab_hash
.
F.43.1.5. Connectby
The connectby
function produces a display of hierarchical data that is stored in a table. The table must have a key field that uniquely identifies rows, and a parent-key field that references the parent (if any) of each row. connectby
can display the sub-tree descending from any row.
Table F.31 explains the parameters.
Table F.31. connectby
Parameters
connectby
ParametersThe key and parent-key fields can be any data type, but they must be the same type. Note that the start_with
value must be entered as a text string, regardless of the type of the key field.
The connectby
function is declared to return setof record
, so the actual names and types of the output columns must be defined in the FROM
clause of the calling SELECT
statement, for example:
The first two output columns are used for the current row's key and its parent row's key; they must match the type of the table's key field. The third output column is the depth in the tree and must be of type integer
. If a branch_delim
parameter was given, the next output column is the branch display and must be of type text
. Finally, if an orderby_fld
parameter was given, the last output column is a serial number, and must be of type integer
.
The “branch” output column shows the path of keys taken to reach the current row. The keys are separated by the specified branch_delim
string. If no branch display is wanted, omit both the branch_delim
parameter and the branch column in the output column list.
If the ordering of siblings of the same parent is important, include the orderby_fld
parameter to specify which field to order siblings by. This field can be of any sortable data type. The output column list must include a final integer serial-number column, if and only if orderby_fld
is specified.
The parameters representing table and field names are copied as-is into the SQL queries that connectby
generates internally. Therefore, include double quotes if the names are mixed-case or contain special characters. You may also need to schema-qualify the table name.
In large tables, performance will be poor unless there is an index on the parent-key field.
It is important that the branch_delim
string not appear in any key values, else connectby
may incorrectly report an infinite-recursion error. Note that if branch_delim
is not provided, a default value of ~
is used for recursion detection purposes.
Here is an example:
F.43.2. Author
Joe Conway