Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Functions in PL/Python are declared via the standard CREATE FUNCTION syntax:
The body of a function is simply a Python script. When the function is called, its arguments are passed as elements of the list args
; named arguments are also passed as ordinary variables to the Python script. Use of named arguments is usually more readable. The result is returned from the Python code in the usual way, with return
or yield
(in case of a result-set statement). If you do not provide a return value, Python returns the default None
. PL/Python translates Python's None
into the SQL null value. In a procedure, the result from the Python code must be None
(typically achieved by ending the procedure without a return
statement or by using a return
statement without argument); otherwise, an error will be raised.
For example, a function to return the greater of two integers can be defined as:
The Python code that is given as the body of the function definition is transformed into a Python function. For example, the above results in:
assuming that 23456 is the OID assigned to the function by PostgreSQL.
The arguments are set as global variables. Because of the scoping rules of Python, this has the subtle consequence that an argument variable cannot be reassigned inside the function to the value of an expression that involves the variable name itself, unless the variable is redeclared as global in the block. For example, the following won't work:
because assigning to x
makes x
a local variable for the entire block, and so the x
on the right-hand side of the assignment refers to a not-yet-assigned local variable x
, not the PL/Python function parameter. Using the global
statement, this can be made to work:
But it is advisable not to rely on this implementation detail of PL/Python. It is better to treat the function parameters as read-only.
全域字典變數 SD 可用於在重複呼叫同一函數之間所儲存的私有資料。全域字典變數 GD 是公用資料,可用於同一個資料庫連線中的所有 Python 函數; 請小心使用。
每個函數在 Python 直譯器中都有其自己的執行環境,因此 myfunc2 不能使用 myfunc 的全域變數和函數參數。如上所述,其例外是 GD 字典變數中的資料。
When a function is used as a trigger, the dictionary TD
contains trigger-related values:TD["event"]
contains the event as a string: INSERT
, UPDATE
, DELETE
, or TRUNCATE
.TD["when"]
contains one of BEFORE
, AFTER
, or INSTEAD OF
.TD["level"]
contains ROW
or STATEMENT
.TD["new"]
TD["old"]
For a row-level trigger, one or both of these fields contain the respective trigger rows, depending on the trigger event.TD["name"]
contains the trigger name.TD["table_name"]
contains the name of the table on which the trigger occurred.TD["table_schema"]
contains the schema of the table on which the trigger occurred.TD["relid"]
contains the OID of the table on which the trigger occurred.TD["args"]
If the CREATE TRIGGER
command included arguments, they are available in TD["args"][0]
to TD["args"][
n
-1].
If TD["when"]
is BEFORE
or INSTEAD OF
and TD["level"]
is ROW
, you can return None
or "OK"
from the Python function to indicate the row is unmodified, "SKIP"
to abort the event, or if TD["event"]
is INSERT
or UPDATE
you can return "MODIFY"
to indicate you've modified the new row. Otherwise the return value is ignored.
Recovering from errors caused by database access as described in Section 45.7.2 can lead to an undesirable situation where some operations succeed before one of them fails, and after recovering from that error the data is left in an inconsistent state. PL/Python offers a solution to this problem in the form of explicit subtransactions.
Consider a function that implements a transfer between two accounts:
If the second UPDATE
statement results in an exception being raised, this function will report the error, but the result of the first UPDATE
will nevertheless be committed. In other words, the funds will be withdrawn from Joe's account, but will not be transferred to Mary's account.
To avoid such issues, you can wrap your plpy.execute
calls in an explicit subtransaction. The plpy
module provides a helper object to manage explicit subtransactions that gets created with the plpy.subtransaction()
function. Objects created by this function implement the context manager interface. Using explicit subtransactions we can rewrite our function as:
Note that the use of try/catch
is still required. Otherwise the exception would propagate to the top of the Python stack and would cause the whole function to abort with a PostgreSQL error, so that the operations
table would not have any row inserted into it. The subtransaction context manager does not trap errors, it only assures that all database operations executed inside its scope will be atomically committed or rolled back. A rollback of the subtransaction block occurs on any kind of exception exit, not only ones caused by errors originating from database access. A regular Python exception raised inside an explicit subtransaction block would also cause the subtransaction to be rolled back.
Context managers syntax using the with
keyword is available by default in Python 2.6. If using PL/Python with an older Python version, it is still possible to use explicit subtransactions, although not as transparently. You can call the subtransaction manager's __enter__
and __exit__
functions using the enter
and exit
convenience aliases. The example function that transfers funds could be written as:
Although context managers were implemented in Python 2.5, to use the with
syntax in that version you need to use a future statement. Because of implementation details, however, you cannot use future statements in PL/Python functions.
PL/Python 程序語言允許 PostgreSQL 函數以 Python 語言撰寫。
要將 PL/Python 安裝在特定的資料庫中,請使用 CREATE EXTENSION plpythonu
(另請參閱第 45.1 節)。
如果將語言安裝到 template1 中,則所有隨後建立的資料庫將自動安裝該語言。
PL/Python is only available as an “untrusted” language, meaning it does not offer any way of restricting what users can do in it and is therefore named plpythonu
. A trusted variant plpython
might become available in the future if a secure execution mechanism is developed in Python. The writer of a function in untrusted PL/Python must take care that the function cannot be used to do anything unwanted, since it will be able to do anything that could be done by a user logged in as the database administrator. Only superusers can create functions in untrusted languages such as plpythonu
.
自行編譯原始碼的使用者必須在安裝程序中特別啟用 PL/Python 的編譯。(更多相關資訊,請參閱安裝說明。)使用預先編譯版本的使用者可以在單獨的套件中找到 PL/Python。
Generally speaking, the aim of PL/Python is to provide a “natural” mapping between the PostgreSQL and the Python worlds. This informs the data mapping rules described below.
When a PL/Python function is called, its arguments are converted from their PostgreSQL data type to a corresponding Python type:
PostgreSQL boolean
is converted to Python bool
.
PostgreSQL smallint
and int
are converted to Python int
. PostgreSQL bigint
and oid
are converted to long
in Python 2 and to int
in Python 3.
PostgreSQL real
and double
are converted to Python float
.
PostgreSQL numeric
is converted to Python Decimal
. This type is imported from the cdecimal
package if that is available. Otherwise, decimal.Decimal
from the standard library will be used. cdecimal
is significantly faster than decimal
. In Python 3.3 and up, however, cdecimal
has been integrated into the standard library under the name decimal
, so there is no longer any difference.
PostgreSQL bytea
is converted to Python str
in Python 2 and to bytes
in Python 3. In Python 2, the string should be treated as a byte sequence without any character encoding.
All other data types, including the PostgreSQL character string types, are converted to a Python str
. In Python 2, this string will be in the PostgreSQL server encoding; in Python 3, it will be a Unicode string like all strings.
For nonscalar data types, see below.
When a PL/Python function returns, its return value is converted to the function's declared PostgreSQL return data type as follows:
When the PostgreSQL return type is boolean
, the return value will be evaluated for truth according to the Python rules. That is, 0 and empty string are false, but notably 'f'
is true.
When the PostgreSQL return type is bytea
, the return value will be converted to a string (Python 2) or bytes (Python 3) using the respective Python built-ins, with the result being converted to bytea
.
For all other PostgreSQL return types, the return value is converted to a string using the Python built-in str
, and the result is passed to the input function of the PostgreSQL data type. (If the Python value is a float
, it is converted using the repr
built-in instead of str
, to avoid loss of precision.)
Strings in Python 2 are required to be in the PostgreSQL server encoding when they are passed to PostgreSQL. Strings that are not valid in the current server encoding will raise an error, but not all encoding mismatches can be detected, so garbage data can still result when this is not done correctly. Unicode strings are converted to the correct encoding automatically, so it can be safer and more convenient to use those. In Python 3, all strings are Unicode strings.
For nonscalar data types, see below.
Note that logical mismatches between the declared PostgreSQL return type and the Python data type of the actual return object are not flagged; the value will be converted in any case.
If an SQL null value is passed to a function, the argument value will appear as None
in Python. For example, the function definition of pymax
shown in Section 45.2 will return the wrong answer for null inputs. We could add STRICT
to the function definition to make PostgreSQL do something more reasonable: if a null value is passed, the function will not be called at all, but will just return a null result automatically. Alternatively, we could check for null inputs in the function body:
As shown above, to return an SQL null value from a PL/Python function, return the value None
. This can be done whether the function is strict or not.
SQL array values are passed into PL/Python as a Python list. To return an SQL array value out of a PL/Python function, return a Python list:
Multidimensional arrays are passed into PL/Python as nested Python lists. A 2-dimensional array is a list of lists, for example. When returning a multi-dimensional SQL array out of a PL/Python function, the inner lists at each level must all be of the same size. For example:
Other Python sequences, like tuples, are also accepted for backwards-compatibility with PostgreSQL versions 9.6 and below, when multi-dimensional arrays were not supported. However, they are always treated as one-dimensional arrays, because they are ambiguous with composite types. For the same reason, when a composite type is used in a multi-dimensional array, it must be represented by a tuple, rather than a list.
Note that in Python, strings are sequences, which can have undesirable effects that might be familiar to Python programmers:
Composite-type arguments are passed to the function as Python mappings. The element names of the mapping are the attribute names of the composite type. If an attribute in the passed row has the null value, it has the value None
in the mapping. Here is an example:
There are multiple ways to return row or composite types from a Python function. The following examples assume we have:
A composite result can be returned as a:Sequence type (a tuple or list, but not a set because it is not indexable)
Returned sequence objects must have the same number of items as the composite result type has fields. The item with index 0 is assigned to the first field of the composite type, 1 to the second and so on. For example:
To return a SQL null for any column, insert None
at the corresponding position.
When an array of composite types is returned, it cannot be returned as a list, because it is ambiguous whether the Python list represents a composite type, or another array dimension.Mapping (dictionary)
The value for each result type column is retrieved from the mapping with the column name as key. Example:
Any extra dictionary key/value pairs are ignored. Missing keys are treated as errors. To return a SQL null value for any column, insert None
with the corresponding column name as the key.Object (any object providing method __getattr__
)
This works the same as a mapping. Example:
Functions with OUT
parameters are also supported. For example:
Output parameters of procedures are passed back the same way. For example:
A PL/Python function can also return sets of scalar or composite types. There are several ways to achieve this because the returned object is internally turned into an iterator. The following examples assume we have composite type:
A set result can be returned from a:Sequence type (tuple, list, set)
Iterator (any object providing __iter__
and next
methods)
Generator (yield
)
Set-returning functions with OUT
parameters (using RETURNS SETOF record
) are also supported. For example:
In a procedure called from the top level or an anonymous code block (DO
command) called from the top level it is possible to control transactions. To commit the current transaction, call plpy.commit()
. To roll back the current transaction, call plpy.rollback()
. (Note that it is not possible to run the SQL commands COMMIT
or ROLLBACK
via plpy.execute
or similar. It has to be done using these functions.) After a transaction is ended, a new transaction is automatically started, so there is no separate function for that.
Here is an example:
Transactions cannot be ended when an explicit subtransaction is active.
PL/Python supports both the Python 2 and Python 3 language variants. (The PostgreSQL installation instructions might contain more precise information about the exact supported minor versions of Python.) Because the Python 2 and Python 3 language variants are incompatible in some important aspects, the following naming and transitioning scheme is used by PL/Python to avoid mixing them:
The PostgreSQL language named plpython2u
implements PL/Python based on the Python 2 language variant.
The PostgreSQL language named plpython3u
implements PL/Python based on the Python 3 language variant.
The language named plpythonu
implements PL/Python based on the default Python language variant, which is currently Python 2. (This default is independent of what any local Python installations might consider to be their “default”, for example, what /usr/bin/python
might be.) The default will probably be changed to Python 3 in a distant future release of PostgreSQL, depending on the progress of the migration to Python 3 in the Python community.
This scheme is analogous to the recommendations in PEP 394 regarding the naming and transitioning of the python
command.
It depends on the build configuration or the installed packages whether PL/Python for Python 2 or Python 3 or both are available.
The built variant depends on which Python version was found during the installation or which version was explicitly set using the PYTHON
environment variable; see Section 16.4. To make both variants of PL/Python available in one installation, the source tree has to be configured and built twice.
This results in the following usage and migration strategy:
Existing users and users who are currently not interested in Python 3 use the language name plpythonu
and don't have to change anything for the foreseeable future. It is recommended to gradually “future-proof” the code via migration to Python 2.6/2.7 to simplify the eventual migration to Python 3.
In practice, many PL/Python functions will migrate to Python 3 with few or no changes.
Users who know that they have heavily Python 2 dependent code and don't plan to ever change it can make use of the plpython2u
language name. This will continue to work into the very distant future, until Python 2 support might be completely dropped by PostgreSQL.
Users who want to dive into Python 3 can use the plpython3u
language name, which will keep working forever by today's standards. In the distant future, when Python 3 might become the default, they might like to remove the “3” for aesthetic reasons.
Daredevils, who want to build a Python-3-only operating system environment, can change the contents of pg_pltemplate
to make plpythonu
be equivalent to plpython3u
, keeping in mind that this would make their installation incompatible with most of the rest of the world.
See also the document What's New In Python 3.0 for more information about porting to Python 3.
It is not allowed to use PL/Python based on Python 2 and PL/Python based on Python 3 in the same session, because the symbols in the dynamic modules would clash, which could result in crashes of the PostgreSQL server process. There is a check that prevents mixing Python major versions in a session, which will abort the session if a mismatch is detected. It is possible, however, to use both PL/Python variants in the same database, from separate sessions.
The PL/Python language module automatically imports a Python module called plpy
. The functions and constants in this module are available to you in the Python code as plpy.
foo
.
The plpy
module provides several functions to execute database commands:plpy.execute
(query
[, max-rows
])
Calling plpy.execute
with a query string and an optional row limit argument causes that query to be run and the result to be returned in a result object.
The result object emulates a list or dictionary object. The result object can be accessed by row number and column name. For example:
returns up to 5 rows from my_table
. If my_table
has a column my_column
, it would be accessed as:
The number of rows returned can be obtained using the built-in len
function.
The result object has these additional methods:nrows
()
Returns the number of rows processed by the command. Note that this is not necessarily the same as the number of rows returned. For example, an UPDATE
command will set this value but won't return any rows (unless RETURNING
is used).status
()
The SPI_execute()
return value.colnames
()
coltypes
()
coltypmods
()
Return a list of column names, list of column type OIDs, and list of type-specific type modifiers for the columns, respectively.
These methods raise an exception when called on a result object from a command that did not produce a result set, e.g., UPDATE
without RETURNING
, or DROP TABLE
. But it is OK to use these methods on a result set containing zero rows.__str__
()
The standard __str__
method is defined so that it is possible for example to debug query execution results using plpy.debug(rv)
.
The result object can be modified.
Note that calling plpy.execute
will cause the entire result set to be read into memory. Only use that function when you are sure that the result set will be relatively small. If you don't want to risk excessive memory usage when fetching large results, use plpy.cursor
rather than plpy.execute
.plpy.prepare
(query
[, argtypes
])
plpy.execute
(plan
[, arguments
[, max-rows
]])
plpy.prepare
prepares the execution plan for a query. It is called with a query string and a list of parameter types, if you have parameter references in the query. For example:
text
is the type of the variable you will be passing for $1
. The second argument is optional if you don't want to pass any parameters to the query.
After preparing a statement, you use a variant of the function plpy.execute
to run it:
Pass the plan as the first argument (instead of the query string), and a list of values to substitute into the query as the second argument. The second argument is optional if the query does not expect any parameters. The third argument is the optional row limit as before.
Alternatively, you can call the execute
method on the plan object:
Query parameters and result row fields are converted between PostgreSQL and Python data types as described in Section 45.3.
When you prepare a plan using the PL/Python module it is automatically saved. Read the SPI documentation (Chapter 46) for a description of what this means. In order to make effective use of this across function calls one needs to use one of the persistent storage dictionaries SD
or GD
(see Section 45.4). For example:
plpy.cursor
(query
)
plpy.cursor
(plan
[, arguments
])
The plpy.cursor
function accepts the same arguments as plpy.execute
(except for the row limit) and returns a cursor object, which allows you to process large result sets in smaller chunks. As with plpy.execute
, either a query string or a plan object along with a list of arguments can be used, or the cursor
function can be called as a method of the plan object.
The cursor object provides a fetch
method that accepts an integer parameter and returns a result object. Each time you call fetch
, the returned object will contain the next batch of rows, never larger than the parameter value. Once all rows are exhausted, fetch
starts returning an empty result object. Cursor objects also provide an iterator interface, yielding one row at a time until all rows are exhausted. Data fetched that way is not returned as result objects, but rather as dictionaries, each dictionary corresponding to a single result row.
An example of two ways of processing data from a large table is:
Cursors are automatically disposed of. But if you want to explicitly release all resources held by a cursor, use the close
method. Once closed, a cursor cannot be fetched from anymore.
Do not confuse objects created by plpy.cursor
with DB-API cursors as defined by the Python Database API specification. They don't have anything in common except for the name.
Functions accessing the database might encounter errors, which will cause them to abort and raise an exception. Both plpy.execute
and plpy.prepare
can raise an instance of a subclass of plpy.SPIError
, which by default will terminate the function. This error can be handled just like any other Python exception, by using the try/except
construct. For example:
The actual class of the exception being raised corresponds to the specific condition that caused the error. Refer to Table A.1 for a list of possible conditions. The module plpy.spiexceptions
defines an exception class for each PostgreSQL condition, deriving their names from the condition name. For instance, division_by_zero
becomes DivisionByZero
, unique_violation
becomes UniqueViolation
, fdw_error
becomes FdwError
, and so on. Each of these exception classes inherits from SPIError
. This separation makes it easier to handle specific errors, for instance:
Note that because all exceptions from the plpy.spiexceptions
module inherit from SPIError
, an except
clause handling it will catch any database access error.
As an alternative way of handling different error conditions, you can catch the SPIError
exception and determine the specific error condition inside the except
block by looking at the sqlstate
attribute of the exception object. This attribute is a string value containing the “SQLSTATE” error code. This approach provides approximately the same functionality
The plpy
module also provides the functions
plpy.error
and plpy.fatal
actually raise a Python exception which, if uncaught, propagates out to the calling query, causing the current transaction or subtransaction to be aborted. raise plpy.Error(
msg
) and raise plpy.Fatal(
msg
) are equivalent to calling plpy.error(
msg
) and plpy.fatal(
msg
), respectively but the raise
form does not allow passing keyword arguments. The other functions only generate messages of different priority levels. Whether messages of a particular priority are reported to the client, written to the server log, or both is controlled by the log_min_messages and client_min_messages configuration variables. See Chapter 19 for more information.
The msg
argument is given as a positional argument. For backward compatibility, more than one positional argument can be given. In that case, the string representation of the tuple of positional arguments becomes the message reported to the client.
The following keyword-only arguments are accepted:
The string representation of the objects passed as keyword-only arguments is used to enrich the messages reported to the client. For example:
Another set of utility functions are plpy.quote_literal(
string
), plpy.quote_nullable(
string
), and plpy.quote_ident(
string
). They are equivalent to the built-in quoting functions described in Section 9.4. They are useful when constructing ad-hoc queries. A PL/Python equivalent of dynamic SQL from Example 42.1 would be:
plpy.debug(
msg, **kwargs
)
plpy.log(
msg, **kwargs
)
plpy.info(
msg, **kwargs
)
plpy.notice(
msg, **kwargs
)
plpy.warning(
msg, **kwargs
)
plpy.error(
msg, **kwargs
)
plpy.fatal(
msg, **kwargs
)
detail
hint
sqlstate
schema_name
table_name
column_name
datatype_name
constraint_name
Some of the environment variables that are accepted by the Python interpreter can also be used to affect PL/Python behavior. They would need to be set in the environment of the main PostgreSQL server process, for example in a start script. The available environment variables depend on the version of Python; see the Python documentation for details. At the time of this writing, the following environment variables have an affect on PL/Python, assuming an adequate Python version:
PYTHONHOME
PYTHONPATH
PYTHONY2K
PYTHONOPTIMIZE
PYTHONDEBUG
PYTHONVERBOSE
PYTHONCASEOK
PYTHONDONTWRITEBYTECODE
PYTHONIOENCODING
PYTHONUSERBASE
PYTHONHASHSEED
(It appears to be a Python implementation detail beyond the control of PL/Python that some of the environment variables listed on the python
man page are only effective in a command-line interpreter and not an embedded Python interpreter.)