47. Background Worker Processes
PostgreSQL can be extended to run user-supplied code in separate processes. Such processes are started, stopped and monitored by
postgres, which permits them to have a lifetime closely linked to the server's status. These processes have the option to attach to PostgreSQL's shared memory area and to connect to databases internally; they can also run multiple transactions serially, just like a regular client-connected server process. Also, by linking to libpq they can connect to the server and behave like a regular client application.
There are considerable robustness and security risks in using background worker processes because, being written in the C language, they have unrestricted access to data. Administrators wishing to enable modules that include background worker processes should exercise extreme caution. Only carefully audited modules should be permitted to run background worker processes.
Background workers can be initialized at the time that PostgreSQL is started by including the module name in shared_preload_libraries. A module wishing to run a background worker can register it by calling RegisterBackgroundWorker(BackgroundWorker *worker) from its _PG_init() function. Background workers can also be started after the system is up and running by calling RegisterDynamicBackgroundWorker(BackgroundWorker *worker, BackgroundWorkerHandle **handle). Unlike RegisterBackgroundWorker, which can only be called from within the postmaster process, RegisterDynamicBackgroundWorker must be called from a regular backend or another background worker.
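The structure BackgroundWorker is defined thus:
    typedef void (*bgworker_main_type)(Datum main_arg);
    typedef struct BackgroundWorker
    {
        char        bgw_name[BGW_MAXLEN];
        char        bgw_type[BGW_MAXLEN];
        int         bgw_flags;
        BgWorkerStartTime bgw_start_time;
        int         bgw_restart_time;       /* in seconds, or BGW_NEVER_RESTART */
        char        bgw_library_name[BGW_MAXLEN];
        char        bgw_function_name[BGW_MAXLEN];
        Datum       bgw_main_arg;
        char        bgw_extra[BGW_EXTRALEN];
        pid_t       bgw_notify_pid;
    } BackgroundWorker;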
bgw_name and bgw_type are strings to be used in log messages, process listings and similar contexts. bgw_type should be the same for all background workers of the same type, so that it is possible to group such workers in a process listing, for example. bgw_name, on the other hand, can contain additional information about the specific process. (Typically, the string for bgw_name will contain the type somehow, but that is not strictly required.)
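As an illustration of that naming convention, the following minimal sketch fills in a shared type string and a per-process name. It is written to compile on its own, so BGW_MAXLEN is redefined here with an assumed value and WorkerNames is a hypothetical stand-in for the two string fields of BackgroundWorker; the helper set_worker_names and the "slot" suffix are likewise invented for the example.

```c
#include <stdio.h>

/* Stand-in for the limit from postmaster/bgworker.h (assumed value). */
#define BGW_MAXLEN 96

/* Hypothetical stand-in holding only the two string fields discussed above. */
typedef struct WorkerNames
{
    char bgw_name[BGW_MAXLEN];
    char bgw_type[BGW_MAXLEN];
} WorkerNames;

/*
 * Give every worker of one kind the same bgw_type, and put the
 * per-process detail (here a slot number) into bgw_name only.
 * snprintf truncates instead of overflowing if a string is too long.
 */
static void
set_worker_names(WorkerNames *w, const char *type, int slot)
{
    snprintf(w->bgw_type, BGW_MAXLEN, "%s", type);
    snprintf(w->bgw_name, BGW_MAXLEN, "%s slot %d", type, slot);
}
```

With this scheme two workers of the same kind share one bgw_type, so they can be grouped in a process listing, while their bgw_name values remain distinct.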
bgw_flags is a bitwise-or'd bit mask indicating the capabilities that the module wants. Possible values are:
BGWORKER_SHMEM_ACCESS
    Requests shared memory access. Workers without shared memory access cannot access any of PostgreSQL's shared data structures, such as heavyweight or lightweight locks, shared buffers, or any custom data structures which the worker itself may wish to create and use.
BGWORKER_BACKEND_DATABASE_CONNECTION
    Requests the ability to establish a database connection through which it can later run transactions and queries. A background worker using BGWORKER_BACKEND_DATABASE_CONNECTION to connect to a database must also attach shared memory using BGWORKER_SHMEM_ACCESS, or worker start-up will fail.
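The dependency between the two flags can be expressed as a small check. This is only a sketch of the rule stated above, not the server's own validation code: the flag macros are redefined with stand-in values so that the example compiles without the PostgreSQL headers, and bgw_flags_are_valid is a hypothetical helper name.

```c
#include <stdbool.h>

/* Stand-in values; a real module takes these from postmaster/bgworker.h. */
#define BGWORKER_SHMEM_ACCESS                 0x0001
#define BGWORKER_BACKEND_DATABASE_CONNECTION  0x0002

/*
 * Return true if the flag combination is acceptable under the rule
 * described above: a database connection also requires shared memory
 * access.  Any other combination of these two flags is allowed.
 */
static bool
bgw_flags_are_valid(int flags)
{
    if ((flags & BGWORKER_BACKEND_DATABASE_CONNECTION) &&
        !(flags & BGWORKER_SHMEM_ACCESS))
        return false;
    return true;
}
```

A worker that wants to run queries would therefore set bgw_flags to BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION.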
bgw_start_time is the server state during which postgres should start the process; it can be one of BgWorkerStart_PostmasterStart (start as soon as postgres itself has finished its own initialization; processes requesting this are not eligible for database connections), BgWorkerStart_ConsistentState (start as soon as a consistent state has been reached in a hot standby, allowing processes to connect to databases and run read-only queries), and BgWorkerStart_RecoveryFinished (start as soon as the system has entered normal read-write state). Note that the last two values are equivalent in a server that is not a hot standby. Note also that this setting only indicates when the processes are to be started; they do not stop when a different state is reached.
bgw_restart_time is the interval, in seconds, that postgres should wait before restarting the process, in case it crashes. It can be any positive value, or BGW_NEVER_RESTART, indicating not to restart the process in case of a crash.
bgw_library_name is the name of a library in which the initial entry point for the background worker should be sought. The named library will be dynamically loaded by the worker process and bgw_function_name will be used to identify the function to be called. If loading a function from the core code, this must be set to "postgres".
bgw_function_name is the name of a function in a dynamically loaded library which should be used as the initial entry point for a new background worker.
bgw_main_arg is the Datum argument to the background worker main function. This main function should take a single argument of type Datum and return void. bgw_main_arg will be passed as the argument. In addition, the global variable MyBgworkerEntry points to a copy of the BackgroundWorker structure passed at registration time; the worker may find it helpful to examine this structure.
On Windows (and anywhere else where EXEC_BACKEND is defined) or in dynamic background workers it is not safe to pass a Datum by reference, only by value. If an argument is required, it is safest to pass an int32 or other small value and use that as an index into an array allocated in shared memory. If a pass-by-reference value such as a text is passed, the pointer will not be valid in the new background worker process.
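The index-passing approach can be sketched as follows. To keep the example compilable on its own, Datum and the two conversion macros are simplified stand-ins for the PostgreSQL definitions, and a static array plays the role of the array that a real module would allocate in shared memory. WorkerSlot, MAX_SLOTS, make_main_arg and worker_batch_size are hypothetical names chosen for the illustration.

```c
#include <stdint.h>

/* Simplified stand-ins; PostgreSQL defines Datum and these macros itself. */
typedef uintptr_t Datum;
#define Int32GetDatum(x) ((Datum) (x))
#define DatumGetInt32(d) ((int32_t) (d))

#define MAX_SLOTS 4

/* Hypothetical per-worker parameters. */
typedef struct WorkerSlot
{
    int32_t batch_size;
} WorkerSlot;

/* Plays the role of an array allocated in shared memory. */
static WorkerSlot worker_slots[MAX_SLOTS];

/*
 * Registering side: store the parameters in a slot and return only the
 * slot index, by value, for use as bgw_main_arg.  The caller is assumed
 * to pass a slot number in the range 0 .. MAX_SLOTS - 1.
 */
static Datum
make_main_arg(int32_t slot, int32_t batch_size)
{
    worker_slots[slot].batch_size = batch_size;
    return Int32GetDatum(slot);
}

/*
 * Worker side: recover the index from the argument and look the
 * parameters up.  Returns -1 if the index is out of range.
 */
static int32_t
worker_batch_size(Datum main_arg)
{
    int32_t slot = DatumGetInt32(main_arg);

    if (slot < 0 || slot >= MAX_SLOTS)
        return -1;
    return worker_slots[slot].batch_size;
}
```

Because only a small integer crosses the process boundary, nothing depends on a pointer remaining valid in the newly started worker.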
bgw_extra can contain extra data to be passed to the background worker. Unlike bgw_main_arg, this data is not passed as an argument to the worker's main function, but it can be accessed via MyBgworkerEntry, as discussed above.
bgw_notify_pid is the PID of a PostgreSQL backend process to which the postmaster should send SIGUSR1 when the process is started or exits. It should be 0 for workers registered at postmaster startup time, or when the backend registering the worker does not wish to wait for the worker to start up. Otherwise, it should be initialized to MyProcPid.
Once running, the process can connect to a database by calling BackgroundWorkerInitializeConnection(char *dbname, char *username, uint32 flags) or BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid, uint32 flags). This allows the process to run transactions and queries using the SPI interface. If dbname is NULL or dboid is InvalidOid, the session is not connected to any particular database, but shared catalogs can be accessed. If username is NULL or useroid is InvalidOid, the process will run as the superuser created during initdb. If BGWORKER_BYPASS_ALLOWCONN is specified as flags, it is possible to bypass the restriction to connect to databases not allowing user connections. A background worker can only call one of these two functions, and only once. It is not possible to switch databases.
Signals are initially blocked when control reaches the background worker's main function, and must be unblocked by it; this is to allow the process to customize its signal handlers, if necessary. Signals can be unblocked in the new process by calling BackgroundWorkerUnblockSignals and blocked by calling BackgroundWorkerBlockSignals.
If bgw_restart_time for a background worker is configured as BGW_NEVER_RESTART, or if it exits with an exit code of 0 or is terminated by TerminateBackgroundWorker, it will be automatically unregistered by the postmaster on exit. Otherwise, it will be restarted after the time period configured via bgw_restart_time, or immediately if the postmaster reinitializes the cluster due to a backend failure. Workers that need to suspend execution only temporarily should use an interruptible sleep rather than exiting; this can be achieved by calling WaitLatch(). Make sure the WL_POSTMASTER_DEATH flag is set when calling that function, and check the return code so that the worker exits promptly in the emergency case that postgres itself has terminated.
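The unregistration rule above amounts to a three-way condition. The sketch below models only that decision; it is not the postmaster's code, BGW_NEVER_RESTART is redefined with a stand-in value so that the example compiles without the PostgreSQL headers, and worker_is_unregistered_on_exit is a hypothetical name. The separate case of an immediate restart after cluster reinitialization is not modeled.

```c
#include <stdbool.h>

/* Stand-in value; a real module takes this from postmaster/bgworker.h. */
#define BGW_NEVER_RESTART (-1)

/*
 * Model of the rule described above.  A worker is unregistered when it
 * exits if any of the following holds:
 *   - it was registered with BGW_NEVER_RESTART,
 *   - it exited with exit code 0,
 *   - it was stopped through TerminateBackgroundWorker.
 * Otherwise it remains registered and is restarted later.
 */
static bool
worker_is_unregistered_on_exit(int bgw_restart_time, int exit_code,
                               bool terminated)
{
    return bgw_restart_time == BGW_NEVER_RESTART ||
           exit_code == 0 ||
           terminated;
}
```

In practice this means that a worker which wants to be restarted after a failure should exit with a nonzero code and have a positive bgw_restart_time.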
When a background worker is registered using the RegisterDynamicBackgroundWorker function, it is possible for the backend performing the registration to obtain information regarding the status of the worker. Backends wishing to do this should pass the address of a BackgroundWorkerHandle * as the second argument to RegisterDynamicBackgroundWorker. If the worker is successfully registered, this pointer will be initialized with an opaque handle that can subsequently be passed to GetBackgroundWorkerPid(BackgroundWorkerHandle *, pid_t *) or TerminateBackgroundWorker(BackgroundWorkerHandle *). GetBackgroundWorkerPid can be used to poll the status of the worker: a return value of BGWH_NOT_YET_STARTED indicates that the worker has not yet been started by the postmaster; BGWH_STOPPED indicates that it has been started but is no longer running; and BGWH_STARTED indicates that it is currently running. In this last case, the PID will also be returned via the second argument. TerminateBackgroundWorker causes the postmaster to send SIGTERM to the worker if it is running, and to unregister it as soon as it is not.
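A caller that polls GetBackgroundWorkerPid typically branches on the returned status. The following sketch shows one way to do that; the enum is a stand-in for the BgwHandleStatus type from postmaster/bgworker.h so that the example compiles on its own, describe_worker_status is a hypothetical helper, and the PID is passed as a plain int for simplicity. Only in the BGWH_STARTED case is the PID meaningful.

```c
#include <stdio.h>

/* Stand-in for the status type declared in postmaster/bgworker.h. */
typedef enum BgwHandleStatus
{
    BGWH_STARTED,
    BGWH_NOT_YET_STARTED,
    BGWH_STOPPED,
    BGWH_POSTMASTER_DIED
} BgwHandleStatus;

/*
 * Write a short description of a polled status into buf.  The pid
 * argument is used only for BGWH_STARTED, the one case in which
 * GetBackgroundWorkerPid reports a PID.
 */
static void
describe_worker_status(BgwHandleStatus status, int pid,
                       char *buf, size_t buflen)
{
    switch (status)
    {
        case BGWH_NOT_YET_STARTED:
            snprintf(buf, buflen, "not yet started");
            break;
        case BGWH_STARTED:
            snprintf(buf, buflen, "running with PID %d", pid);
            break;
        case BGWH_STOPPED:
            snprintf(buf, buflen, "started but no longer running");
            break;
        default:
            snprintf(buf, buflen, "status unavailable");
            break;
    }
}
```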
In some cases, a process which registers a background worker may wish to wait for the worker to start up. This can be accomplished by initializing bgw_notify_pid to MyProcPid and then passing the BackgroundWorkerHandle * obtained at registration time to the WaitForBackgroundWorkerStartup(BackgroundWorkerHandle *handle, pid_t *) function. This function will block until the postmaster has attempted to start the background worker, or until the postmaster dies. If the background worker is running, the return value will be BGWH_STARTED, and the PID will be written to the provided address. Otherwise, the return value will be BGWH_STOPPED or BGWH_POSTMASTER_DIED.
A process can also wait for a background worker to shut down, by using the WaitForBackgroundWorkerShutdown(BackgroundWorkerHandle *handle) function and passing the BackgroundWorkerHandle * obtained at registration. This function will block until the background worker exits, or until the postmaster dies. When the background worker exits, the return value is BGWH_STOPPED; if the postmaster dies, it will return BGWH_POSTMASTER_DIED.
If a background worker sends asynchronous notifications with the NOTIFY command via the Server Programming Interface (SPI), it should call ProcessCompletedNotifies explicitly after committing the enclosing transaction so that any notifications can be delivered. If a background worker registers to receive asynchronous notifications with the LISTEN command through SPI, the worker will log those notifications, but there is no programmatic way for the worker to intercept and respond to those notifications.
The src/test/modules/worker_spi module contains a working example, which demonstrates some useful techniques.