docs/RFCS/20220706_user_defined_functions.md
User-defined functions, or UDFs, are named functions defined at the database layer that can be called in queries and other contexts. UDFs allow users to compose complex queries from reusable functions and to bring computation to data to reduce round-trips. Several popular tools, including Hasura and PostgREST, rely on UDFs. Furthermore, UDFs are a stepping stone toward implementing triggers.
The path to feature-parity with Postgres UDFs is arduous because of the flexibility Postgres allows in creating and using UDFs. In Postgres, nearly all types of statements are allowed within a UDF body. A UDF can mutate data with INSERT/UPDATE/DELETE statements, make schema changes, set session settings, call other UDFs, and recursively call itself. Functions can be written in several different languages including SQL, PL/pgSQL, PL/Tcl, PL/Perl, PL/Python, and C. UDFs can be called in a variety of contexts like WHERE clauses, FROM clauses, CHECK constraint expressions, and DEFAULT expressions. To safely give users the same level of freedom, we must prudently consider and test the complex interactions between all these features and UDFs.
This RFC proposes an initial implementation of UDFs. It lays down fundamental machinery and execution support for basic SQL UDFs. It does not cover in detail many features of UDFs provided by Postgres, and many of the freedoms described above will be prohibited initially. The specific features we aim to initially support are detailed in the Features and Syntax section, and some of the remaining features are discussed in Future Work.
All UDFs have several basic components including a name, a list of arguments and their types, a return type, a function body and a language. For example, below is a UDF that returns the sum of two integers:
CREATE FUNCTION add(a INT, b INT) RETURNS INT LANGUAGE SQL AS $$
SELECT a + b;
$$;
Function arguments have a type, mode, and optional default value. The type can
be a built-in type, a composite type, an implicit record type (see
#70100), a reference to
the type of a table's column (with the syntax table_name.column_name%TYPE), or
an arbritrary data type (see Postgres's documentation on polymorphic
functions).
We will initially support only built-in types, user-defined enums, and
implicit record types. Argument modes include the default IN, INOUT,
OUT, and VARIADIC. The "out" variants mark an "argument" as a return
parameter and cannot be used when specifying a RETURNS type. VARIADIC allows
defining a function with variable arity. We will initially support only the
IN argument mode. DEFAULT values for arguments can be specified for IN
and INOUT arguments. We will not initially support DEFAULT values for
arguments.
RETURNS specifies the return type of the function, which can be a built-in
type, a composite type, or a reference to the type of a table's column. The
SETOF modifier allows returning multiple rows rather than a single value. The
TABLE modifier allows returning a table with multiple columns and rows.
RETURNS VOID indicates that there is no return type and NULL will always be
returned. We will initially support returning built-in types, user-defined
enums, implicit record types, SETOF, and VOID. We will be able to support
TABLE once we are able to support user defined record types (tracked by issue
27792).
The function body can reference arguments by name, or by their ordinal in the
function definition with the syntax $1, like in a prepared statement. If the
return type of the function is not VOID, the last statement of a UDF must be a
SELECT, or an INSERT, UPDATE, UPSERT, or DELETE with a RETURNING
clause. Postgres allows most types of statements within a UDF body. Notable
exceptions are transaction control commands like COMMIT and SAVEPOINT, and
utility commands like VACUUM. Our initial implementation will support
SELECT statements, and we hope to support mutations (INSERT, UPDATE,
UPSERT, DELETE). Until we can address a known issue that also affects CTEs
with mutations, we will
not allow the same table to be mutated from multiple statements in a UDF, in
multiple UDFs, or in the calling statement and a UDF. We will not initially
support other statements including schema changes (see DDL (Schema Changes) in
UDFs), session settings, or creating table
statistics.
Postgres allows UDFs to be invoked in many contexts including SELECT clauses,
WHERE clauses, FROM clauses, and table expressions (DEFAULT column
expressions, computed column expressions, partial index predicates, and
expression indexes). We will initially support invoking UDFs only in SELECT,
FROM, and WHERE clauses of DML statements.
Initially we will disallow recursive functions and function cycles (see Recursion and Function Cycles).
Postgres supports writing UDFs in a handful of procedural languages. We will initially support only SQL UDFs.
Other options that Postgres supports are shown below. Only a few will be initially supported.
| Field | Initial Support? | Details |
|---|---|---|
TRANSFORM [ FOR TYPE ] | No | Transforms are used to convert between SQL types and procedural languages types. |
WINDOW | No | Indicates that C function is a window function. |
IMMUTABLE | STABLE | VOLATILE | Yes | Indicates the volatility of a function (see Volatility.V). The default is VOLATILE. Functions with side-effects must be VOLATILE to prevent the optimizer from folding the function. A VOLATILE function will see mutations to data made by the SQL command calling the function, while a STABLE or IMMUTABLE function will not (see the Optimization and Execution section and Appendix A for more details). |
[ NOT ] LEAKPROOF | Yes | Indicates that a function has no side effects and that it communicates nothing that is dependent on its arguments besides its return value (e.g., it cannot throw an error). We currently consider LEAKPROOF as a separate type of volatility, rather than a volatility qualifier. Postgres allows STABLE LEAKPROOF functions, but we do not currently have the ability to represent this volatility. Until we decouple LEAKPROOF from volatility, we will only allow LEAKPROOF with IMMUTABLE which will map to Volatility.Leakproof. NOT LEAKPROOF will be allowed with any volatility and have no effect. |
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT | Yes | CALLED ON NULL INPUT indicates that the function will be executed if any of the arguments is NULL. RETURNS NULL ON NULL INPUT or STRICT indicates that the function always evaluates to NULL when any of its arguments are NULL, and it will not be executed. This maps to NullableArgs. We will likely need to move NullableArgs from FunctionProperties to Overload because it is specific to an overload. |
[ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER | No | Indicates the privileges to run the function with. The default is SECURITY INVOKER which means that the function is executed with the privileges of the user that invokes it. We will not initially support SECURITY DEFINER. |
PARALLEL [ UNSAFE | RESTRICTED | SAFE ] | No | Indicates whether or not the function is safe to execute in parallel. The default is PARALLEL UNSAFE which means that the functions cannot be executed in parallel. This is the only option we will initially support (see the Optimization and Execution section for why distribution of non-inlined UDFs is not possible). |
COST execution_cost | No | The estimated execution cost of a function that aids in query optimization. Functions that are inlined (see the Optimization and Execution section) will not need this estimate. Before we can support this, we'll need to decide on a unit that makes sense to users. |
ROWS result_rows | No | The estimated number of rows returned by the function that aids in query optimization. Only allowed when the function returns a set. The default is 1000 rows. Functions that are inlined (see the Optimization and Execution section) will not need this estimate. |
SUPPORT FUNCTION | No | Allows use of a planner support function, written in C, that supplies additional information to the optimizer to help with query planning. |
SET configuration_parameter { TO value | = value | FROM CURRENT } | No | Allows a session variable to be set for the duration of the function. |
sql_body | No | Allows writing the function body as an expression rather than a string constant. It behaves differently to the string constant form. It only is allowed for SQL functions and dependencies of objects referenced in the function are tracked, which prevents a table, column, function, etc. rename from breaking a function or schema. The Postgres docs note that "this form is more compatible with the SQL standard and other SQL implementations." |
Resolution contains two steps: Function Resolution and Overload Resolution.
The goal is to return the best matched function overload given a function
expression. Simply put, Function Resolution returns a list of candidate
overloads given a function name and current search path, then Overload
Resolution filters the candidate overloads down to one single overload given the
function expression arguments and an optional return type. Function Resolution
can be used to access function properties, and Overload Resolution is performed
to determine the type of a function expression. CRDB currently has an
implementation of function resolution logic and overload filtering heuristics
for builtin functions. However, with user defined functions introduced, things
are getting a little bit more complicated.
Note: It seems weird to distinguish the word function from overload. They
“are” different concepts and it helps to draw the line between the two steps
above. But it also makes it feel like there’s a hierarchy that a function
represents a group of overloads. Just want to clarify that there is just one
level representation: overloads. Moving forward, the two words might be
mix-used. But just keep in mind that the goal is get one single overload.
There will be two paths of resolutions:
Only builtin functions are resolved in this path. This is needed to satisfy
current usage of tree.WrapFunction function which assumes the input string is
a builtin function name. It’s important because there are quite a bunch of SQL
statements which are parsed directly into builtin functions calls. So we need
this as a shortcut for builtin function resolution. Namely, in this path, no
user defined functions are considered. It is also required by CDC filters which
only consider builtins and CDC specific overrides.
The desired behavior is:
This includes both builtin functions and UDFs.
Given a qualified function name, only functions in the specified schema are
considered. This is also true for builtin functions under pg_catalog schema.
If an unqualified function name is provided, all schemas on the current
search_path will be searched for the function and the earliest best matched
qualified function is returned. That is:
If there are multiple functions with an identical signature on the search
path that match the target, the earliest one found is returned. This implies
that builtin functions are shadowed if pg_catalog schema is specified in
the current search path but there’s a function with identical signature. On
the flip side, builtin functions win if pg_catalog is not in the search
path.
Example (1): earlier schema in search path wins
Target Function: f(TypeA) returns TypeB
Search Path: sc1, sc2
Candidates in schema sc1: f(TypeA) returns TypeB
Candidates in schema sc2: f(TypeA) returns TypeB
Candidates in pg_catalog: None
Resolution result: sc1.f
Example (2): pg_catalog builtin wins when not specified in search path
Target Function: f(TypeA) returns TypeB
Search Path: sc1
Candidates in schema sc1: f(TypeA) returns TypeB
Candidates in pg_catalog: f(TypeA) returns TypeB
Resolution result: pg_catalog.f
Example (3): pg_catalog builtin is shadowed when it’s later in search path
Target Function: f(TypeA) returns TypeB
Search Path: sc1, pg_catalog
Candidates in schema sc1: f(TypeA) returns TypeB
Candidates in pg_catalog: f(TypeA) returns TypeB
Resolution result: sc1.f
Similar to (1), but function signatures are not identical, however they equally match the target, the earliest match is returned.
Example (1): earlier schema in search path wins
Target Function: f(TypeA) returns TypeB
Search Path: sc1, sc2
Candidates in schema sc1: f(TypeC) returns TypeB // TypeC constant can become
TypeA
Candidates in schema sc2: f(TypeD) returns TypeB // TypeD constant can become TypeA, and equally important as TypeC based on CRDB’s heuristics.
Resolution result: sc1.f
Always return the best match even if there’s a compatible match earlier in the search path.
Example (1): Better match in later schema wins
Target Function: f(TypeA) returns TypeB
Search Path: sc1, sc2
Candidates in schema sc1: f(TypeC) returns TypeB // TypeC constant can become TypeA
Candidates in schema sc2: f(TypeA) returns TypeB // Note that this is a exact match
Candidates in pg_catalog: None
Resolution result: sc2.f // It’s the better match even though sc1.f is compatible.
Example (2): Better match in same schema wins
Target Function: f(TypeA) returns TypeB
Search Path: sc1,
Candidates in schema sc1: f(TypeC) returns TypeB, // TypeC constant can become TypeA
f(TypeA) returns TypeB,
Candidates in pg_catalog: None
Resolution result: sc1.f(TypeA)
Ambiguity Error is returned if there are multiple equally matched candidates in the earliest schema on the search path. That is, we failed to narrow down to one single overload.
Note that all functions in crdb_internal are exceptions here. They need to
be called with a qualified function name. Namely, function name should be
always prefixed with the crdb_internal schema name.
CRDB’s current implementation of function overload resolution considers all overloads equally beginning with stricter heuristics (e.g., input argument types exactly match overload argument types ) followed by relaxed heuristics (e.g., input arguments can be casted to match overload argument types). It tries to find the best match. However, the position of the schema in the search path is not considered. We may reuse this logic, but extend it to also check schema position.
One thing to note is that, currently there are multiple places in CRDB trying to
access function properties to make decisions about next steps. And in some cases
function expressions are not type checked yet. That is, only the Function Resolution step is performed and we only assume that there is a group of
overloads which share function properties, so that it’s safe to use it even if
these overloads are not resolved and narrowed down to a single choice. This is
fine when we only have builtin functions, but this is not true anymore with
UDFs. We want to throw an ambiguity error here as well if there’s any function
properties conflicts among the overload candidates.
Based on the fact that UDFs can have the same name as tables/schema/databases,
we choose not to give UDFs namespace entries. Namely, we won’t write any record
into the system.namespace table. Instead, we will store UDF signatures in the
schema descriptor, so that name resolution can be done efficiently without
reading the system.namespace table. To achieve this, a mapping from function
name to a list of overload signatures, which includes argument type OIDs, return
type OIDs, and UDF descriptor IDs, will be stored in the schema descriptor (see
the Descriptors section below). Schema descriptor versions need
to be bumped when adding/removing/renaming UDFs to make sure future function
resolution works properly with the new UDF set. With that, UDF resolution is as:
This also implies a plan that UDF descriptors will skip the by name cache in
all layers of the descriptor collection which act as a descriptor cache for each
session. The by name cache is used to fetch a descriptor by name quickly in a
same session. It is indexed by namespace entry (parent_db_id, parent_schema_id, name).
Since there won't be any namespace entry for UDF, we shouldn't write to this
cache. Writing to this cache would also cause conflicts with tables within a
same schema if they share a name.
These should be good enough to retrieve UDF descriptors. However a UDF
descriptor is not enough for function expression evaluation based on Cockroach’s
current implementation. We need to translate the UDF descriptor into a
tree.FunctionDefinition and hydrate any user defined types. This means that a
cache of translated and hydrated UDF definitions is needed for fast name
resolutions. We’ll need to generalize the existing hydratedtables.Cache or add
something similar to it.
Reference tracking is important for correctness. Postgres 14 added a new
sql_body feature which requires a UDF function body to be stored as an AST
with referenced objects annotated with OIDs. This allows objects referenced in
the UDF body to be renamed without breaking the UDF. CRDB currently doesn’t have
the same infrastructure or similar to guarantee such level of correctness. In
the short term, we propose to disallow renaming objects referenced in the UDF
body. Such that, in the first version, reference tracking includes:
Examples:
Outbound reference
With UDF definition f:
CREATE FUNCTION f() RETURNS u_enum LANGUAGE SQL AS $$
SELECT a FROM t@idx
$$
We’ll record in table t’s descriptor that column a and index idx are
referenced by UDF f. In f’s descriptor, we’ll record that f depends
on table t and type u_enum.
Inbound reference
With table definition t:
CREATE TABLE t (
id INT PRIMARY KEY,
a INT NOT NULL CHECK (f(a) > 100), -- this creates a check constraint “check_a”
b INT NOT NULL DEFAULT f(a),
INDEX idx (a) WHERE f(a) > 100
)
We’ll record in table t’s descriptor that it depends on UDF f. In
f’s descriptor, we’ll record that f is referenced by column b,
constraint check_a and index idx of table t.
Reference tracking requires objects to be resolved and type checked at function
definition time which happens after SQL statements are fully understood by CRDB.
Currently these are done by optBuilder. One existing example is the CREATE VIEW statement which also requires the view query to be understood and
referenced objects in it to be resolved, it goes through optBuilder which
builds all view dependencies. Similarly the CREATE FUNCTION statement will
flow through the same path as CREATE VIEW instead of the vanilla planOpaque
path.
Reference tracking will also be used to block DROP STRICT operations on
objects referenced by UDF or UDF referenced by other objects. It's also useful
as a dependency graph when executing DROP CASCADE on a UDF or other objects,
so that a column/index can be dropped properly when a referenced UDF is dropped
and a UDF is dropped properly when a table referenced in the body is dropped.
Binding references to tables, view, other functions, types, etc. in the function body at definition-time will require rewriting function bodies to use fully qualified names. This will prevent referenced objects from being renamed, but we plan to lift this limitation in the future by rewriting references as OIDs instead of fully qualified names (see the Function Body Rewrite section in Future Work).
The early-binding of references in function bodies is a deviation from
Postgres's behavior in which references in function bodies are resolved at
execution-time (except when a function body is defined with the non-string
sql_body syntax). We differ from Postgres here because late-binding makes
users vulnerable to index corruption (see Appendix D). However,
we realize that some users transitioning from Postgres will rely on late-binding
semantics, so we will add a udf_early_binding session setting, defaulted to
"on", that can be set to "off" to disable reference tracking and rewriting
references in function bodies as fully-qualified names.
Function privilege type will be introduced and EXECUTE privilege will be
added. And UDF will follow the following rules:
CREATE privilege on a schema to create a function under
it.USAGE privilege on a schema to resolve functions
under it.EXECUTE privilege on a function to call the function.USAGE privilege on a user defined type to define a
function with the type.EXECUTE privilege can be granted at database level as default privileges
and new function created inherit privileges from the database level.SHOW GRANTS ON FUNCTION support will be added for UDFs to list out current
UDF privileges. And information_schema.role_routine_grants table needs to be
populated properly to be compatible with postgres.
message SchemaDescriptor {
// ... other fantastic fields of schema descriptor
message FunctionOverload {
option (gogoproto.equal) = true;
optional uint32 id = 1 [(gogoproto.nullable) = false,
(gogoproto.customname) = "ID", (gogoproto.casttype) = "ID"];
repeated uint32 arg_types = 2 [(gogoproto.nullable) = false, (gogoproto.casttype) = "github.com/lib/pq/oid.Oid"];
optional uint32 return_type = 3 [(gogoproto.nullable) = false, (gogoproto.casttype) = "github.com/lib/pq/oid.Oid"];
optional bool return_set = 4 [(gogoproto.nullable) = false];
}
message Function {
option (gogoproto.equal) = true;
optional string name = 1 [(gogoproto.nullable) = false];
repeated FunctionOverload overloads = 2 [(gogoproto.nullable) = false];
}
repeated Function functions = 13 [(gogoproto.nullable) = false];
// Next field is 14.
}
message FunctionDescriptor {
option (gogoproto.equal) = true;
// Needed for the descriptorProto interface.
option (gogoproto.goproto_getters) = true;
enum Volatility {
Immutable = 0;
Stable = 1;
Volatile = 2;
}
enum NullInputBehavior {
CalledOnNullInput = 0;
ReturnsNullOnNullInput = 1;
Strict = 2;
}
enum Language {
Sql = 0;
}
message Argument {
option (gogoproto.equal) = true;
enum Class {
In = 0;
Out = 1;
InOut = 2;
Variadic = 3;
}
optional Class class = 1 [(gogoproto.nullable) = false];
optional string name = 2 [(gogoproto.nullable) = false];
optional sql.sem.types.T type = 3;
optional string default_expr = 4;
}
optional string name = 1 [(gogoproto.nullable) = false];
optional uint32 id = 2 [(gogoproto.nullable) = false,
(gogoproto.customname) = "ID", (gogoproto.casttype) = "ID"];
optional uint32 parent_id = 3 [(gogoproto.nullable) = false,
(gogoproto.customname) = "ParentID", (gogoproto.casttype) = "ID"];
optional uint32 parent_schema_id = 4 [(gogoproto.nullable) = false,
(gogoproto.customname) = "ParentSchemaID", (gogoproto.casttype) = "ID"];
repeated Argument args = 5 [(gogoproto.nullable) = false];
message ReturnType {
option (gogoproto.equal) = true;
optional sql.sem.types.T type = 1;
// Whether it is returning a set of values
optional bool is_set = 2 [(gogoproto.nullable) = false];
}
optional ReturnType return_type = 6 [(gogoproto.nullable) = false];;
optional Language lang = 7 [(gogoproto.nullable) = false];
// Queries in this string are rewritten with fully qualified table names.
optional string function_body = 8 [(gogoproto.nullable) = false];
optional Volatility volatility = 9 [(gogoproto.nullable) = false];
optional bool leak_proof = 10 [(gogoproto.nullable) = false];
optional NullInputBehavior null_input_behavior = 11 [(gogoproto.nullable) = false];
optional PrivilegeDescriptor privileges = 12;
repeated uint32 depends_on = 13 [(gogoproto.casttype) = "ID"];
repeated uint32 depends_on_types = 14 [(gogoproto.casttype) = "ID"];
message Reference {
option (gogoproto.equal) = true;
// The ID of the relation that depends on this function.
optional uint32 id = 1 [(gogoproto.nullable) = false,
(gogoproto.customname) = "ID", (gogoproto.casttype) = "ID"];
// If applicable, IDs of the inbound reference table's index.
repeated uint32 index_ids = 2 [(gogoproto.nullable) = false,
(gogoproto.customname) = "IndexIDs", (gogoproto.casttype) = "IndexID"];
// If applicable, IDs of the inbound reference table's column.
repeated uint32 column_ids = 3 [(gogoproto.customname) = "ColumnIDs",
(gogoproto.casttype) = "ColumnID"];
// If applicable, IDs of the inbound reference table's constraint.
repeated uint32 constraint_ids = 4 [(gogoproto.customname) = "ConstraintIDs",
(gogoproto.casttype) = "ConstraintID"];
}
repeated Reference depended_on_by = 15 [(gogoproto.nullable) = false];
optional DescriptorState state = 16 [(gogoproto.nullable) = false];
optional string offline_reason = 17 [(gogoproto.nullable) = false];
optional uint32 version = 18 [(gogoproto.nullable) = false, (gogoproto.casttype) = "DescriptorVersion"];
optional util.hlc.Timestamp modification_time = 19 [(gogoproto.nullable) = false];
// DeclarativeSchemaChangerState contains the state corresponding to the
// descriptor being changed as part of a declarative schema change.
optional cockroach.sql.schemachanger.scpb.DescriptorState declarative_schema_changer_state = 20;
// Next field id is 21
}
Currently we reference functions by names in all column/index/constraint
expressions used within a table. Namely, we store expressions as a string in
which function calls are represented in the format of func_name(params..).
This is good enough for builtin functions which are the only functions we
currently have, but it would be problematic for UDFs since they can be renamed.
It’s necessary that function references are tracked by non-name identifiers in
expressions.
To address that, OIDs will be assigned to UDF as Descriptor ID + 100k, this is
the same way as we assign OIDs to user defined enum types nowadays. For
builtins, currently OID (used for ::REGPROC casting and pg_proc table) is
assigned at init time by hashing on builtin function signatures, this is
unfortunately a obstacle in our way because the hash OID might conflict with any
UDF descriptor OIDs. To clean the way, fixture OIDs will be assigned to builtin
functions. Similar to OIDs of builtin data types, builtin function OIDs will
start from 1 and less than 100k (which should be enough). This won’t break
anything of CRDB internally, but hopefully no customer depends on the current
OIDs generated from hashing.
However, one downside of this approach is that we need to manually give each builtin an OID which is quite tedious. Instead we could utilize the builtin registration at init time to auto assign OIDs with a counter. One problem came up is that we might move the builtin definitions around between files and OIDs could change because it's hard to tell which definition goes first at init time. So referencing builtin functions by OIDs in expressions would fail. One compromise we could make is that we reference builtin functions by name in expression, but reference UDF functions by OIDs. This seems inconsistent, but would save a lot of pain.
To make OID function reference parsing easier in a SQL expression, we propose
a new SQL token OID_FUNC_REF in the SQL parser. Which essentially represents
a string in the format of @Fxxx where xxx is an OID integer. Choosing this
syntax instead of using existing syntax for Table (e.g. oid(1, 2)) and type
(e.g. @oid) mainly because we want to avoid any possible ambiguities for when
parsing statement or expressions. It makes it easy to know what it represents
by simply looking at the syntax. It also allows opportunities for the existing
syntax to expand for new use cases as well. Choosing @ because it's an obvious
symbol character which is not an operator used by any expression. Following by
an F make it easy to be interpreted as a Function. This also analogous to
type's syntax. With that, the current SQL token scanner needs to be extended
to recognize the two character string @F and scan an oid number after, then
set token id to OID_FUNC_REF. We also need a new implementation of the
tree.FunctionReference interface, which essentially is just an OID, to
represent resolvable function OIDs. This would be the value type of the new
OID_FUNC_REF token.
UDF bodies should be validated at definition time. All SQL statements in the
function body should be valid SQL strings that can be parsed and executed sanely
under the current database and schema search path. This lays out the requirement
that we need to at least be able to build a memo (not necessarily optimized)
for each of the statements, so that we’re sure that all referenced objects
(table, function and etc.) and subobjects (column, index and etc.) are valid.
The optimizer should also be educated about the argument names so that it can
distinguish arguments from table columns in different scope of SQL statements
(see the Optimization and Execution section for
details). Lastly, the result of the last statement must match the return type of
the function.
We will use the optimizer and the resulting memo to perform this validation.
optbuilder already performs many validations for SQL statements, including
object resolution, and we can extract the result type of the last statement from
the memo.
We aim to validate that the user-given volatility matches the actual volatility of the statements of the function body. Postgres does not perform this validation (see Appendix B), allowing users to do things like create a computed column with a UDF that is marked as immutable but is actually volatile. This is dangerous because it can corrupt indexes and cause incorrect query results. We propose differing from Postgres's behavior to provide additional safety for our users.
CRDB currently allows users to back up and restore a full cluster, an individual database, or an individual table. We need to make sure UDF descriptors are properly handled during backup and restore similar to current handling of views and sequences. A UDF has similarities to both views and sequences:
Backup of a full cluster and a full database are quite straightforward for UDFs. Just back up everything. However, when doing backup of an individual table, similarly to user defined enum types, we need to also back up UDFs referenced by the table. We won’t go into the rabbit hole of backing up all referenced tables in function bodies and all referenced objects by those tables. This would be at users’ risk to back up individual tables. This is true nowadays, for example, for foreign key referenced tables as well. Restoring from such backup fails because the foreign key table cannot be found in the backup manifest.
Restoring is more interesting because we need to carefully validate that:
Several components within the optimizer will require changes in order to plan
queries with UDFs. optbuilder must convert a UDF reference into an optimizer
expression, execbuilder must convert the optimizer expression into an
evaluate-able tree.TypedExpr or exec.Plan, and normalization rules will
inline UDF function bodies directly into optimizer expressions when possible.
When optbuilder encounters a function reference that resolves to a UDF, it
will construct a new optimizer expression called a "routine" (this name is used
to differentiate a UDF from the existing "function" expression and matches the
nomenclature used in PG's DROP ROUTINE which drops a
function or a stored procedure). The routine expression will contain a list of
relational expressions representing each statement in the UDF body. It will also
contain a list of scalar expressions representing the arguments to the UDF, and
metadata like the UDF's name, argument names, volatility, and return type.
Argument references within a UDF body will be built into the memo as "routine
variables" which track the ordinal of the argument they reference. We
deliberately do not use existing placeholder expressions used for prepared
statements to avoid incorrectly replacing a routine variable with a prepared
statement value. For example, in a prepared statement that calls a UDF, there
may be two separate $1 variables that need to be replaced with different
values; one for the prepared statement argument and one for the UDF argument.
Names of columns in the given SQL statement take precedence over UDF arguments
of the same name. Therefore, when resolving a variable reference, the current
scope is searched before searching the names of arguments in a UDF. A UDF
argument can be qualified with the function name to avoid this:
function_name.argument_name. If this conflicts with a qualified column name,
the column name takes precedence.
It may be required to create two types of routine expressions: scalar routines which return a single value, and relational routines which return multiple rows. This detail will be ironed out during implementation.
After optimization is complete, execbuilder will transform a scalar routine
into a tree.TypedExpr that can be evaluated during execution. This typed
expression will contain a planning closure that returns an optimized exec.Plan
for each statement in the UDF when given the argument values. Similarly, a
relational routine will be converted into a execPlan that contains a planning
closure for planning each statement in the UDF based on argument values.
When a UDF is evaluated, the planning closure will be invoked for each statement
in the UDF. The closure will replace argument references with constant values
and then optimize the relational expression before constructing an exec.Plan.
The exec.Plan for each statement will be executed sequentially, and the
results of the last statement will be returned.
This approach, which is similar to the pattern used for the execution of apply-joins, will preclude the distribution of queries with UDFs. Any query plan with a UDF cannot be distributed because the planning closure exists in memory on the gateway node. It may be possible in the future to partially distribute query plans with UDFs (see Distributing Plans with UDFs). Note that inlined UDFs (see Inlining) can be distributed as long as the operators they are transformed into are distributable.
The volatility of a function has an important effect on the visibility of
mutations made by the statement calling the UDF. These mutations will be visible
to VOLATILE functions because it effectively starts a new snapshot for every
statement executed within it. STABLE and IMMUTABLE functions do not see
mutations made by their calling statement. They see a snapshot of the data as of
the start of the query. See Appendix A for examples.
The sql.planner struct will need to configure transaction stepping for
VOLATILE functions, and reset the read sequence number to the original
transaction sequence number once evaluation of the UDF is complete.
Some UDF bodies can be inlined as relational expressions directly into the query plan. This will improve the performance of executing a UDF because it will avoid the overhead of the default method of execution mentioned above. Additionally, it will allow for better row count and cost estimates, which will lead to more efficient query plans in some cases. Because inlining should always yield a more performant query, it can be implemented as a normalization rule.
The initial requirements for inlining a UDF are:
Postgres has additional requirements for inlining which we may need to heed.
A scalar UDF returning a single value can be inlined directly into a projection
or filter condition. For example, in the query below, add(a, b) can be inlined
directly into the filter:
CREATE TABLE t (
k INT PRIMARY KEY,
a INT,
b INT
);
CREATE FUNCTION add(x INT, y INT) RETURNS INT LANGUAGE SQL AS $$
SELECT x + y;
$$;
SELECT k FROM t WHERE add(a, b) = 100;
The query plan would be simplified to:
project
├── columns: k:1
└── select
├── columns: k:1 a:2 b:3
├── scan t
│ └── columns: k:1 a:2 b:3
└── filters
└── (a:2 + b:3) = 100
More complex scalar UDFs that read from tables can also be inlined. For example,
in the query below, max_s_k(a) can be inlined as a correlated subquery:
CREATE TABLE t (k INT PRIMARY KEY, a INT);
CREATE TABLE s (k INT PRIMARY KEY, a INT);
CREATE FUNCTION max_s_k(t_a INT) RETURNS INT LANGUAGE SQL AS $$
SELECT max(k) FROM s WHERE s.a = t_a;
$$;
SELECT * FROM t WHERE k = max_s_k(a);
After inlining, the query plan would look similar to the plan below. Existing optimization rules would further transform the subquery into a join.
project
├── columns: k:1!null a:2
└── select
├── columns: t.k:1!null t.a:2
├── scan t
│ └── columns: t.k:1!null t.a:2
└── filters
└── eq [outer=(1,2), correlated-subquery]
├── t.k:1
└── subquery
└── limit
├── columns: max:9
├── scalar-group-by
│ ├── columns: max:9
│ ├── project
│ │ ├── columns: s.k:5!null
│ │ └── select
│ │ ├── columns: s.k:5!null s.a:6!null
│ │ ├── outer: (2)
│ │ ├── scan s
│ │ │ └─── columns: s.k:5!null s.a:6
│ │ └── filters
│ │ └── s.a:6 = t.a:2
│ └── aggregations
│ └── max [as=max:9, outer=(5)]
│ └── s.k:5
└── 1
Single-statement UDFs that return multiple rows can also be inlined. Consider the example:
CREATE TABLE scores (
id INT PRIMARY KEY,
user_id INT,
score INT,
created_at TIMESTAMPTZ
);
CREATE FUNCTION recent_scores() RETURNS SETOF INT LANGUAGE SQL AS $$
SELECT score FROM scores WHERE created_at >= now() - '1 hour'::INTERVAL;
$$;
SELECT s AS top_recent_score FROM recent_scores() AS r(s) ORDER BY s LIMIT 10;
recent_scores() can be inlined directly into the query plan:
limit
├── columns: top_recent_score:3
├── ordering: +3
├── sort
│ ├── columns: score:3
│ ├── limit hint: 10.00
│ └── project
│ ├── columns: score:3
│ └── select
│ ├── columns: id:1!null user_id:2 score:3 created_at:4!null
│ ├── scan scores
│ │ └── columns: id:1!null user_id:2 score:3 created_at:4
│ └── filters
│ └── created_at:4 >= (now() - '01:00:00')
└── 10
If a routine is inlined, no changes to the optimizer will be required to accurately estimate their row counts and costs. They will be transformed into existing relational and scalar expressions, and the statistic builder and coster won't even be aware that they originated from a UDF.
When a routine is not inlined, it will be difficult to accurately estimate their
cost. With non-constant arguments, the routine's list of relational expressions
may not be fully optimized and the number of rows returned by the routine may
not be known. Therefore, we will assume that a set returning routine returns
1000 rows and the cost of a routine is 1000 times the sequential IO cost
factor.
In the future, the COST and ROWS options will allow users to hint the row
count and cost to the optimizer so that it can better optimize queries that call
UDFs.
Postgres's stored
procedures are similar to
user-defined functions, but can only be invoked in a CALL <procedure>
statement. This more restrictive nature of store procedures should allow them to
be implemented fairly easily once UDFs are supported.
With the current design, any objects referenced in the function body cannot be renamed. UDF would be broken because the renamed object cannot be resolved with the old name and we simply disallow name changes to referenced objects. This can be fixed if we can rewrite the function body so that all references are represented by OIDs in queries. Or we can store the function body in the format of an AST in which all table/view/function and even column/index are all referenced by OID, so that function body queries can be regenerated dynamically.
Note that the same problem exists with views, the work to fix it would overlap heavily to support renaming objects referenced in UDFs. This will be one of the first few things we need to work on after the initial phase is complete.
Cross-referenced objects changed within one transaction should be seen atomically. For example, say, we have existing UDF and table:
CREATE FUNCTION toy_func() RETURNS INT LANGUAGE SQL $$ SELECT 1 $$;
CREATE TABLE t (a INT PRIMARY KEY, b INT DEFAULT (toy_func(b) + 1));
We modify them in a transaction:
BEGIN;
CREATE OR REPLACE FUNCTION toy_func() RETURNS INT LANGUAGE SQL $$ SELECT -1 $$;
ALTER TABLE t ALTER COLUMN b SET DEFAULT (toy_func(b) - 1);
COMMIT;
Any inserts into table t should lead to default values of either 2 or -2 for
column b. Any other results caused by interleaved versions of the two objects
should be considered invalid. This is a quite minor problem we currently have in
our leasing system that needs to be fixed for correctness.
Supporting recursive functions and cycles between functions (e.g., fn1 calls
fn2 and fn2 calls fn1) will require changes to query planning and
execution. In particular, we'll need a way to reference a routine expression in
the query plan to avoid building an expression tree with infinite depth.
We'll likely want to add support for
PL/pgSQL UDFs in the
near future because they are commonly used. We'd need to add execution support
for VARIABLEs, IFs, LOOPs, and RETURNs. The Froid
paper is a good starting
point for optimizing these constructs into relational expressions, though these
optimizations aren't required to support PL/pgSQL (Postgres does not perform
these optimizations).
Supporting other procedural languages, like C, Python, or Perl, will be more difficult. Early experiments of supporting procedural languages with a WebAssembly runtime have shown promise.
As far as we know, nothing in this proposal prevents us from supporting schema
changes in UDFs. Performing query planning of the statements within a UDF during
the planning stage of the calling statement is valid because statements within a
UDF see a snapshot of the catalog from the start of the transaction, even if
they are VOLATILE (see Appendix C). That being said, the
usefulness of UDFs with schema changes in a production application is unclear,
and supporting them would require significant testing to ensure correct behavior
with all types of schema changes.
This won't be supported at least in the initial version but can be explored to support more complex filtering.
In theory it is possible to distribute parts of a query plan even if a UDF
planNode must be executed on the gateway node. DistSQL does not currently support
this type of partial distribution. Lifting this restriction would benefit the
execution of apply-joins as well by allowing parts of a query plan with an
apply-join to be distributed.
This example shows how mutations to data made by a statement calling a UDF are
seen by a VOLATILE UDF, but not by a STABLE UDF. Notice how incr_t_a() and
get_t_a_volatile() return the incrementing value while get_t_a_stable() sees
the value as of the start of the query.
postgres=# CREATE TABLE t (k INT PRIMARY KEY, a INT);
CREATE TABLE
postgres=# INSERT INTO t VALUES (1, 0);
INSERT 0 1
postgres=# CREATE FUNCTION incr_t_a() RETURNS INT LANGUAGE SQL AS $$
postgres$# UPDATE t SET a = a + 1 WHERE k = 1 RETURNING a;
postgres$# $$;
CREATE FUNCTION
postgres=# CREATE FUNCTION get_t_a_stable() RETURNS INT STABLE LANGUAGE SQL AS $$
postgres$# SELECT a FROM t WHERE k = 1;
postgres$# $$;
CREATE FUNCTION
postgres=# CREATE FUNCTION get_t_a_volatile() RETURNS INT VOLATILE LANGUAGE SQL AS $$
postgres$# SELECT a FROM t WHERE k = 1;
$$;
postgres$# CREATE FUNCTION
postgres=# SELECT i, incr_t_a(), get_t_a_stable(), get_t_a_volatile() FROM generate_series(1, 10) g(i);
i | incr_t_a | get_t_a_stable | get_t_a_volatile
----+----------+----------------+------------------
1 | 1 | 0 | 1
2 | 2 | 0 | 2
3 | 3 | 0 | 3
4 | 4 | 0 | 4
5 | 5 | 0 | 5
6 | 6 | 0 | 6
7 | 7 | 0 | 7
8 | 8 | 0 | 8
9 | 9 | 0 | 9
10 | 10 | 0 | 10
(10 rows)
This example shows how Postgres does not verify that the given volatility of a UDF matches the volatility of the statements in the UDF body. This allows users to do things like create a computed column with a UDF that is marked as immutable but is actually volatile, which is dangerous because it can corrupt indexes and cause incorrect query results.
postgres=# CREATE TABLE t (a INT GENERATED ALWAYS AS (rand()) STORED);
ERROR: 42P17: generation expression is not immutable
LOCATION: cookDefault, heap.c:3103
postgres=# CREATE FUNCTION immutable_rand() RETURNS IMMUTABLE INT LANGUAGE SQL AS $$
postgres$# SELECT rand();
postgres$# $$;
CREATE FUNCTION
postgres=# CREATE TABLE t (a INT GENERATED ALWAYS AS (immutable_rand()) STORED);
CREATE TABLE
These examples show how UDFs in Postgres see a snapshot of the catalog from the start of the transaction, even if they are volatile. This first example shows how Postgres doesn't allow creating a UDF that creates a table and inserts into that table.
postgres=# CREATE FUNCTION create_and_insert() RETURNS BOOL VOLATILE LANGUAGE SQL AS $$
postgres$# CREATE TABLE t (k INT PRIMARY KEY, a INT);
postgres$# INSERT INTO t VALUES (1, 10);
postgres$# $$;
ERROR: 42P01: relation "t" does not exist
LINE 3: INSERT INTO t VALUES (1, 10);
^
LOCATION: parserOpenTable, parse_relation.c:1384
We can get around this by creating a table, then creating the function, and finally dropping the table. When the function is called, we see that the insert fails because the newly created table is not visible.
postgres=# CREATE TABLE t (k INT PRIMARY KEY, a INT);
CREATE TABLE
postgres=# CREATE FUNCTION create_and_insert() RETURNS BOOL VOLATILE LANGUAGE SQL AS $$
postgres$# CREATE TABLE t (k INT PRIMARY KEY, a INT);
postgres$# INSERT INTO t VALUES (1, 10);
postgres$# SELECT true;
postgres$# $$;
CREATE FUNCTION
postgres=# DROP TABLE t;
DROP TABLE
postgres=# SELECT create_and_insert();
ERROR: 42P01: relation "t" does not exist
LINE 3: INSERT INTO t VALUES (1, 10);
^
QUERY:
CREATE TABLE t (k INT PRIMARY KEY, a INT);
INSERT INTO t VALUES (1, 10);
SELECT true;
CONTEXT: SQL function "create_and_insert" during startup
LOCATION: parserOpenTable, parse_relation.c:1384
This example shows how UDFs in Postgres, when defined with string function
bodies where references are resolved at execution time, can cause index
corruption. Notice how the last two SELECT statements return different results
because one performs an index-only scan of the corrupt secondary index, while
the other performs a full table scan.
postgres=# CREATE FUNCTION f1() RETURNS INT IMMUTABLE LANGUAGE SQL AS 'SELECT 1';
CREATE FUNCTION
postgres=# CREATE FUNCTION f2() RETURNS INT IMMUTABLE LANGUAGE SQL AS 'SELECT f1()';
CREATE FUNCTION
postgres=# CREATE TABLE t (k INT PRIMARY KEY, a INT);
CREATE TABLE
postgres=# CREATE INDEX idx ON t ((a + f2())) INCLUDE (k, a);
CREATE INDEX
postgres=# INSERT INTO t(k, a) SELECT i, i FROM generate_series(2, 100000) g(i);
INSERT 0 99999
postgres=# ANALYZE t;
ANALYZE
postgres=# CREATE OR REPLACE FUNCTION f1() RETURNS INT IMMUTABLE LANGUAGE SQL AS 'SELECT 2';
CREATE FUNCTION
postgres=# INSERT INTO t(k, a) VALUES (1, 1);
INSERT 0 1
postgres=# CREATE OR REPLACE FUNCTION f1() RETURNS INT IMMUTABLE LANGUAGE SQL AS 'SELECT 1';
CREATE FUNCTION
postgres=# ANALYZE t;
ANALYZE
postgres=# EXPLAIN SELECT * FROM t WHERE a + f2() = 2;
QUERY PLAN
------------------------------------------------------------------
Index Only Scan using idx on t (cost=0.29..8.31 rows=1 width=8)
Index Cond: (((a + 1)) = 2)
(2 rows)
postgres=# SELECT * FROM t WHERE a + f2() = 2;
k | a
---+---
(0 rows)
postgres=# DROP INDEX idx;
DROP INDEX
postgres=# SELECT * FROM t WHERE a + f2() = 2;
k | a
---+---
1 | 1
(1 row)