docs/RFCS/20220307_user_ids.md
A user and all of their privileges in CockroachDB are identified by their username. To implement ALTER ROLE ... RENAME TO ... statement is difficult as you need to create a new role with the same privileges and then drop the old one
(mentioned in #50821).
We will support stable IDs to key internal user information instead of usernames. It requires a long-running migration to give existing users IDs. The type of the user IDs is yet to be decided.
Currently, all internal per-user information is keyed by the username. It's not ideal because of the complications to
implement ALTER ROLE ... RENAME TO ... statement and usernames can be considered personal data, but we need to use
them in many places in logging. Implementing this in CockroachDB will enhance compatibility with the privilege system of
Postgres, where pg_user has users have IDs in usesysid column.
The idea is to convert code dependencies of username and lookups to use the IDs instead in 3 steps. First adding the ID columns to system tables and update related syntax statements. Second, updating the user privileges objects and lookup based on usernames. Finally, migrating the existing instances of Cockroach.
The ids will be randomly generated OID when creating a new user during insert into system.users. SQL statements like insert, lookups and delete for all other tables will be updated such that the IDs from the users table are used.
[email protected]:26257/defaultdb> \d system.users
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
-----------------+-----------+-------------+----------------+-----------------------+-----------+------------
username | STRING | false | NULL | | {primary} | false
hashedPassword | BYTES | true | NULL | | {primary} | false
isRole | BOOL | false | false | | {primary} | false
(3 rows)
[email protected]:26257/defaultdb> \d system.role_options
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
username | STRING | false | NULL | | {primary} | false
option | STRING | false | NULL | | {primary} | false
value | STRING | true | NULL | | {primary} | false
(3 rows)
[email protected]:26257/defaultdb> \d system.role_members
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+---------------------------------------------------------+------------
role | STRING | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
member | STRING | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
isAdmin | BOOL | false | NULL | | {primary} | false
(3 rows)
[email protected]:26257/defaultdb> \d system.database_role_settings
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
database_id | OID | false | NULL | | {primary} | false
role_name | STRING | false | NULL | | {primary} | false
settings | STRING[] | false | NULL | | {primary} | false
(3 rows)
[email protected]:26257/movr> \d system.users
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
-----------------+-----------+-------------+-------------------+-----------------------+---------------------------------+------------
username | STRING | false | NULL | | {primary,users_username_id_idx} | false
hashedPassword | BYTES | true | NULL | | {primary} | false
isRole | BOOL | false | false | | {primary} | false
user_id | OID | false | nextval('id_seq') | | {primary,users_username_id_idx} | false
(4 rows)
[email protected]:26257/movr> \d system.role_options
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
username | STRING | false | NULL | | {primary} | false
option | STRING | false | NULL | | {primary} | false
value | STRING | true | NULL | | {primary} | false
user_id | OID | false | NULL | | {primary} | false
(4 rows)
[email protected]:26257/movr> \d system.role_members
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+---------------------------------------------------------+------------
role | STRING | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
member | STRING | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
isAdmin | BOOL | false | NULL | | {primary} | false
role_id | OID | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
member_id | OID | false | NULL | | {primary,role_members_member_idx,role_members_role_idx} | false
(5 rows)
[email protected]:26257/movr> \d system.database_role_settings
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
database_id | OID | false | NULL | | {primary} | false
role_name | STRING | false | NULL | | {primary} | false
settings | STRING[] | false | NULL | | {primary} | false
role_id | OID | false | NULL | | {primary} | false
(4 rows)
UUID allows for randomly generated IDs during insert using gen_random_uuid() as default value for the column.
gen_random_uuid().OID type that CRDB already has and mimics what postgres does with uid when creating a user (the type of usesysid
is OID). This is useful when allowing a user to specify the uid in statements for future updates. For existing users,
a hash function could be used to generate their ID. We can also take an approach similar to postgres and have the new
ID be the highest ID + 1.
pg_user virtual table, with a useful usesysid column.alter table t1 add column y oid default oid(unique_rowid());unique_rowid() can no longer be used as it generates uint64 values. So without resolving
this issue, we can migrate from postgres to CRDB, but it may not be possible vice versa.unique_rowid. This will most likely be less efficient.INT - it would be similar to OID type, and we can migrate from postgres to CRDB but not vice versa. Adding columns
with a default value of unique_rowid() also simplifies long running migration between node versions. The only
difference is that there will be no changes to INT in the near future.
postgres=# create user a;
CREATE ROLE
postgres=# SELECT * FROM pg_user;
usename | usesysid | usecreatedb | usesuper | userepl | usebypassrls | passwd | valuntil | useconfig
----------+----------+-------------+----------+---------+--------------+----------+----------+-----------
postgres | 10 | t | t | t | t | ******** | |
a | 16385 | f | f | f | f | ******** | |
(2 rows)
postgres=# create user b;
CREATE ROLE
postgres=# create user aa;
CREATE ROLE
postgres=# create user ba;
CREATE ROLE
postgres=# create user ab;
CREATE ROLE
postgres=# SELECT * FROM pg_user;
usename | usesysid | usecreatedb | usesuper | userepl | usebypassrls | passwd | valuntil | useconfig
----------+----------+-------------+----------+---------+--------------+----------+----------+-----------
postgres | 10 | t | t | t | t | ******** | |
a | 16385 | f | f | f | f | ******** | |
b | 16386 | f | f | f | f | ******** | |
aa | 16387 | f | f | f | f | ******** | |
ba | 16388 | f | f | f | f | ******** | |
ab | 16389 | f | f | f | f | ******** | |
(6 rows)
*Require more discussion on this
An ID field will be added to UserPrivileges and any objects that contain a username. The Privilege Descriptors store user privilege objects in an array in sorted order based on usernames currently. In 23.1, we can remove the username field from privileges. It will allow us to alter usernames without having to update privilege objects.
Code in privilege.pb.go:
// PrivilegeDescriptor describes a list of users and attached
// privileges. The list should be sorted by user for fast access.
type PrivilegeDescriptor struct {
Users []UserPrivileges `protobuf:"bytes,1,rep,name=users" json:"users"`
OwnerProto github_com_cockroachdb_cockroach_pkg_security.SQLUsernameProto `protobuf:"bytes,2,opt,name=owner_proto,json=ownerProto,casttype=github.com/cockroachdb/cockroach/pkg/security.SQLUsernameProto" json:"owner_proto"`
Version PrivilegeDescVersion `protobuf:"varint,3,opt,name=version,casttype=PrivilegeDescVersion" json:"version"`
}
Since most SQL syntax takes in username/role, we also require a lookup function to get the user ID from system.users by a username. Additionally, add caching the lookups to reduce the overall runtime of statements and tests.
We need to add the new system table columns to existing instances and migrate all the existing users and generate their corresponding IDs. The migration will need to update the UserPriviliges to contain the new user IDs.
During the migration, we will need to update every field where a username is stored to also store an ID. The only places right now are the various system tables, user privileges and default privileges. Update the existing Privilege descriptors using one of the lookup implementation options mentioned above in order to support the lookups by user IDs.
We’ll need to version gate all lookups to ensure the use of username until all the nodes have been migrated to 22.1. It ensures that if the ID field does not exist in UserPrivileges of all nodes, it doesn’t result in conflicts during lookups. Once all nodes are upgraded from 21.2 to 22.1, the version gate will allow privilege lookups using IDs.
| 22.1 & Mixed Cluster | 22.2 |
|---|---|
| User privilege lookups are done using username. ID field in 22.2 version nodes is ignored. | ID is acquired from system.user and user privilege lookups are done using IDs instead. |