docs/RFCS/20170628_web_session_login.md
The Cockroach Server currently provides a number of HTTP endpoints (the Admin UI and /debug endpoints) which return data about the cluster; however, these endpoints are not secured by any sort of authorization or authentication mechanism. This RFC proposes an authentication method which will require each incoming request to be associated with a login session. A login session will be created using a username/password combination. Incoming requests which are not associated with a valid session will be rejected.
Changes to the CockroachDB server will be the addition of a sessions table, the addition of a new RPC endpoint to allow the Admin UI to create a new session, and a modification of our current RPC gateway server to enforce the new session requirement for incoming requests. Also included in this proposal is a method for preventing CSRF (Cross-site request forgery) attacks.
The AdminUI will be modified to require that users create a new session by "signing in" before the UI becomes usable. Outgoing requests will be modified slightly to utilize the new CSRF mitigation method.
This RFC does not propose any new authorization mechanisms; it only authenticates sessions, without enforcing any specific permissions on different users.
Long-term goals for the UI include many scenarios in which users can actually modify CockroachDB settings through the UI. Proper authentication and authorization are a hard requirement for this; not all users of the system should have access to modify it, and any modifications made must be auditable.
Even for the read-only scenarios currently enabled by the Admin UI, we are not properly enforcing permissions; for example, all Admin UI users can see the entire schema of the cluster, while database users can be restricted to seeing a subset of databases and tables.
The design has the following components:
A "web_sessions" table which holds information about currently valid user sessions.
A "login" RPC accessible from the Admin UI over HTTP. This RPC is called with a username/password pair, which are validated by the server; if valid, a new session is created and a cookie with a session ID is returned with the response.
All Admin UI methods, other than the new login request, will be modified to require a valid session cookie be sent with incoming http requests. This will be done at the service level before dispatching a specific method; the session's username will be added to the method context if the session is valid.
The Admin UI will be modified to require a logged-in session before allowing users to navigate to the current pages. If the user is not logged in, a dialog will be displayed for the user to input a username and password.
A CSRF mitigation method will be added to both the client and the server.
A new metadata table will be created to hold system sessions:
CREATE TABLE system.web_sessions (
    id SERIAL PRIMARY KEY,
    "hashedSecret" BYTES NOT NULL,
    username STRING NOT NULL,
    "createdAt" TIMESTAMP NOT NULL DEFAULT now(),
    "expiresAt" TIMESTAMP NOT NULL,
    "revokedAt" TIMESTAMP,
    "lastUsedAt" TIMESTAMP NOT NULL DEFAULT now(),
    "auditInfo" STRING,
    INDEX ("expiresAt"),
    INDEX ("createdAt")
);
id is the identifier of the session, which is of type SERIAL (equivalent to
INT DEFAULT unique_rowid()).
hashedSecret is the hash of a cryptographically random byte array generated
and shared only with the original creator of the session. The server does not
store the original bytes generated, but rather hashes the generated value and
stores the hash. The server requires incoming requests to have the original
version of the secret for the provided session id. This allows the session's id
to be used in auditing logs which are readable by non-root users; if we did not
require this secret, any user with access to auditing logs would be able to
impersonate sessions.
username stores the username used to create a session.
Each session records four important timestamps:

createdAt indicates when the session was created. When a new session is
created, this is set to the current time.

expiresAt indicates when the session will expire. When a new session is
created, this is set to (current time + server.web_session_timeout);
web_session_timeout is a new server setting (a duration) added as part of this
RFC.

revokedAt is a possibly null timestamp indicating when the session was
explicitly revoked. If it has not been revoked, this field is null. When a new
session is created, this field is null.

lastUsedAt is a field being used to track whether a session is active. It
will be periodically refreshed if a session continues to be used; this can help
give an impression of whether a session is still being actively used after it is
created. When a new session is created, this is set to the current time.

The auditInfo string field contains an optional JSON-formatted string
object containing other information relevant for auditing purposes. Examples of
the type of information that may be included:
Secondary indexes are added on the following fields:

expiresAt, which quickly allows an admin to query for possibly active
sessions.

createdAt, which allows querying sessions in order of creation. This should
be useful for auditing purposes.

The new table will be added using a backwards-compatible migration, in the same fashion as the jobs and settings tables. This is trivially possible because no existing code interacts with this table.

The migration system, along with instructions for adding a new table, has an
entry point in pkg/migrations/migrations.go.
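The hashedSecret scheme described above can be illustrated with a minimal Go sketch. It assumes SHA-256 as the hash function and a 16-byte secret; the RFC does not specify either, and the function names are hypothetical.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// secretLength is an assumed size for the random secret; the RFC does not fix one.
const secretLength = 16

// newSessionSecret returns a fresh random secret, which would be handed only to
// the client, and its SHA-256 hash, which would be stored in "hashedSecret".
func newSessionSecret() ([]byte, []byte, error) {
	secret := make([]byte, secretLength)
	if _, err := rand.Read(secret); err != nil {
		return nil, nil, err
	}
	sum := sha256.Sum256(secret)
	return secret, sum[:], nil
}

// secretMatches reports whether the secret presented by a client hashes to the
// stored value. The comparison is constant-time to avoid leaking timing data.
func secretMatches(presented, storedHash []byte) bool {
	sum := sha256.Sum256(presented)
	return subtle.ConstantTimeCompare(sum[:], storedHash) == 1
}

func main() {
	secret, hashed, err := newSessionSecret()
	if err != nil {
		panic(err)
	}
	fmt.Println(secretMatches(secret, hashed))          // true
	fmt.Println(secretMatches([]byte("guess"), hashed)) // false
}
```

Because only the hash is stored, a reader of the table or of audit logs containing the session id cannot reconstruct the secret needed to present a valid session.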
Sessions will be created by calling a new HTTP endpoint "UserLogin". This new method will be on its own new service "Authentication", separate from the existing services (Status/Admin), for reasons that are explained in the Session Enforcement section.
message UserLoginRequest {
string username = 1;
string password = 2;
}
message UserLoginResponse {
// No information to return.
}
message UserLogoutRequest {
// No information needed.
}
message UserLogoutResponse {
// No information to return.
}
service Authentication {
rpc UserLogin(UserLoginRequest) returns (UserLoginResponse) {
// ...
}
rpc UserLogout(UserLogoutRequest) returns (UserLogoutResponse) {
// ...
}
}
When called, UserLogin will check the provided username/password pair against
the system.users table, which stores usernames along with hashed versions of
passwords.
If the username/password is valid, a new entry is inserted into the
web_sessions table and a 200 "OK" response is returned to the client. If the
username or password is invalid, a 401 Unauthorized response is returned.
For successful login responses a new cookie "session" is created containing the ID and secret of the newly created session. This is used to associate future requests from the client with the session.
Cookie headers can be added to the response using GRPC's SetHeader method. These are attached as grpc metadata, which is then converted by GRPC Gateway into the appropriate HTTP response headers.
The "session" cookie is marked as "httponly" and is thus not accessible from javascript; this is a defense-in-depth measure to prevent session exfiltration by XSS attacks. The cookie is also marked as "secure", which prevents the cookie from being sent over an unsecured http connection.
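As a rough illustration of the cookie attributes described above, the following Go sketch builds the "session" cookie with net/http. The "<id>.<base64 secret>" value encoding and the function name are assumptions made for this example; the RFC only states that the cookie contains the ID and secret.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
)

// newSessionCookie builds the "session" cookie for a newly created session.
// The "<id>.<base64url secret>" encoding is a hypothetical format.
func newSessionCookie(id int64, secret []byte) *http.Cookie {
	return &http.Cookie{
		Name:     "session",
		Value:    fmt.Sprintf("%d.%s", id, base64.RawURLEncoding.EncodeToString(secret)),
		Path:     "/",
		HttpOnly: true, // not readable from JavaScript
		Secure:   true, // only sent over HTTPS
	}
}

func main() {
	c := newSessionCookie(42, []byte("secret"))
	// c.String() yields a value suitable for a Set-Cookie response header.
	fmt.Println(c.String())
}
```

The serialized string is the kind of value that could be attached as header metadata and surfaced by the gateway as a Set-Cookie header.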
The UserLogout method, when called successfully, will revoke the current session
by setting its revokedAt field to the current time. It will then return the
appropriate headers to delete the "session" cookie.
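The two logout steps, revocation and cookie deletion, can be sketched in Go as follows. The session struct is a simplified in-memory stand-in for a web_sessions row, and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// session is a simplified stand-in for a row in web_sessions.
type session struct {
	RevokedAt *time.Time // nil while the session has not been revoked
}

// logout marks the session as revoked and returns a cookie that instructs the
// browser to delete "session". A negative MaxAge is serialized as Max-Age=0.
func logout(s *session) *http.Cookie {
	now := time.Now()
	s.RevokedAt = &now
	return &http.Cookie{
		Name:     "session",
		Value:    "",
		Path:     "/",
		MaxAge:   -1,
		HttpOnly: true,
		Secure:   true,
	}
}

func main() {
	s := &session{}
	c := logout(s)
	fmt.Println(s.RevokedAt != nil, c.String())
}
```

Revoking the row on the server matters even though the cookie is deleted, since a copy of the cookie held elsewhere would otherwise remain valid until expiresAt.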
Session enforcement is handled by adding a new muxing wrapper around the existing "gwMux" used by grpc gateway services. Notably, the new wrapper will only be added for the existing services; it will not be added for the new Authentication service, because the UserLogin method must be accessible without a session.
The new mux will enforce that all incoming connections have a valid session token. A valid request will pass all of the following checks:
revokedAt timestamp is null.

expiresAt timestamp is in the future.

If any of the above criteria are not true, the incoming request is rejected with a 401 Unauthorized message.
If the session does pass the above criteria, then the username and session ID are added to the context.Context before passing the call to gwMux. These values can later be accessed in specific methods by accessing them from the context.
If the session's lastUsedAt field is older than the new system setting
server.web_session_last_used_refresh, the session record will be updated
to set its lastUsedAt value to the current time.
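The enforcement wrapper can be sketched as ordinary net/http middleware. This is only an illustration under stated assumptions: the sessionLookup function stands in for parsing the cookie, querying web_sessions, and verifying the hashed secret, and all type and function names here are hypothetical rather than taken from the CockroachDB code.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// session is a simplified stand-in for a row in web_sessions.
type session struct {
	Username  string
	ExpiresAt time.Time
	RevokedAt *time.Time
}

// sessionLookup resolves a "session" cookie value to a session (hypothetical).
type sessionLookup func(cookieValue string) (*session, bool)

// usernameKey is the context key under which the username is stored.
type usernameKey struct{}

// requireSession rejects requests without a valid session with a 401, and
// otherwise adds the username to the request context before calling next.
func requireSession(lookup sessionLookup, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		c, err := r.Cookie("session")
		if err != nil {
			http.Error(w, "no session", http.StatusUnauthorized)
			return
		}
		s, ok := lookup(c.Value)
		if !ok || s.RevokedAt != nil || !s.ExpiresAt.After(time.Now()) {
			http.Error(w, "invalid session", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), usernameKey{}, s.Username)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// demoHandler wraps a trivial handler using an in-memory session map.
func demoHandler() http.Handler {
	sessions := map[string]*session{
		"valid":   {Username: "alice", ExpiresAt: time.Now().Add(time.Hour)},
		"expired": {Username: "bob", ExpiresAt: time.Now().Add(-time.Hour)},
	}
	lookup := func(v string) (*session, bool) { s, ok := sessions[v]; return s, ok }
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "hello %v", r.Context().Value(usernameKey{}))
	})
	return requireSession(lookup, inner)
}

// statusFor sends a request with the given cookie value and returns the status.
func statusFor(h http.Handler, cookieValue string) int {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if cookieValue != "" {
		req.AddCookie(&http.Cookie{Name: "session", Value: cookieValue})
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := demoHandler()
	fmt.Println(statusFor(h, "valid"), statusFor(h, "expired"), statusFor(h, "")) // 200 401 401
}
```

Placing the check in a wrapper rather than in each method keeps the Authentication service outside of it, so that UserLogin stays reachable without a session.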
CSRF is enforced using the "Double-submit cookie" method. This has two parts:
Any request that does not have a matching "csrf-token" cookie and "x-csrf-token" header is rejected with a 401 Unauthorized error.
In order for this to work, the "csrf-token" cookie needs to be created at some
point. This is done on initial page load, when the static assets are retrieved
from the server: the http.FileServer currently used is wrapped with an outer
handler that sets the cookie. The outer handler will only set the cookie
for the main entry point of the website, index.html. The cookie contains a
cryptographically random generated string. The generated value does not need to
be stored anywhere on the server side. The cookie should have the secure
attribute.
CSRF involves an attacker's third-party website constructing a request to your website, where the request contains some sort of malicious action. If the user is logged in to your website, their valid session cookie will be sent with the request and the action will be authorized. "Double-cookie" prevents this by requiring the requester to set the "x-csrf-token" header based on a domain-specific cookie; the third-party website cannot access that cookie, so it cannot properly set the header. Simply explained, "Double cookie" works by requiring the sender to prove that it can read cookies from a trusted origin.
The random value must be set on the server because it must be cryptographically random; the client application is JavaScript and does not have access to a reliable random source, so attackers could possibly guess any random values generated.
CSRF protection will be added to all GRPC gateway services, including the new Authentication service (which does not itself require authentication, but will require CSRF protection).
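The double-submit check can be illustrated with a small Go sketch. It assumes a 16-byte token encoded as hex, which the RFC does not specify, and the function names are hypothetical. Note that the "csrf-token" cookie is deliberately not marked HttpOnly, since the client script has to read it in order to copy it into the header.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newCSRFCookie creates the "csrf-token" cookie with a random value.
// The 16-byte size and hex encoding are assumptions for this sketch.
func newCSRFCookie() (*http.Cookie, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return nil, err
	}
	return &http.Cookie{
		Name:   "csrf-token",
		Value:  hex.EncodeToString(buf),
		Path:   "/",
		Secure: true,
	}, nil
}

// requireCSRF rejects any request whose "csrf-token" cookie and
// "x-csrf-token" header are missing or do not match.
func requireCSRF(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		c, err := r.Cookie("csrf-token")
		header := r.Header.Get("x-csrf-token")
		if err != nil || header == "" ||
			subtle.ConstantTimeCompare([]byte(c.Value), []byte(header)) != 1 {
			http.Error(w, "CSRF token mismatch", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends a request with the given cookie and header values through
// requireCSRF and returns the resulting status code.
func statusFor(cookieValue, headerValue string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/", nil)
	if cookieValue != "" {
		req.AddCookie(&http.Cookie{Name: "csrf-token", Value: cookieValue})
	}
	if headerValue != "" {
		req.Header.Set("x-csrf-token", headerValue)
	}
	rec := httptest.NewRecorder()
	requireCSRF(ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	c, err := newCSRFCookie()
	if err != nil {
		panic(err)
	}
	fmt.Println(statusFor(c.Value, c.Value), statusFor(c.Value, "forged")) // 200 401
}
```

The server never stores the token: it only checks that the two copies sent by the client agree, which a third-party site cannot arrange because it cannot read the cookie.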
A small number of "debug" pages are not part of the Admin UI application, but are instead provided as HTML generated directly from Go. These endpoints can also be moved underneath the session enforcement wrapper to require a valid login session.
However, we will need to redirect users to the Admin UI in order for them to actually create a session; therefore, the wrapper for these methods will not return a "401" error response, but rather a redirection to the Admin UI.
Because these pages are being migrated to the Admin UI proper, there is no pressing need to improve the user experience beyond the redirect.
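A possible shape for the debug-page wrapper is sketched below. The redirect target "/", the 307 status code, the "/debug/example" path, and the hasSession predicate are all assumptions for illustration; the RFC only specifies that a redirect to the Admin UI replaces the 401.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// loginPath is a hypothetical Admin UI location to which users are sent.
const loginPath = "/"

// redirectUnlessLoggedIn wraps a debug handler. hasSession is a hypothetical
// predicate standing in for the full session validity check.
func redirectUnlessLoggedIn(hasSession func(*http.Request) bool, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !hasSession(r) {
			http.Redirect(w, r, loginPath, http.StatusTemporaryRedirect)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// probe requests a wrapped debug page and returns the status code and the
// Location header of the response.
func probe(loggedIn bool) (int, string) {
	debug := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "debug output")
	})
	h := redirectUnlessLoggedIn(func(*http.Request) bool { return loggedIn }, debug)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/debug/example", nil))
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	fmt.Println(probe(false))
	fmt.Println(probe(true))
}
```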
If the cluster is running explicitly in "insecure" mode, the following changes will apply:
Insecure mode is trivially indicated to the client by serving from "http"
addresses instead of "https". The client can thus check the JavaScript variable
window.location.protocol and adjust its behavior accordingly.
The front-end will be modified to have a "logged in" and "logged out" state. The logged-in state will be equivalent to the current behavior of the UI. The "logged out" state displays a full-screen prompt for the user to enter credentials.
This mode will be enforced with the following mechanism:
The Login Dialog is a full-screen component that prompts the user for a username and password.
This component will be displayed by the top-level "Layout" element if the "logged in" value is not set in the Admin UI State. While the dialog is displayed, no other navigation controls are accessible.
Upon successful login, the route requested by the user before seeing the login prompt will be rendered.
After login, all pages will display a "log out" option in the top right corner of the screen.
In order to perform CSRF mitigation properly, all outgoing requests will need to
read the value of the "csrf-token" cookie and send it back to the server in the
"x-csrf-token" HTTP header. This can be added in a central location, in the method
timeoutFetch.
The primary drawback of this login system is that it requires a session lookup for every request. If that proves expensive, the cost of this could be significantly mitigated by adding a short-term cache for sessions on each node.
One alternative considered was a "Stateless Session", where there would be no sessions table, but instead session information would be encoded using a "JSON Web Token" (JWT). This is a signed object returned to the user instead of a session token; the object contains the username, csrf token, and session id. Using JWT, servers would not need to consult a sessions table for incoming requests, but instead would simply need to verify the signature on the token.
The major issue with JWT is that it does not provide a way to revoke login sessions; to do this, we would need to store a blocklist of revoked session IDs, which removes much of the advantage of not having the sessions table in the first place.
JWT would also require all machines on the cluster to maintain a shared secret for signing the tokens.
The simplest option may be HTTP Basic authorization, wherein the browser allows the user to "log in", and sends a username/password combination with every request to the server. This requires no implementation on our part beyond verifying username/password, and in combination with HTTPS there is very little risk of being compromised by an attacker.
However, this has two main drawbacks:
The "401 Unauthorized" response seems to be the most semantically correct HTTP code to return for unauthenticated attempts to access API methods. However, returning a 401 from a request seems to cause browsers to display a username/password dialog for HTTP Basic auth, which we do not want to happen.
We can try returning 401 responses without the WWW-Authenticate header, but that seems to be in violation of HTTP standards. We could try sending a custom value in the WWW-Authenticate field (such as "custom"), but it's also not clear that this will prevent the browser from presenting a login dialog pop-up.
If 401 proves to be problematic in this way, we will instead send 403 Forbidden in all cases where 401 is used in this RFC.