design/20200326.extensible-certificate-controller.md
The Certificate controller is one of the most commonly used controllers in the project. It represents the 'full lifecycle' of an x509 private key and certificate, including private key management and renewal.
Internally, the controller is implemented in a fairly straightforward way. We have a single controller which is responsible for:
spec.renewBefore)

The above list is non-exhaustive.
This document proposes an alternate way of structuring this controller, to improve reliability, testability and extensibility.
As the project is maturing, more requirements around this controller are starting to become apparent.
We have outstanding feature requests that are currently difficult to implement with the existing design:
This proposal aims to facilitate the above features, as well as make it easier to develop individual areas of the controller over time and continue to make improvements.
As noted above, the existing logic for the Certificates controller is a single loop which is responsible for reconciling all of the aforementioned areas of the Certificate resource.
Instead, the Certificates controller will be split into a number of distinct controllers, each
with its own well-defined responsibilities, that communicate via the Certificate resource's
status field.
* keymanager - generates and stores private keys when an issuance is required.
  Manages the status.nextPrivateKeySecretName field.
* requestmanager - creates and manages CertificateRequest resources for Certificates when
  an issuance is required.
* issuing - issues the signed x509 certificate and 'next private key' into the spec.secretName
  when the CertificateRequest is valid. Manages the status.revision field.
  Responsible for removing the Issuing condition.
* trigger - monitors the Secret resource and certificate.spec and adds the Issuing
  condition when issuance is required.

In order to facilitate communication and cooperation between these controllers, some API changes
are required to contain computed state to be consumed by other controllers.
These additional fields will be encompassed in the certificate.status stanza.
package v1alpha3

type CertificateStatus struct {
	// EXISTING FIELDS HERE

	// ADDITIONAL FIELDS

	// The current 'revision' of the certificate as issued.
	//
	// When a CertificateRequest resource is created, it will have the
	// `cert-manager.io/certificate-revision` annotation set to one greater
	// than the current value of this field.
	//
	// Upon issuance, this field will be set to the value of the annotation
	// on the CertificateRequest resource used to issue the certificate.
	//
	// Persisting the value on the CertificateRequest resource allows the
	// certificates controller to know whether a request is part of an old
	// issuance or if it is part of the ongoing revision's issuance by
	// checking if the revision value in the annotation is greater than this
	// field.
	//
	// A CertificateRequest with no annotation that is owned by a Certificate
	// resource will be automatically deleted in order to not complicate the
	// 'renewal required' issuance logic. This means that any requests in user
	// clusters will be deleted upon upgrade, and in cases where a re-issuance
	// is required, another will be created with the appropriate `revision`.
	//
	// +optional
	Revision *int `json:"revision,omitempty"`

	// The name of the Secret resource containing the private key to be used
	// for the next certificate iteration.
	// The keymanager controller will automatically set this field if the
	// `Issuing` condition is set to `True`.
	// It will automatically unset this field when Issuing is not set or False.
	// +optional
	NextPrivateKeySecretName *string `json:"nextPrivateKeySecretName,omitempty"`
}
type CertificateCondition string

const (
	// A condition added to Certificate resources when an issuance is required.
	// This condition will be automatically added and set to true if:
	// * No keypair data exists in the target Secret
	// * The data stored in the Secret cannot be decoded
	// * The private key and certificate do not have matching public keys
	// * If a CertificateRequest for the current revision exists and the
	//   certificate data stored in the Secret does not match the
	//   `status.certificate` on the CertificateRequest.
	// * If no CertificateRequest resource exists for the current revision,
	//   the options on the Certificate resource are compared against the
	//   x509 data in the Secret, similar to what's done in earlier versions.
	//   If there is a mismatch, an issuance is triggered.
	//
	// The final case above where no CertificateRequest resource exists is
	// essential for backwards compatibility, as older CertificateRequest
	// resources will not have a revision assigned so we cannot compare against
	// them. In these cases, we fall back to the behaviour we use today of
	// comparing the Certificate spec to the issued x509 certificate in the
	// Secret.
	//
	// This condition may also be added by external API consumers to trigger
	// a re-issuance manually for any other reason.
	//
	// It will be removed by the 'issuing' controller upon complete issuance.
	CertificateConditionIssuing CertificateCondition = "Issuing"
)
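The revision bookkeeping described in the field comments can be sketched as follows. This is a minimal illustration and not the project's implementation: the helper names `nextRevision`, `requestRevision` and `isOngoingIssuance` are hypothetical, and treating an unset revision as 0 when comparing is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
)

// revisionAnnotation is the annotation key named in the field comments.
const revisionAnnotation = "cert-manager.io/certificate-revision"

// nextRevision (hypothetical helper) returns the revision a new
// CertificateRequest should carry: status.revision + 1, or 1 if unset.
func nextRevision(current *int) int {
	if current == nil {
		return 1
	}
	return *current + 1
}

// requestRevision (hypothetical helper) reads the revision annotation.
// The second result is false when the annotation is missing or malformed,
// as it would be for requests created before an upgrade.
func requestRevision(annotations map[string]string) (int, bool) {
	raw, found := annotations[revisionAnnotation]
	if !found {
		return 0, false
	}
	rev, err := strconv.Atoi(raw)
	if err != nil {
		return 0, false
	}
	return rev, true
}

// isOngoingIssuance reports whether a request belongs to the issuance in
// progress, i.e. its annotated revision is greater than status.revision.
func isOngoingIssuance(annotations map[string]string, current *int) bool {
	rev, ok := requestRevision(annotations)
	if !ok {
		return false
	}
	cur := 0 // assumption: an unset status.revision is compared as 0
	if current != nil {
		cur = *current
	}
	return rev > cur
}

func main() {
	current := 3
	req := map[string]string{revisionAnnotation: strconv.Itoa(nextRevision(&current))}
	// prints: 1 4 true
	fmt.Println(nextRevision(nil), nextRevision(&current), isOngoingIssuance(req, &current))
}
```

A request without the annotation is reported as not belonging to the ongoing issuance, which matches the intent that such requests are cleaned up rather than compared.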
At the core of this proposal is the addition of the status.revision field (an integer).
This field indicates the current 'version' of the certificate.
When a certificate is first issued, the revision is set to 1.
We will add a new field to the CertificateStatus structure:

package v1alpha3

type CertificateStatus struct {
	...

	// The name of the Secret resource containing the private key to be used
	// for the next certificate iteration.
	// +optional
	NextPrivateKeySecretName *string `json:"nextPrivateKeySecretName,omitempty"`
}
Implementing this proposal will require a complete replacement of the current
certificates controller. Almost all areas of code will be replaced.
The 'compare a certificate.spec with an x509 certificate' logic can and
should be preserved to help maintain some semblance of backward compatibility
for users upgrading from previous releases.
The keymanager controller will be responsible for maintaining the
status.nextPrivateKeySecretName field and any 'next private key' Secret
resources that are owned by Certificates.
If the Issuing condition is True:

* If the status.nextPrivateKeySecretName field is not set:
  * If a 'next private key' Secret owned by the Certificate already exists, set the
    status.nextPrivateKeySecretName field.
    This handles cache inconsistencies when we observe Secret
    creation before updating status.nextPrivateKeySecretName.
  * Otherwise, generate a private key according to spec.privateKey and store it in a new Secret resource.
    Persist the name of the Secret as status.nextPrivateKeySecretName.
* If the status.nextPrivateKeySecretName field is set:
  * If the named Secret does not exist:
    * Generate a private key according to spec.privateKey and
      create a Secret resource with the given name.
  * If the stored private key does not match spec.privateKey:
    * Generate a new private key according to spec.privateKey and
      store it in the Secret resource.

If the Issuing condition is False or not set:

* Delete any owned Secret resources annotated with cert-manager.io/next-private-key: "true".
* Ensure status.nextPrivateKeySecretName is unset - we may want to
  consider not doing this in case a user has manually specified this field
  and pointed it at an 'un-owned' Secret. This depends on whether we want to
  support this as a mode of operation.

When creating a 'next private key' Secret resource, the
cert-manager.io/next-private-key: "true" annotation is added as well as an
OwnerReference to the Certificate resource.
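The keymanager behaviour can be read as a small decision function over observed state. The sketch below is illustrative only: the `keyState` fields, the `keyAction` values and `decideKeyAction` are hypothetical names, and real code would read these facts from the Certificate and Secret resources rather than from booleans.

```go
package main

import "fmt"

// keyAction names what the keymanager would do next (hypothetical values).
type keyAction string

const (
	actionNone            keyAction = "none"
	actionAdoptSecret     keyAction = "set status.nextPrivateKeySecretName to the existing owned Secret"
	actionGenerateSecret  keyAction = "generate a key, store it in a new Secret, persist its name"
	actionCreateNamed     keyAction = "generate a key and create the named Secret"
	actionRegenerateInUse keyAction = "generate a new key and store it in the existing Secret"
	actionCleanUp         keyAction = "delete owned next-private-key Secrets and unset the field"
)

// keyState is the observed state the decision depends on (hypothetical).
type keyState struct {
	Issuing           bool   // the Issuing condition is True
	NextSecretName    string // status.nextPrivateKeySecretName, "" if unset
	OwnedSecretExists bool   // an owned 'next private key' Secret was observed
	NamedSecretExists bool   // the Secret named by NextSecretName exists
	KeyMatchesSpec    bool   // the stored key satisfies spec.privateKey
}

// decideKeyAction maps observed state to a single next action.
func decideKeyAction(s keyState) keyAction {
	if !s.Issuing {
		if s.OwnedSecretExists || s.NextSecretName != "" {
			return actionCleanUp
		}
		return actionNone
	}
	if s.NextSecretName == "" {
		if s.OwnedSecretExists {
			// Covers the stale-cache case: the Secret was created but the
			// status update has not been observed yet.
			return actionAdoptSecret
		}
		return actionGenerateSecret
	}
	if !s.NamedSecretExists {
		return actionCreateNamed
	}
	if !s.KeyMatchesSpec {
		return actionRegenerateInUse
	}
	return actionNone
}

func main() {
	fmt.Println(decideKeyAction(keyState{Issuing: true}))
}
```

Keeping the decision separate from the API calls that carry it out is one way to make this controller straightforward to unit test.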
Private keys generated by the key manager will be encoded in PKCS#8 format for
ease of interoperability. The issuing controller will encode the resulting
key-pair into the format requested by the user once the request has been
completed.
The introduction of this dedicated controller also means we can implement
private key rotation when a certificate is re-issued/renewed.
This is a welcome new feature, but in some cases a user may want to pin the
private key used for a key-pair (as is the default and only supported behaviour
prior to implementing this design).
To continue to enable this, the spec.privateKey.rotationPolicy field
controls how the next private key should be sourced. It supports two values:
* Always: a new private key will be generated on every re-issuance.
  This includes renewals as well as changes to the spec.privateKey and other
  spec fields.
* Never: private keys will never be regenerated and must be provided by the
  user.

TODO: It may be better to have spec.privateKey.pinnedSecretName instead, to name a Secret that contains the private key to use. We could then require this key to be provided in a specific format. With the rotationPolicy design, a user must pre-create the Secret resource containing their private key ahead of time, which creates a conflict of ownership and confusion.
If the spec.privateKey options change during an issuance, the key will be
regenerated and the same Secret resource will be reused to store the updated
private key. Consumers reading this Secret (i.e. the requestmanager) must
compare the public key of the named 'next private key' with the public key in
the CertificateRequest being managed; if they do not match, the
CertificateRequest should be recreated.
The requestmanager is responsible for managing a CertificateRequest resource
for a Certificate. If a Certificate has the Issuing condition, it will ensure
that a CertificateRequest signed by the private key stored in the
status.nextPrivateKeySecretName Secret exists and matches the specification
for the Certificate in certificate.spec.
This is the controller that most closely resembles the bulk of the logic in the
existing certificates controller.
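It will behave as follows:

* If the Issuing condition is True:
  * If the status.nextPrivateKeySecretName field is not set:
    * Take no action until the keymanager has set it.
  * Compute the revision for the request as status.revision + 1 (or 1 if
    status.revision is not set).
  * Delete any CertificateRequest for that revision that does not match
    certificate.spec.
  * Delete any CertificateRequest for that revision that was not signed by the
    private key in status.nextPrivateKeySecretName.
  * If no CertificateRequest for that revision remains, create one based on
    certificate.spec.
* If the Issuing condition is False or not set:
  * Delete any owned CertificateRequest resources that do not belong to
    status.revision.

The issuing controller will copy the status.certificate and status.ca fields from
valid CertificateRequest resources and the tls.key from the
status.nextPrivateKeySecretName Secret resource into the spec.secretName Secret.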
It is responsible for encoding the private key and certificate data into the appropriate format.
Once the key-pair has been written to spec.secretName, it will set the
status.revision field to that of the CertificateRequest and remove the
Issuing status condition (in the same Update call).
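It will behave as follows:

* If the Issuing condition is True:
  * If the status.nextPrivateKeySecretName field is not set:
    * Take no action until the keymanager has set it.
  * Look up the CertificateRequest for revision status.revision + 1.
  * If no such CertificateRequest exists:
    * Do nothing (the requestmanager will handle this).
  * If the CertificateRequest was not signed by the private key in status.nextPrivateKeySecretName:
    * Do nothing (the requestmanager will handle this).
  * If the CertificateRequest does not match certificate.spec:
    * Do nothing (the requestmanager will handle this).
  * If the CertificateRequest has been issued:
    * Write the key-pair to spec.secretName.
    * Set status.revision to revision of the CertificateRequest.
    * Remove the Issuing status condition.
    * Unset status.lastFailureTime (if set).
  * If the CertificateRequest has failed:
    * Set status.lastFailureTime (if not equal to the failure time of the CertificateRequest).
    * Set the Issuing status condition to False with reason explaining when
      the request will be retried.
* If the Issuing condition is False or not set:
  * Do nothing.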
The trigger controller is responsible for observing the state of the currently
issued spec.secretName and the rest of the certificate.spec fields to
determine whether a re-issuance is required.
It triggers re-issuance by adding the Issuing status condition when a new
certificate is required.
These conditions will cause a request to be triggered:

* The Secret named in spec.secretName does not exist, or contains data that cannot be decoded.
* The issuer recorded for the stored certificate does not match spec.issuerRef.
* The public keys of tls.key and tls.crt do not match.
* A CertificateRequest for status.revision exists and the
  current requested certificate.spec does not match the options on the CSR.
* The certificate's NotAfter is within spec.renewBefore of
  the current time.

Additionally, the controller is responsible for implementing a 'back-off' if
CertificateRequest resources persistently fail to complete.
For now, this will be a continuation of the current behaviour of backing off
by 1 hour after a request fails.
Even if any of the above conditions are true, a request will not be triggered
unless the current time is at least 1 hour after the status.lastFailureTime,
if set.
In future we can extend this logic to back-off exponentially by storing a
longer history of CertificateRequest resource's failure times, but this is out
of scope of this proposal.
Another actor may choose to manually trigger an issuance by setting the
Issuing condition themselves (e.g. with a cert-manager CLI tool, or their own
controller). The trigger controller will not interfere in this case and
will take no extra action.