docs/book/src/getting-started.md
This guide creates a sample project to show you how it works. This sample:
By following the Operator Pattern, it’s possible not only to provide all expected resources but also to manage them dynamically, programmatically, and at execution time. To illustrate this idea, imagine that someone accidentally changes a configuration or removes a resource; in this case, the operator could fix it without any human intervention.
</aside> <aside class="note" role="note"> <p class="note-title">Following Along vs Jumping Ahead</p>Note that most of this tutorial is generated from literate Go files that form a runnable project, and live in the book source directory: docs/book/src/getting-started/testdata/project.
</aside>First, create and navigate into a directory for your project. Then, initialize it using kubebuilder:
mkdir $GOPATH/memcached-operator
cd $GOPATH/memcached-operator
kubebuilder init --domain=example.com
If you initialize your project within GOPATH, the implicitly called go mod init will interpolate the module path for you.
Otherwise --repo=<module path> must be set.
Read the Go modules blogpost if unfamiliar with the module system.
</aside>Next, create the API that is responsible for deploying and managing Memcached instances on the cluster.
kubebuilder create api --group cache --version v1alpha1 --kind Memcached
This command's primary aim is to produce the Custom Resource (CR) and Custom Resource Definition (CRD) for the Memcached Kind.
It creates the API with the group cache.example.com and version v1alpha1, uniquely identifying the new CRD of the Memcached Kind.
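To see how the flags combine into the identifiers mentioned above, here is a minimal sketch that uses only the standard library. The `GroupVersionKind` type and `newGVK` helper are hypothetical stand-ins written for this illustration; they are not the types Kubebuilder or apimachinery use.

```go
package main

import "fmt"

// GroupVersionKind is a hypothetical stand-in for the identifier that
// Kubernetes uses to locate an API type; it is not the apimachinery type.
type GroupVersionKind struct {
	Group   string
	Version string
	Kind    string
}

// newGVK joins the --group value with the --domain value to form the
// full API group, as in cache + example.com.
func newGVK(group, domain, version, kind string) GroupVersionKind {
	return GroupVersionKind{Group: group + "." + domain, Version: version, Kind: kind}
}

// APIVersion returns the value used in the apiVersion field of a manifest.
func (g GroupVersionKind) APIVersion() string {
	return g.Group + "/" + g.Version
}

func main() {
	gvk := newGVK("cache", "example.com", "v1alpha1", "Memcached")
	fmt.Println("apiVersion:", gvk.APIVersion())
	fmt.Println("kind:", gvk.Kind)
}
```

The two printed values correspond to the `apiVersion` and `kind` fields you would expect at the top of a Memcached sample manifest.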
With Kubebuilder, you can define the APIs and objects that represent your solutions on Kubernetes.
While this example adds only one Kind of resource, you can have as many Groups and Kinds as necessary.
To make it easier to understand, think of CRDs as the definition of our custom Objects, while CRs are instances of them.
For more background, see Groups and Versions and Kinds, oh my!
</aside>Now, define the values that each instance of your Memcached resource on the cluster can assume. In this example, the configuration allows setting the number of instances with the following:
type MemcachedSpec struct {
...
// +kubebuilder:validation:Minimum=0
// +required
Size *int32 `json:"size,omitempty"`
}
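One likely reason for declaring `Size` as a pointer is that, combined with `omitempty`, it lets the serialized object distinguish "not set" from "explicitly zero". The following standard-library sketch shows that behavior; the `spec` type is a simplified stand-in that mirrors only the `Size` field, and `encode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// spec mirrors only the Size field of MemcachedSpec; it is a simplified
// stand-in for illustration.
type spec struct {
	Size *int32 `json:"size,omitempty"`
}

// encode marshals a spec and returns the JSON as a string.
func encode(s spec) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	zero := int32(0)
	// A nil pointer is omitted, so the field reads as "never set".
	fmt.Println(encode(spec{}))
	// A pointer to zero is kept, so an explicit zero survives serialization.
	fmt.Println(encode(spec{Size: &zero}))
}
```

With a plain `int32` field and `omitempty`, both cases would serialize identically, and a request for zero instances could not be told apart from a missing value.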
The controller also needs to track the status of the operations it performs to manage the Memcached CR(s). The status lets you check whether the Custom Resource was processed successfully or whether any errors were encountered, just as you would with any resource from the Kubernetes API.
// MemcachedStatus defines the observed state of Memcached
type MemcachedStatus struct {
// +listType=map
// +listMapKey=type
// +optional
Conditions []metav1.Condition `json:"conditions,omitempty"`
}
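The `+listType=map` and `+listMapKey=type` markers declare that the list holds at most one entry per condition `type`. A minimal standard-library sketch of that "one entry per type" update follows. The `Condition` struct is a reduced, hypothetical version of `metav1.Condition`, and `setCondition` is written for this illustration only; real projects typically use the `meta.SetStatusCondition` helper from apimachinery instead.

```go
package main

import "fmt"

// Condition is a reduced, hypothetical version of metav1.Condition.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// setCondition replaces the entry with the same Type or appends a new one,
// so the slice never holds two conditions of the same Type.
func setCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

func main() {
	var conds []Condition
	conds = setCondition(conds, Condition{Type: "Available", Status: "Unknown", Reason: "Reconciling"})
	conds = setCondition(conds, Condition{Type: "Available", Status: "True", Reason: "DeploymentReady"})
	for _, c := range conds {
		fmt.Println(c.Type, c.Status, c.Reason)
	}
}
```

The condition types and reasons used here are example values, not names taken from the sample project.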
Kubernetes has established conventions for reporting status, which is why this example uses Status Conditions. Your custom APIs and controllers should behave like Kubernetes resources and their controllers, following these standards to ensure a consistent and intuitive experience.
Please ensure that you review: Kubernetes API Conventions
</aside>Furthermore, validate the values set in your Custom Resource
to ensure that they are valid. To achieve this, use markers
such as +kubebuilder:validation:Minimum=0, which the Size field above uses.
Now, see our example fully completed.
{{#literatego ./getting-started/testdata/project/api/v1alpha1/memcached_types.go}}
To generate all required files:
Run make generate to create the DeepCopy implementations in api/v1alpha1/zz_generated.deepcopy.go.
Then, run make manifests to generate the CRD manifests under config/crd/bases and a sample for it under config/samples.
Both commands use controller-gen with different flags for code and manifest generation, respectively.
<details><summary><code>config/crd/bases/cache.example.com_memcacheds.yaml</code>: Our Memcached CRD</summary>{{#include ./getting-started/testdata/project/config/crd/bases/cache.example.com_memcacheds.yaml}}
The manifests located under the config/samples directory serve as examples of Custom Resources that can be applied to the cluster.
In this particular example, applying the given resource to the cluster causes the controller to create
a Deployment with a single replica (see size: 1).
{{#include ./getting-started/testdata/project/config/samples/cache_v1alpha1_memcached.yaml}}
In a simplified way, Kubernetes works by allowing you to declare the desired state of your system, and then its controllers continuously observe the cluster and take actions to ensure that the actual state matches the desired state. For your custom APIs and controllers, the process is similar. Remember, you are extending Kubernetes' behaviors and its APIs to fit your specific needs.
In our controller, we implement a reconciliation process.
Essentially, the reconciliation process functions as a loop, continuously checking conditions and performing necessary actions until the desired state is achieved. This process will keep running until all conditions in the system align with the desired state defined in our implementation.
Here's a pseudo-code example to illustrate this:
reconcile App {
// Check if a Deployment for the app exists, if not, create one
// If there is an error, then restart from the beginning of the reconcile
if err != nil {
return reconcile.Result{}, err
}
// Check if a Service for the app exists, if not, create one
// If there is an error, then restart from the beginning of the reconcile
if err != nil {
return reconcile.Result{}, err
}
// Look for Database CR/CRD
// Check the Database Deployment's replicas size
// If deployment.replicas size does not match cr.size, then update it
// Then, restart from the beginning of the reconcile. For example, by returning `reconcile.Result{Requeue: true}, nil`.
if deployment.replicas != cr.size {
return reconcile.Result{Requeue: true}, nil
}
...
// If at the end of the loop:
// Everything executed successfully, and the reconcile can stop
return reconcile.Result{}, nil
}
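The following are the possible return options from Reconcile, and what each one does:
return ctrl.Result{}, err // an error occurred: requeue the request and retry
return ctrl.Result{Requeue: true}, nil // no error, but requeue the request and retry
return ctrl.Result{}, nil // success: stop until the next watched change
return ctrl.Result{RequeueAfter: nextRun.Sub(r.Now())}, nil // reconcile again after the given duration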
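As a rough mental model of how these return values are interpreted, consider the standard-library sketch below. `Result`, `errExample`, and `nextAction` are hypothetical stand-ins written for this illustration, not controller-runtime types, and the real queueing behavior (for example, how backoff is computed) is simplified here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Result is a hypothetical stand-in for ctrl.Result.
type Result struct {
	Requeue      bool
	RequeueAfter time.Duration
}

// errExample is a placeholder error used only by this sketch.
var errExample = errors.New("example failure")

// nextAction describes what a simplified work queue would do with the
// outcome of one reconcile call. An error takes precedence over the Result.
func nextAction(res Result, err error) string {
	switch {
	case err != nil:
		return "retry with backoff"
	case res.RequeueAfter > 0:
		return "retry after " + res.RequeueAfter.String()
	case res.Requeue:
		return "retry with backoff"
	default:
		return "stop until the next watch event"
	}
}

func main() {
	fmt.Println(nextAction(Result{}, errExample))
	fmt.Println(nextAction(Result{Requeue: true}, nil))
	fmt.Println(nextAction(Result{RequeueAfter: 30 * time.Second}, nil))
	fmt.Println(nextAction(Result{}, nil))
}
```

Note that only the empty result with a nil error ends the cycle; every other combination schedules another reconcile.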
When you apply the sample Custom Resource (CR) to the cluster (for example, with kubectl apply -f config/samples/cache_v1alpha1_memcached.yaml),
ensure that the controller creates a Deployment for the Memcached image and that it matches the number of replicas you define in the CR.
To achieve this, first implement an operation that checks whether the Deployment for the Memcached instance already exists on the cluster. If it does not, the controller creates the Deployment accordingly. Therefore, our reconciliation process must include an operation to ensure that this desired state is consistently maintained. This operation would involve:
// Check if the deployment already exists, if not create a new one
found := &appsv1.Deployment{}
err = r.Get(ctx, types.NamespacedName{Name: memcached.Name, Namespace: memcached.Namespace}, found)
if err != nil && apierrors.IsNotFound(err) {
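// Define a new deployment; the helper also returns an error
// because setting the owner reference can fail
dep, err := r.deploymentForMemcached(memcached)
if err != nil {
log.Error(err, "Failed to define new Deployment resource for Memcached")
return ctrl.Result{}, err
}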
// Create the Deployment on the cluster
if err = r.Create(ctx, dep); err != nil {
log.Error(err, "Failed to create new Deployment",
"Deployment.Namespace", dep.Namespace, "Deployment.Name", dep.Name)
return ctrl.Result{}, err
}
...
}
Next, note that the deploymentForMemcached() function needs to define and return the Deployment that should be
created on the cluster. This function should construct the Deployment object with the necessary
specifications, as demonstrated in the following example:
dep := &appsv1.Deployment{
Spec: appsv1.DeploymentSpec{
Replicas: &replicas,
Template: corev1.PodTemplateSpec{
Spec: corev1.PodSpec{
Containers: []corev1.Container{{
Image: "memcached:1.6.26-alpine3.19",
Name: "memcached",
ImagePullPolicy: corev1.PullIfNotPresent,
Ports: []corev1.ContainerPort{{
ContainerPort: 11211,
Name: "memcached",
}},
Command: []string{"memcached", "--memory-limit=64", "-o", "modern", "-v"},
}},
},
},
},
}
Additionally, implement a mechanism to verify that the number of Memcached replicas on the cluster matches the desired count specified in the Custom Resource (CR). If there is a discrepancy, the reconciliation must update the cluster to ensure consistency. This means that whenever you create or update a CR of the Memcached Kind on the cluster, the controller will continuously reconcile the state until the actual number of replicas matches the desired count. The following example illustrates this process:
...
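// Spec.Size is a *int32: treat an unset value as zero replicas
var size int32
if memcached.Spec.Size != nil {
size = *memcached.Spec.Size
}
if found.Spec.Replicas == nil || *found.Spec.Replicas != size {
found.Spec.Replicas = &size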
if err = r.Update(ctx, found); err != nil {
log.Error(err, "Failed to update Deployment",
"Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
return ctrl.Result{}, err
}
...
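Because both the CR's `Size` and the Deployment's `Replicas` are pointers, the comparison needs to handle unset values. The standard-library sketch below isolates that decision. `desiredReplicas` and `needsUpdate` are hypothetical helpers, and treating an unset size as zero is an assumption of this sketch rather than a rule taken from the sample project.

```go
package main

import "fmt"

// desiredReplicas returns the replica count requested by the CR.
// Treating an unset (nil) size as zero is an assumption of this sketch.
func desiredReplicas(size *int32) int32 {
	if size == nil {
		return 0
	}
	return *size
}

// needsUpdate reports whether the Deployment's current replica count
// differs from the desired count. An unset current value always differs.
func needsUpdate(current *int32, desired int32) bool {
	return current == nil || *current != desired
}

func main() {
	one, three := int32(1), int32(3)

	// The Deployment runs 1 replica but the CR asks for 3: update needed.
	fmt.Println(needsUpdate(&one, desiredReplicas(&three)))

	// The Deployment already matches the CR: nothing to do.
	fmt.Println(needsUpdate(&three, desiredReplicas(&three)))
}
```

Keeping the decision in small pure functions like these also makes it easy to cover with unit tests that do not need a cluster.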
Now, you can review the complete controller responsible for managing Custom Resources of the Memcached Kind. This controller ensures that the desired state is maintained in the cluster, making sure that our Memcached instance continues running with the number of replicas specified by the users.
<details><summary><code>internal/controller/memcached_controller.go</code>: Our Controller Implementation </summary>{{#include ./getting-started/testdata/project/internal/controller/memcached_controller.go}}
The controller watches only the resources that matter to it. When one of those resources changes, the Watch triggers the controller's reconciliation loop, ensuring that the actual state of the resource matches the desired state as defined in the controller's logic.
Notice how you configure the Manager to monitor events such as the creation, update, or deletion of a Custom Resource (CR) of the Memcached kind, as well as any changes to the Deployment that the controller manages and owns:
// SetupWithManager sets up the controller with the Manager.
// The Deployment is also watched to ensure its
// desired state in the cluster.
func (r *MemcachedReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
// Watch the Memcached Custom Resource and trigger reconciliation whenever
// it is created, updated, or deleted
For(&cachev1alpha1.Memcached{}).
// Watch the Deployment managed by the Memcached controller. If any changes occur to the Deployment
// owned and managed by this controller, it triggers reconciliation, ensuring that the cluster
// state aligns with the desired state.
Owns(&appsv1.Deployment{}).
Complete(r)
}
The Controller should not watch every Deployment on the cluster and trigger the reconciliation loop for each of them. Instead, it should trigger reconciliation only when the specific Deployment running the Memcached instance changes. For example, if someone accidentally deletes the Deployment or changes the number of replicas, reconciliation is triggered so that the Deployment returns to the desired state.
The Manager knows which Deployment to observe because you set the ownerRef (Owner Reference):
if err := ctrl.SetControllerReference(memcached, dep, r.Scheme); err != nil {
return nil, err
}
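To picture how an owner reference narrows down which events matter, consider this standard-library sketch. The `object` type and `requestFor` function are hypothetical and greatly simplified; real controllers rely on `Owns()` and `ctrl.SetControllerReference` as shown above rather than on code like this.

```go
package main

import "fmt"

// object is a hypothetical, reduced view of a Kubernetes object.
type object struct {
	Kind      string
	Name      string
	OwnerKind string // kind of the controlling owner, empty if none
	OwnerName string // name of the controlling owner, empty if none
}

// requestFor returns the name of the Memcached that should be reconciled
// in response to an event on obj, or false if the event can be ignored.
func requestFor(obj object) (string, bool) {
	// Events on the primary resource reconcile that resource directly.
	if obj.Kind == "Memcached" {
		return obj.Name, true
	}
	// Events on an owned Deployment reconcile its owner.
	if obj.Kind == "Deployment" && obj.OwnerKind == "Memcached" {
		return obj.OwnerName, true
	}
	// Anything else, including unrelated Deployments, is ignored.
	return "", false
}

func main() {
	owned := object{Kind: "Deployment", Name: "memcached-sample", OwnerKind: "Memcached", OwnerName: "memcached-sample"}
	unrelated := object{Kind: "Deployment", Name: "some-other-app"}

	name, ok := requestFor(owned)
	fmt.Println(name, ok)

	name, ok = requestFor(unrelated)
	fmt.Println(name, ok)
}
```

The object names in `main` are example values chosen for this sketch.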
The ownerRef is crucial not only for allowing the controller to observe changes on the specific resource but also because, if you delete the Memcached Custom Resource (CR) from the cluster, all resources owned by it are automatically deleted as well, in a cascading event.
This ensures that when you remove the parent resource (the Memcached CR), Kubernetes also removes all associated resources (like Deployments, Services, etc.), maintaining a tidy and consistent cluster state.
For more information, see the Kubernetes documentation on Owners and Dependents.
</aside>It's important to ensure that the Controller has the necessary permissions (for example, to create, get, update, and list) on the resources it manages.
You configure the RBAC permissions via RBAC markers, which controller-gen uses to generate and update the
manifest files in config/rbac/. Define these markers on the Reconcile() method of each controller. The example
implements them as follows:
// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/finalizers,verbs=update
// +kubebuilder:rbac:groups=events.k8s.io,resources=events,verbs=create;patch
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch
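Each marker is a comma-separated list of `key=value` pairs, with multiple values separated by semicolons. To make that structure concrete, here is a small standard-library sketch that splits one marker into its parts. The `rule` type and `parseRBACMarker` function are hypothetical and written only for this illustration; this is not how controller-gen is implemented, and it ignores any marker options beyond the three shown.

```go
package main

import (
	"fmt"
	"strings"
)

// rbacPrefix is the comment prefix that introduces an RBAC marker.
const rbacPrefix = "// +kubebuilder:rbac:"

// rule holds the parts of one RBAC marker handled by this sketch.
type rule struct {
	Groups    []string
	Resources []string
	Verbs     []string
}

// parseRBACMarker splits a marker line into groups, resources, and verbs.
// It returns false if the line is not an RBAC marker.
func parseRBACMarker(line string) (rule, bool) {
	line = strings.TrimSpace(line)
	if !strings.HasPrefix(line, rbacPrefix) {
		return rule{}, false
	}
	var r rule
	for _, field := range strings.Split(strings.TrimPrefix(line, rbacPrefix), ",") {
		key, value, found := strings.Cut(field, "=")
		if !found {
			continue // skip fields without a value
		}
		values := strings.Split(value, ";")
		switch key {
		case "groups":
			r.Groups = values
		case "resources":
			r.Resources = values
		case "verbs":
			r.Verbs = values
		}
	}
	return r, true
}

func main() {
	marker := "// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete"
	if r, ok := parseRBACMarker(marker); ok {
		fmt.Println("groups:", r.Groups)
		fmt.Println("resources:", r.Resources)
		fmt.Println("verbs:", r.Verbs)
	}
}
```

Reading a marker this way maps naturally onto the `apiGroups`, `resources`, and `verbs` fields of a rule in a Role manifest.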
After making changes to the controller, run the make manifests command. This will prompt controller-gen
to refresh the files located under config/rbac.
{{#include ./getting-started/testdata/project/config/rbac/role.yaml}}
The Manager in the cmd/main.go file is responsible for managing the controllers in your application.
{{#include ./getting-started/testdata/project/cmd/main.go}}
Now that you have a better understanding of how to create your own API and controller,
scaffold the autoupdate.kubebuilder.io/v1-alpha plugin in this project
so that the project can be kept up to date with the scaffolding changes in the latest Kubebuilder releases
and, as a result, adopt improvements from the ecosystem.
kubebuilder edit --plugins="autoupdate/v1-alpha"
Inspect the file .github/workflows/auto-update.yml to see how it works.
At this point, you can validate the project on the cluster by following the steps defined in the Quick Start; see: Run It On the Cluster
Now that you have a better understanding, you might want to check out the Deploy Image Plugin. This plugin allows users to scaffold APIs/Controllers to deploy and manage an Operand (image) on the cluster. It provides scaffolds similar to the ones in this guide, along with additional features such as tests implemented for your controller.
</aside>