Back to Cert Manager

Design: More Modules

design/20230302.gomod.md

1.20.218.1 KB
Original Source

Design: More Modules

NB: This design doc follows from a Hackathon by @SgtCoDFish and @inteon.

The intention here is to describe what we did and what we discovered, with an eye to seeking consensus and merging upstream.

In Short

Assumptions / Axioms

  • It's hard or impossible to upgrade our dependencies months after a release
  • We won't change our conservative approach to backports
  • The fewer dependencies a go module has, the easier it is to maintain
  • It's OK if people can't import our binaries as go modules

Solution

  • Create a go module for each binary
  • Create go modules for integration and e2e tests
  • Utilise local replace statements where possible
    • i.e. Binaries have a local replace for the core cert-manager module
    • This breaks imports of those binaries but means changes only require one PR
  • We call github.com/cert-manager/cert-manager the core module
  • We call all other new modules secondary modules

Pros

  • Each binary can be patched independently

    • Side effects of a patch are limited to one binary when only that binary has the dependency
      • For example, consider updating Helm before go module proliferation
      • Updating the Helm version alone won't affect anything which doesn't import Helm
      • But: Updating Helm also brings in Helm's updated dependencies which would affect other binaries
      • E.g., we and Helm depend on the k8s libraries
      • That means that bumping Helm forces a bump of all k8s APIs for all binaries
      • With proliferation, bumping Helm would still bump the k8s libraries - but only for cmctl!
    • This includes forking a dependency or needing to replace one
    • In summary: Proliferation gives us more control over our own destiny
  • Core go.mod dependencies are reduced

    • All importers of github.com/cert-manager/cert-manager have fewer transitive dependencies
    • Reduced chance of dependency conflicts for all importers
      • Including us - in our subprojects!
    • Many people need to import cert-manager! (pkg/apis, etc).
    • We might split things more in the future - this is a good first step
  • Lays the groundwork for further splitting out binaries / packages

    • This is the start of what we'll do if we want cmctl to be its own repo
    • Or splitting pkg/apis into a separate module
    • Or splitting issuers into a module (to isolate cloud SDK dependencies)

Cons

  • Using local replace statements for binaries will break external importers of those binaries

    • We assume this won't be too destructive in most cases (since we don't see many importers of those binaries)
    • If we need to make binaries importable again, we can change them to use regular import statements
    • That would require two PRs in the event that we need to change the secondary module and the core module at the same time
    • If the secondary module would've ended up in a separate repo anyway (e.g., cmctl) we'd have done this eventually
  • Increased complexity in working with the codebase

    • E.g., go test ./... no longer tests everything, since it won't recurse into modules
    • This can be alleviated with some Makefile work - make test can still test everything
    • Go Workspaces (go.work) can also help in development environments to make things simpler

Longer Form Problem Statement

In short: Some of our dependencies are complex which makes them hard to upgrade in already-released versions

Upgrading the dependencies of even simple Go projects can be tricky and for a more complex project like cert-manager it can be impossible to upgrade dependencies for older releases while satisfying all constraints that we place on ourselves as maintainers.

In our case, these constraints are to:

  1. Minimise / eliminate CVE reports for any supported release of cert-manager
  2. Be conservative about upgrades, and avoid major version bumps in already-released software

Since we have one go.mod file for all of our built binaries, it's not possible for us to be selective about upgrades, either. If, say, only the controller component were to report as having a critical vulnerability, we'd have no way of fixing only that one vulnerability while leaving everything else untouched.

Essentially, our current project layout forces us to make difficult choices whenever we need to upgrade things.

Problem Example

In short: An example of how upgrades can be particularly difficult in some cases, with no good options.

At the time of writing, cert-manager 1.10 is still in support and depends on Helm because it's imported by cmctl (and only cmctl). We can see the dependency in go.mod.

There's a vulnerability reported for Helm v3.10.3 ([1]) which we'd like to patch, but the only version with a fix available is Helm v3.11.1.

Between Helm 3.10 and 3.11, several of Helm's dependencies were upgraded, and crucially Helm has some of the same dependencies that cert-manager does. That means that we can't easily just upgrade Helm.

Running go get -u helm.sh/helm/v3 produces 56 different upgrades of other dependencies. Notably, it bumps our Kubernetes dependencies from v0.25.2 to v0.26.0 but there are several other changes.

(NB: Helm is just an example here and we could have problems with any package)

Proposed Solution: Go Module Proliferation

In short: Add several new go.mod files so individual components can be patched independently

We can create several new Go modules so that each binary we build can have distinct dependencies. This would mean that cmctl having a dependency on Helm would only affect cmctl and wouldn't force us to change any of the other components we build in order to patch a Helm vulnerability.

Plus, where we have testing-only dependencies (e.g., for integration or end-to-end tests) we could create a test module so that those test dependencies don't pollute the main go.mod.

Terminology

Currently cert-manager has one module name: github.com/cert-manager/cert-manager. This import path is widely used and we can't break imports of this module. We'll call this the "core" module.

This proposal also introduces several new modules which depend on the core module. We'll call these "secondary" modules.

Solution Detail

First, we'll add a go.mod file for each binary we ship under cmd/ - acmesolver, cainjector, controller, ctl and webhook.

These new modules should resolve to having identical dependencies to what they currently have (i.e. we shouldn't bump any versions at this stage).

text
cmd
├── acmesolver
│   ├── ...
│   ├── go.mod
│   ├── go.sum
│   ├── main.go
├── cainjector
│   ├── ...
│   ├── go.mod
│   ├── go.sum
│   └── main.go
├── controller
│   ├── ...
│   ├── go.mod
│   ├── go.sum
│   └── main.go
├── ctl
│   ├── go.mod
│   ├── go.sum
│   ├── main.go
│   └── ...
└── webhook
    ├── go.mod
    ├── go.sum
    ├── main.go
    └── ...

These changes will also require tweaks to how modules are built and tested, which will be done in our Makefile.

After these changes, running go mod tidy on the core cert-manager module should clean a lot of dependencies but will leave many SDKs since they're depended on by issuer logic which is in pkg/.

As part of this process there will be several import paths which will need to be fixed, but nothing should break.

Workflow Example: Changing a Binary

NB: See Importing cert-manager / Development Experience below for an exploration of the problems we face here and reasoning behind the proposed solution.

As an example of the kind of change being discussed, imagine adding a new field to our CRDs along with a feature gate. This would require changes both to at least one secondary module (e.g., the controller) and to the core cert-manager module.

In order to avoid having to make two PRs for this kind of change we propose to explicitly state that any external import of the new modules under cmd is not supported. By breaking this kind of external import, we can use the replace directive in the new go.mod files for each of the binaries to refer to the cert-manager package in the same repository.

This means that every change to pkg/ will automatically be picked up by all of the binaries that we build and test in CI.

An example of the replace directive is given below:

gomod
module github.com/cert-manager/cert-manager/controller-binary

go 1.19

replace github.com/cert-manager/cert-manager => ../../

require (
    github.com/cert-manager/cert-manager v0.0.0-00010101000000-000000000000
	...
)

To be clear: using replace directives like this will break anyone who tries to import the github.com/cert-manager/cert-manager/controller-binary module or anyone who was previously importing github.com/cert-manager/cert-manager/cmd/controller before this proposal.

Potential Issues

Importing cert-manager / Development Experience

In short: Module replacements help developers but aren't respected by imports, meaning some changes could need two PRs or we'd have to break anyone importing certain modules

Useful Reference: It helps to read this StackOverflow comment to better understand the options we have

The simplest development experience when working with multiple Go modules at once is to use either the replace directive in go.mod or the use directive in go.work to point to local versions of a module. This allows both modules to be developed in parallel on a local machine.

For modules which we don't think should ever be imported by third parties, replace directives would work so that those modules always use the version of cert-manager which is at the same commit as those modules.

For example we could look at the controller component which would depend on the core cert-manager module. Its go.mod might look like the example given above under "Workflow Example: Changing a Binary".

An issue with this approach is that the replace statement wouldn't be respected if anyone imports the controller module from a 3rd party project. Instead, that 3rd party would see an error relating to an unknown version of cert-manager.

For this example involving cmd/controller it might well be acceptable for us to break 3rd party imports but for other modules that might not be reasonable. In that case, we'll always have a fallback; using a 'regular' import of the core module.

This would mean that we create two PRs for a change; the first changes the core module, and the second updates the secondary module to import the new core module version created by the previous PR.

UPDATE: As we implemented this design, it was decided that we didn't want to break imports of cmctl because it was used in several other cert-manager subprojects, so cmctl uses the approach described above.

Potential Solution for Developer Experience: Dynamic go.work

We could introduce a make target which generates one or more go.work files locally to point all modules at local development versions. This doesn't help with having to raise two PRs for a change, but it does help minimise the burden of testing changes locally.

This could mean that users won't notice if they forget to bump their go.mod files to point at a new release of the core module - but tests should fail in CI to alert them of this problem.

Running Tests

In short: Multiple modules in one repo break go test ./...

Part of the migration to Make enabled the use of go test for testing. Under the hood, our make targets essentially use go test themselves.

The issue is that go test won't recurse into other modules. If we make cmd/controller a separate module, then go test ./pkg/... ./cmd/... won't run any of the tests in cmd/controller. Any existing uses of go test ./... which intend to test everything will silently start to not test everything.

This can be mitigated by leaning more heavily on make; we can have make test run the tests for every module. It's a shame to lose the ability to test everything with go test in this way, but the tradeoff ultimately seems worth it.

Test Modules

In short: The test/ directory could (should) be a module but part of it is imported elsewhere.

The test/ directory at the root of the cert-manager repo today has several purposes.

test/unit provides a library which is imported by several other packages, to aid with setting up data for unit tests. For example, pkg/controller/certificatesigningrequests/ca/ca_test.go imports the test/unit/gen package to aid with generating test data. test/internal has similar content to test/unit, but focusing more on utility functions.

test/integration and test/e2e implement actual tests which are designed to run against cert-manager but which don't fit under the category of unit tests. These test directories have external dependencies including on cloudflare-go and the Hashicorp Vault API along with imports for the cmctl and cert-manager webhook code.

Essentially, the test/ directory has both actual tests and test utility code. The actual tests import several areas of cert-manager which become external modules under these proposals, and the utility code is imported by the core cert-manager module.

Solution: Split Test Code

Since there are two types of code in test/, we can split it.

There are known external importers of test/unit/ which means it's difficult to move that without breaking people.

As such, we could move test/e2e and test/integration or we could make them both independent modules and keep them where they are.

The diff for the main repository go.mod after separating out the tests is presented in footnote [2].

Increased Time to Patch Everything

Having multiple go.mod files will mean that when we share a dependency across many components (such as the Kubernetes libraries) we'll have to update multiple files rather than just one. Alternatively, if we update a dependency for the core go.mod file we'll maybe want to also update every other go.mod which imports that one.

Other Considered Approaches

Being Less Conservative

The main issue we face with upgrading older versions of cert-manager is that we self-impose strict conservatism when it comes to any kind of backport. In this view, any change for any reason is inherently seen as bad and to be avoided, even if that change has no runtime impact for users.

We don't need to do this. While we wouldn't seek to make major version upgrades in backports just for the fun of it, we could choose to accept a larger subset of backports and rely on our tests to confirm that the change is sound.

This doesn't solve the problems of allowing independent control over the dependencies of different binaries, though, and doesn't reduce the attack surface of any of our components.

Aggressively Reducing Dependencies

Rather than isolating dependencies, we could remove them by, e.g., vendoring subsets of their code into our repo. This gives us a huge amount of control and allows us to preserve backwards compatibility very easily.

It also creates a huge burden for us to maintain that vendored code, which is a drawback. We'd still have to track e.g., Helm to see if there are any relevant vulnerabilities reported, and then we'd have to go and actually fix them ourselves. If upstream code diverged significantly we might be left on our own trying to work out how to fix bugs - or even trying to work out if we even have a bug.

There's probably some low hanging fruit we could pick here, but we're unlikely to be able to fully remove a big chunk of our dependencies. That means the problem won't go away - and there's always the chance that we need to add new dependencies down the road.

Addendum: Groundwork for pkg/apis Module

We've talked before about creating a separate pkg/apis module or repo to improve the experience for users who need to import that specific path (which is common).

Module proliferation could be a solution here by making that path a new module.

Changing the pkg/apis module isn't really related to reducing dependencies so it's a little different to the rest of this proposal and we don't propose to do it as part of this work. But the implementation of this design might inform how we could approach solving the pkg/apis problem.

Footnotes

[1] cert-manager likely isn't actively vulnerable to these specific Helm CVEs, but it's easy to imagine something being reported which it's actually vulnerable to and which we'd have to upgrade.

[2] The diff from separating integration and e2e tests into their own modules:

diff
diff --git a/go.mod b/go.mod
index c95d5fbe3..ef3fcfc64 100644
--- a/go.mod
+++ b/go.mod
@@ -10,7 +10,6 @@ require (
    github.com/Venafi/vcert/v4 v4.0.0-00010101000000-000000000000
    github.com/akamai/AkamaiOPEN-edgegrid-golang v1.2.2
    github.com/aws/aws-sdk-go v1.44.179
-   github.com/cloudflare/cloudflare-go v0.58.1
    github.com/cpu/goacmedns v0.1.1
    github.com/digitalocean/godo v1.93.0
    github.com/go-ldap/ldap/v3 v3.4.4
@@ -22,14 +21,10 @@ require (
    github.com/kr/pretty v0.3.1
    github.com/miekg/dns v1.1.50
    github.com/mitchellh/go-homedir v1.1.0
-   github.com/munnerz/crd-schema-fuzz v1.0.0
    github.com/onsi/ginkgo/v2 v2.7.0
-   github.com/onsi/gomega v1.24.2
    github.com/pavlo-v-chernykh/keystore-go/v4 v4.4.1
    github.com/pkg/errors v0.9.1
    github.com/prometheus/client_golang v1.14.0
-   github.com/segmentio/encoding v0.3.6
-   github.com/sergi/go-diff v1.3.1
    github.com/spf13/cobra v1.6.1
    github.com/spf13/pflag v1.0.5
    github.com/stretchr/testify v1.8.1
@@ -203,7 +198,7 @@ require (
    github.com/rubenv/sql-migrate v1.2.0 // indirect
    github.com/russross/blackfriday/v2 v2.1.0 // indirect
    github.com/ryanuber/go-glob v1.0.0 // indirect
-   github.com/segmentio/asm v1.1.3 // indirect
+   github.com/sergi/go-diff v1.3.1 // indirect
    github.com/shopspring/decimal v1.2.0 // indirect
    github.com/sirupsen/logrus v1.9.0 // indirect
    github.com/spf13/cast v1.4.1 // indirect