rfd/0120-hardening-amis.md
@r0mant @reedloden && @jentfoo @xinding33 || @klizhentas

This RFD discusses structures and processes to increase the security of the Amazon EC2 Disk Images (a.k.a. AMIs) used to ship Teleport to customers.
One way our customers consume Teleport is by using a pre-built and -configured AMI to run Teleport in AWS EC2 instances. As a provider of these images we are (at least partially) responsible for the security of all software included in the image, not just Teleport.
We should endeavor to not ship vulnerabilities to our clients, even if those vulnerabilities are not directly in our software.
This is not simply an academic exercise. Customers have been asking for hardened AMIs - especially for AMIs that conform to well-known security benchmarks. See here for an example.
We currently use Packer and a shell script to provision our AMIs, based on Amazon Linux 2. The same Packerfile is used as a basis for several AMIs.
Inclusions for all AMIs:

- certbot for ACME Certificate Rotation (used by optional ssm systemd units)
- pip for installing and running certbot
- uuid for token generation (used by optional ssm systemd units)
- gcc for ???
- libffi-dev for ???
- openssl-devel for ???
- libfontconfig for ???

Whether or not the actual services are enabled depends on post-build
actions by the eventual AMI consumer, such as enabling various systemd
units.
We currently do not test our images against a benchmark.
The two best-known standards for security hardening are the CIS Benchmarks and STIG.
Both of these standards are pretty comprehensive and have wide application, although private organizations tend to favor CIS over STIG.
So why choose STIG over CIS?
The decision largely comes down to what tooling is immediately available. Both standards are comprehensive - the AL2 CIS benchmark document runs to about 600 pages. At an estimated 2 pages per item, that's 300 separate items that need detection, remediation and validation. I don't believe that Teleport wants to maintain scripts to do that ourselves, so I have been focused on finding tools to do it for us.
Somewhat unsurprisingly, nearly all of the automated AMI hardening tools I have found are tied to the AWS EC2 Image Builder Tool. Both standards are represented, but while the STIG-compliance components are free for use on any image, the CIS tooling has several drawbacks:
The decision to go with STIG is based on being able to improve the situation for our customers now, while still being able to add CIS hardening tools as we develop that relationship in the future.
Both the CIS Benchmarks and STIG e es
My proposed solution is to use AWS EC2 Image Builder, with the available STIG tooling, to create a hardened base image.
We can do this as part of the build process, or on some frequent schedule (e.g. daily, weekly).
I suggest the latter, scheduled option;
As mentioned above, there is no way to inject parameters into an EC2 Image Builder
pipeline. The pipeline is the pipeline. There is some customization available at
the build component level, but there is no way (that I can see, at least) to
say something like "Trigger this pipeline using Teleport vX.Y.Z and save the
resulting image as teleport-oss-X.Y.Z". All you can do is trigger the pipeline
and have it produce (or replace) a single image.
So this brings us to an obvious division of labor:
There is a side benefit here: were we to move the entire process into EC2 Image
Builder, it would require moving a lot of Teleport-specific code and configuration
out of the teleport Git repositories and into teleport-cloud terraform.
From experience, keeping Teleport-specific things closer to teleport is generally
a better idea than splitting them apart.
In the spirit of moving towards distroless OCI container images, I propose slimming the contents of the AMIs in order to reduce their attack surface.
These new images will be published to the teleport-prod AWS account. This will
partially obviate the need to move the legacy images from the gravitational
AWS account.
As per the existing AMI builds, AMIs will be constructed as part of the tag build, and made public on promotion.
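A minimal sketch of the "made public on promotion" step, assuming it is done through the EC2 `ModifyImageAttribute` API. The helper name and the AMI ID are hypothetical, and the existing promotion tooling may well do this differently:

```python
def make_public_request(image_id: str) -> dict:
    """Build the ModifyImageAttribute parameters that grant launch
    permission to everyone, i.e. make the AMI public.

    Hypothetical helper for illustration only.
    """
    return {
        "ImageId": image_id,
        "LaunchPermission": {"Add": [{"Group": "all"}]},
    }


# Assumed usage with boto3 (not imported here, so the sketch stays
# dependency-free):
#   ec2 = boto3.client("ec2", region_name="us-west-2")
#   ec2.modify_image_attribute(**make_public_request("ami-0123456789abcdef0"))
```

Keeping the request construction separate from the API call makes the promotion step easy to unit test without AWS credentials.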
While the CIS-hardening tooling on Amazon requires royalties and subscriptions, the Amazon Inspector tool contains CIS benchmarks and appears to be royalty-free.
Amazon Inspector can't examine AMIs directly, but it is possible to
This idea is based on a Continuous Vulnerability Assessment example from the AWS security blog, minus the scheduled lambda to trigger the assessment.
We should mirror our approach for OCI container images (see RFD-0112), which boils down to:
teleport.e GitHub Security Issues
The trivy scanner is known to support scanning AMIs,
and we already use it for OCI container images and other resources. We should
use it for our AMIs as well.
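As a rough illustration of what such a scan step could look like, the sketch below builds a trivy command line for a single AMI and summarizes the JSON report by severity. It assumes trivy's `trivy vm` subcommand with an `ami:` target and a report shaped as `Results[].Vulnerabilities[]`. The function names, AMI ID and region are hypothetical, and this is not how our existing scans are wired up.

```python
import json
import subprocess


def trivy_ami_command(ami_id: str, region: str) -> list[str]:
    """Build the trivy invocation for one AMI (assumed CLI shape)."""
    return [
        "trivy", "vm",
        "--scanners", "vuln",
        "--format", "json",
        "--aws-region", region,
        f"ami:{ami_id}",
    ]


def count_by_severity(report: dict) -> dict[str, int]:
    """Count findings per severity in a parsed trivy JSON report."""
    counts: dict[str, int] = {}
    # "Results" and "Vulnerabilities" may be absent or null when nothing is found.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            severity = vuln.get("Severity", "UNKNOWN")
            counts[severity] = counts.get(severity, 0) + 1
    return counts


def scan_ami(ami_id: str, region: str) -> dict[str, int]:
    """Run trivy against the AMI and return the per-severity counts.

    Requires the trivy binary and AWS credentials in the environment.
    """
    proc = subprocess.run(
        trivy_ami_command(ami_id, region),
        capture_output=True, text=True, check=True,
    )
    return count_by_severity(json.loads(proc.stdout))
```

A CI job could call `scan_ami` for each freshly built image and fail the build, or file an issue, when the counts exceed whatever threshold we agree on.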
There is a default 1114 limit of public AMIs allowed in an AWS region. While this can be raised on request, it seems reasonable to periodically clean up our public images.
Proposed deletion criteria:
I expect that Phases 2-4 can be performed in any order, even in parallel, depending on which is the highest-value target.
Changing the contents and configuration of the shipped AMIs is considered a compatibility breaking change, and so needs to be handled with some care.
We will be rolling out the images over 3 major releases of Teleport:
teleport-prod in
parallel with legacy images in gravitational account.