runatlantis.io/blog/2018/putting-the-dev-into-devops-why-your-developers-should-write-terraform-too.md
::: info
This post was originally written on August 29th, 2018

Original post: https://medium.com/runatlantis/putting-the-dev-into-devops-why-your-developers-should-write-terraform-too-d3c079dfc6a8
:::
Terraform is an amazing tool for provisioning infrastructure. Terraform enables your operators to perform their work faster and more reliably.
But if only your ops team is writing Terraform, you're missing out.
Terraform is not just a tool that makes ops teams more effective. Adopting Terraform is an opportunity to turn all of your developers into operators (at least for smaller tasks). This can make your entire engineering team more effective and create a better relationship between developers and operators.
Terraform is two things. It's a language for describing infrastructure:

    resource "aws_instance" "example" {
      ami           = "ami-2757f631"
      instance_type = "t2.micro"
    }

And it's a CLI tool that reads Terraform code and makes API calls to AWS (or any other cloud provider) to provision that infrastructure.
In this example, we're using the CLI to run `terraform apply`, which will create an EC2 instance:

    $ terraform apply

    Terraform will perform the following actions:

      # aws_instance.example
      + aws_instance.example
          ami: "ami-2757f631"
          instance_type: "t2.micro"
          ...

    Plan: 1 to add, 0 to change, 0 to destroy.

    Do you want to perform these actions?
      Terraform will perform the actions described above.
      Only 'yes' will be accepted to approve.

      Enter a value: yes

    aws_instance.example: Creating...
      ami: "" => "ami-2757f631"
      instance_type: "" => "t2.micro"
      ...
    aws_instance.example: Still creating... (10s elapsed)
    aws_instance.example: Creation complete

    Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Adopting Terraform is great for your operations team's effectiveness but it doesn't change much for devs. Before Terraform adoption, devs typically interacted with an ops team like this:
After the Ops team adopts Terraform, the workflow from a dev's perspective is the same!
With Terraform, there's less of Step 2 (Dev: Waits) but apart from that, not much has changed.
If only ops is writing Terraform, your developers' experience is the same.
Developers would love to help out with operations work. They know that for small changes they should be able to do the work themselves (with a review from ops). For example:
Developers could make all of these changes because they're small and well defined. Also, previous examples of doing the same thing can guide them.
In many organizations, devs are locked out of the cloud console.
They might be locked out for good reasons:
Even if they have access, operations can be complicated:
With Terraform, everything changes. Or at least it can.
Now devs can see in code how infrastructure is built. They can see the exact spot where security group rules are configured:

    resource "aws_security_group_rule" "allow_all" {
      type              = "ingress"
      from_port         = 0
      to_port           = 65535
      protocol          = "tcp"
      cidr_blocks       = ["0.0.0.0/0"]
      security_group_id = "sg-123456"
    }

    resource "aws_security_group_rule" "allow_office" {
      ...
    }

Or where the size of the autoscaling group is set:

    resource "aws_autoscaling_group" "asg" {
      name             = "my-asg"
      max_size         = 5
      desired_capacity = 4
      min_size         = 2
      ...
    }

Devs understand code (surprise!) so it's a lot easier for them to make those small changes.
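As a rough sketch of what one of those small changes might look like, a developer who needs more headroom could raise the upper bound of the autoscaling group shown above. The new `max_size` value is purely illustrative, and the launch configuration name and availability zone are hypothetical placeholders standing in for the arguments the original snippet elides.

```hcl
resource "aws_autoscaling_group" "asg" {
  name             = "my-asg"
  max_size         = 8 # raised from 5 (illustrative value)
  desired_capacity = 4
  min_size         = 2

  # Hypothetical placeholders: the original snippet elides these arguments,
  # so the names below are assumptions made only to keep the block complete.
  launch_configuration = "my-launch-config"
  availability_zones   = ["us-east-1a"]
}
```

The whole change is a single edited line, which is the kind of diff an operator can review in seconds.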
Here's the new workflow:
Now:
Great! But there's another problem.
To run Terraform, you need cloud credentials! It's really hard to write Terraform without being able to run `terraform init` and `terraform plan`, for the same reason it would be hard to write code if you could never run it locally.
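To make that dependency concrete, here is a minimal sketch of a provider block, assuming credentials are read from a named profile in the developer's local AWS configuration. The profile name and region are hypothetical and not from the original post.

```hcl
# Assumption: credentials come from a named profile on the developer's machine.
# "dev-readonly" and "us-east-1" are hypothetical values for illustration.
provider "aws" {
  region  = "us-east-1"
  profile = "dev-readonly"
}
```

If no valid credentials sit behind that profile, `terraform plan` has no way to query AWS, so the developer cannot check their change before asking for a review.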
So are we back at square one?
Atlantis is an open source tool for running Terraform from pull requests. With Atlantis, Terraform is run on a separate server (Atlantis is self-hosted) so you don't need to give out credentials to everyone. Access is controlled through pull request approvals.
Here's what the workflow looks like:
A developer creates a pull request with their change to add a security group rule.
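The change in such a pull request might look something like the sketch below. The resource name, port, and CIDR range are hypothetical, chosen only for illustration; the security group ID reuses the one from the earlier example.

```hcl
# Hypothetical rule: allow HTTPS from a single office range only.
resource "aws_security_group_rule" "allow_office_https" {
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  cidr_blocks       = ["203.0.113.0/24"] # documentation-only range (RFC 5737)
  security_group_id = "sg-123456"        # same group ID as the earlier example
}
```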
Atlantis automatically runs terraform plan and comments back on the pull request with the output. Now developers can fix their Terraform errors before asking for a review.
The developer pushes a new commit that fixes their error and Atlantis comments back with the valid terraform plan output. Now the developer can verify that the plan output looks good.
You'll probably want to run Atlantis with the `--require-approval` flag, which requires a pull request to be approved before `atlantis apply` can be run on it.
An operator can now come along and review the changes and the output of terraform plan. This is much faster than doing the change themselves.
To apply the changes, the developer or operator comments “atlantis apply”.
Now we've got a workflow that makes everyone happy:
Now developers can make small operations changes and learn more about how infrastructure is built. Everyone can work more effectively and with a shared understanding that enhances collaboration.
Atlantis has been used by my previous company, Hootsuite, for over 2 years. It's used daily by 20 operators, but it's also used occasionally by over 60 developers! Another company uses Atlantis to manage 600+ Terraform repos, with over 300 developers and operators collaborating on them.