Terraform + yaml = ❤️

Managing Infrastructure Using Terraform and YAML

Infrastructure on the cloud can be managed in numerous ways these days. Tools like Terraform, Pulumi, CloudFormation, and CDK have taken Infrastructure as Code (IaC) to new heights. While most of these tools require some basic programming knowledge, or even learning a new language such as Terraform's HCL, their true power lies in building a platform on top of them.

As infrastructure developers, we tend to build infrastructure in ways that make it easy for the author to create something new, but not for the rest of the organization. This usually happens when we, as engineers, think the tool is built only for "us". It can be avoided by treating infrastructure as a product whose customer is the engineering team as a whole, so that anyone can build new infrastructure easily without prior knowledge of the tools being used.

This was one of the goals for my team at Welcome (formerly Newscred) when I first joined. Before I joined, the team had built an in-house solution using the AWS CLI and boto3. While this worked, it lacked key features such as state management, quick disaster recovery, and maintainability. In the end we chose Terraform, and in this article I will cover how we used Terraform and YAML files to stay in sync with our in-house solution.

Why Yaml?

Our in-house solution maintains major resources such as security groups and IAM roles using YAML for configuration as code.

My initial plan for the migration was to use Terraform with .tfvars files to maintain our configurations. However, as soon as we wrote our first configurations, they proved hard to read, and we faced the growing complexity of duplicate configurations stored in different formats. To keep a single source of truth, we chose to carry our existing pattern of YAML configurations over to the migration.

Implementation

Note: The implementation was written using terraform version 0.14.6, and has not been tested on 1.0.0 yet.

The following implementation uses several of Terraform's more advanced features, such as for_each, lookup, and flatten. It is highly recommended to go through the docs to get a better understanding of how they work.

Setting up the yaml files

Reading the Yaml files

yamldecode - parses a YAML string (typically the contents of a file read with the file function) into a Terraform value, in our case a map that the rest of the configuration can read from.
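As a minimal sketch of this step, the block below decodes a small YAML document into a local value. The layout (an environment key holding a list of queues) and the file name mentioned in the comment are assumptions made for illustration, not the original configuration. Only the attribute names name, type, access_policy and dlq, and the local name sqs_queues, come from this article. The YAML is inlined as a heredoc so that the sketch is self-contained.

```hcl
locals {
  # Hypothetical YAML layout, inlined so the sketch is self-contained.
  # In a real module this would live in its own file and be read with
  # file("${path.module}/sqs.yaml") (file name assumed).
  sqs_queues_yaml = <<-YAML
    production:
      - name: example-queue
        type: standard
        access_policy: basic
        dlq: example-queue-dlq
      - name: example-queue-dlq
        type: standard
        access_policy: basic
    YAML

  # yamldecode turns the YAML text into a Terraform value: here an object
  # whose "production" attribute is a list of queue objects.
  sqs_queues = yamldecode(local.sqs_queues_yaml)
}
```

Keeping the YAML in a separate file is what lets users edit configuration without touching any Terraform code, which is the point of the approach.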

flatten - collapses nested lists into a single flat list, which is easier to iterate over and to access from other Terraform functions.
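One way this could look is sketched below. It assumes local.sqs_queues is the decoded YAML, shaped as a map from environment name to a list of queues where every queue has a name and a type. That shape, the environment prefix, and the "basic" default are assumptions for illustration. Note that try is used here as a swapped-in convenience for the optional dlq key; the original code may well have used lookup throughout.

```hcl
locals {
  # Assumed input shape (hypothetical):
  # { <environment> = [ { name, type, access_policy?, dlq? }, ... ] }
  sqs_standard_queues = flatten([
    for environment, queues in local.sqs_queues : [
      for queue in queues : {
        # Prefix names with the environment, e.g. "production-example-queue".
        name = "${environment}-${queue.name}"
        type = queue.type

        # lookup supplies a default when the key is missing.
        access_policy = lookup(queue, "access_policy", "basic")

        # try falls back to null when the queue has no dlq key.
        dlq = try("${environment}-${queue.dlq}", null)
      } if queue.type == "standard"
    ]
  ])
}
```

The inner for expression produces one list per environment, and flatten merges those lists into a single list of queue objects.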

Creating all resources

Based on the configuration above, we can now create any number of SQS queues just by adding new entries to the YAML file. The resource definition does this with for_each, which is fed by the following for expression:

for queue in local.sqs_standard_queues : queue.name => queue

The expression above iterates over our flattened list of queues and turns it into a map of key-value pairs. In our scenario, the key is the name of the queue and the value is the queue object:

"production-example-queue-dlq" = {
    "access_policy" = "basic"
    "dlq"           = null
    "name"          = "production-example-queue-dlq"
    "type"          = "standard"
}
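To try the list-to-map conversion in isolation, a small sketch with a literal stand-in for the flattened list can help. The local names example_queues and example_queues_by_name are hypothetical, and the values simply mirror the example entry above.

```hcl
locals {
  # A literal stand-in for the flattened list (values are illustrative).
  example_queues = [
    {
      name          = "production-example-queue"
      type          = "standard"
      access_policy = "basic"
      dlq           = "production-example-queue-dlq"
    },
    {
      name          = "production-example-queue-dlq"
      type          = "standard"
      access_policy = "basic"
      dlq           = null
    },
  ]

  # Re-key the list by queue name so that for_each can consume it.
  example_queues_by_name = {
    for queue in local.example_queues : queue.name => queue
  }
}
```

Because the queue name becomes the map key, names have to be unique: two entries with the same name would make the for expression fail with a duplicate key error.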

for_each - iterates over each key in the map generated above and creates one resource instance per key, which shows up in the plan as:

aws_sqs_queue.sqs_standard_queues["production-example-queue-dlq"]

Note: The address above is also how the queue has to be referenced from a different resource.
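Put together, a minimal sketch of such a resource might look like the following. It assumes local.sqs_standard_queues is the flattened list of queue objects, each with a unique name. The output and its name are hypothetical and only there to show the address being used from outside the resource.

```hcl
# Assumes local.sqs_standard_queues is the flattened list of queue objects,
# each with at least a unique "name" attribute.
resource "aws_sqs_queue" "sqs_standard_queues" {
  # One queue per map key; the key becomes part of the resource address.
  for_each = { for queue in local.sqs_standard_queues : queue.name => queue }

  name = each.value.name
}

# Referencing a single instance from elsewhere uses the same address.
# The output name is hypothetical.
output "example_dlq_arn" {
  value = aws_sqs_queue.sqs_standard_queues["production-example-queue-dlq"].arn
}
```

Keying instances by name rather than by list position means that adding or removing a queue in the YAML file does not shift the addresses of the other queues.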

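if - filters the for expression so that only the entries meeting a condition are kept, in our case depending on whether the dlq key is set or not.

each.value.* - each is the object Terraform exposes for the current for_each element: each.key is the map key, each.value is the value stored under that key, and * stands for any of the attributes that we set in our locals.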
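A sketch of how the if clause and each.value could work together is shown below. It is not necessarily how our original code wired up dead-letter queues: it assumes the aws_sqs_queue.sqs_standard_queues resource is keyed by queue name, that dlq holds the full name of the dead-letter queue or null, and that the AWS provider version in use offers the aws_sqs_queue_redrive_policy resource. The maxReceiveCount value is purely illustrative.

```hcl
# Attaches a redrive policy only to the queues that name a dead-letter queue.
resource "aws_sqs_queue_redrive_policy" "sqs_standard_queues" {
  # The if clause drops every queue whose dlq attribute is null.
  for_each = {
    for queue in local.sqs_standard_queues : queue.name => queue
    if queue.dlq != null
  }

  # each.key is the queue name; the queue's id attribute is its URL.
  queue_url = aws_sqs_queue.sqs_standard_queues[each.key].id

  redrive_policy = jsonencode({
    # each.value.dlq is the name (and therefore the key) of the DLQ instance.
    deadLetterTargetArn = aws_sqs_queue.sqs_standard_queues[each.value.dlq].arn
    maxReceiveCount     = 5 # illustrative value, not from the article
  })
}
```

Putting the policy in its own resource avoids a queue instance having to reference another instance of the same resource, which Terraform would reject as a cycle.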

Debugging Tips

Terraform has a lot of useful functions, but situations involving complex maps can be hard to debug. To debug them you can use terraform console, which lets you evaluate your local values and inspect the resulting maps.

For example, to inspect the values used above:

terraform console
> local.sqs_queues #prints out the yaml file decoded
> local.sqs_standard_queues #prints out the flattened object

Conclusion

Using only Terraform limits us to writing configurations in .tfvars files to abstract away complexity from our infrastructure users, which in turn puts the burden on those users to understand how Terraform's language works. By using YAML configuration as code as our user interface, we empower our infrastructure users to create new resources and stacks easily in a language they already know.

This allows the larger engineering team to bring up services quickly and with shorter wait times. We have already implemented this successfully for our standalone AWS services, and are currently migrating our more complex stacks, such as EKS clusters, using Terraform + YAML.

Special mention to Pratik Saha, who figured out how to convert YAML files into Terraform objects.