I’ve been playing around with Terraform a bunch recently, and I’m pretty excited about 0.4.0.
However, at the moment the examples leave quite a lot to be desired. So for my own learning and entertainment, I’ve created a set of example Terraform modules, and a simple example that you should be able to clone and run.
What’s exciting in Terraform
Several of the pull requests I’ve been watching (https://github.com/hashicorp/terraform/pull/1076 etc.) were merged recently, fixing tags on ASGs, which I use at work for the Puppet ENC, and support for ~/.aws/credentials files is almost ready.
In fact, after chatting with folks at scalesummit on Friday, I was so excited that I wanted to pull some of the new features into my fork and play with them.
One of the most exciting strategic features to me is what’s been called ‘remote modules’, which when it’s merged will allow you to consume external Terraform state (in a read-only way) from other Terraform repositories.
This is almost exactly one of the features I’d sketched out as something Terraform needed to allow it to scale as a tool for larger organisations.
At my day job, we have multiple teams managing different parts of the infrastructure. For example, one team is responsible for VPCs and DNS/puppet masters etc, whilst another team is responsible for kafka/zookeeper clusters, and several other teams each have independently managed Elasticsearch clusters.
Ergo, the ability to ‘publish’ state between teams (thus allowing different teams to share the basics like VPCs and subnets whilst having their own configs) is essential for the variety of workflows and machine lifecycles that I need to support.
I haven’t yet tested out this functionality, as I got distracted making some example ‘base infrastructure’ modules that I wanted to consume data from.
Unreleased software note!
You need to use my fork of Terraform for the examples below to work, but I expect them to work without any significant changes in 0.4.0.
Terraform modules
I’ve scratched my head about how to do Terraform modules (which need to look up external data) for a while, and I’ve come up with a pattern that I don’t hate.
I’ve also written some public modules that are on GitHub, which you can look at and criticise. (Sorry in advance for the terrible Ruby.)
Whilst I haven’t yet used the feature I merged, the state from the example in this post is a semi-realistic example VPC that I’ll be able to use for further testing.
To give you some insight into how I’ve put things together, I’m going to dive into each of the modules and explain a little about them, before showing how I wrap them up together with the actual code/config to launch a VPC.
Looking up AMIs
Let’s think about how we should look up an Ubuntu AMI using a module.
There’s a giant table of AMIs available on the web, and the data for it is almost, but not quite, JSON you can parse.
So I wrote a getvariables.rb script and a Makefile which generates variables.tf.json. The end result is a giant hash table of all the AMIs for Ubuntu in all of the regions.
We can then use a combination of variables and the lookup + format functions in the main.tf to output the desired AMI.
Then we just use the module, supplying it the params it needs, and we get the right AMI.
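To make the shape of that generated data concrete, here is a minimal Python sketch of the idea. It is not the real getvariables.rb: the key layout (region-release-arch), the variable name all_amis and the AMI IDs are all made-up placeholders, and the real module may well key its table differently. It flattens a list of AMI records into one map, wraps it in the JSON form Terraform reads from a variables.tf.json file, and then mimics the lookup + format combination.

```python
import json

# Hypothetical records, standing in for rows scraped from the Ubuntu AMI table.
# The AMI IDs are placeholders, not real images.
RECORDS = [
    {"region": "us-east-1", "release": "trusty", "arch": "amd64", "ami": "ami-00000001"},
    {"region": "eu-west-1", "release": "trusty", "arch": "amd64", "ami": "ami-00000002"},
]

KEY_FORMAT = "%s-%s-%s"  # assumed key layout: region-release-arch


def build_variables(records):
    """Flatten the records into one map and wrap it as a Terraform JSON variable."""
    table = {
        KEY_FORMAT % (r["region"], r["release"], r["arch"]): r["ami"]
        for r in records
    }
    # "all_amis" is a hypothetical variable name.
    return {"variable": {"all_amis": {"default": table}}}


def lookup_ami(variables, region, release, arch):
    """Mimic a lookup(...) over the map, keyed by a format(...) string."""
    table = variables["variable"]["all_amis"]["default"]
    return table[KEY_FORMAT % (region, release, arch)]


variables = build_variables(RECORDS)
# This string is what a generator would write out as variables.tf.json.
variables_json = json.dumps(variables, indent=2, sort_keys=True)
```

Flattening everything into a single string-keyed map keeps the module itself trivial: all the work happens at generation time, and the lookup is just a formatted key.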
What availability zones?
Next example – in AWS, which availability zones you’re given access to depends on your account.
Therefore, to provide a generic template that anyone can use to launch a VPC (in any region), we need to be able to detect which regions and availability zones a user has access to.
Using almost the same pattern, and the aws cli tool, I’ve produced another module which will read your ~/.aws/credentials file to pull in a list of all your available availability zones.
Due to the way Terraform currently handles things, this module just exports three variables: a primary, secondary, and (where available) tertiary availability zone, ordered by alphabetical sort.
Note that for this to work, you will need a [demo] section in your ~/.aws/credentials file, as the variables.tf.json file is not committed to the repository, unlike the last example.
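The zone selection itself is simple enough to sketch in a few lines of Python. This illustrates the alphabetical primary/secondary/tertiary ordering described above; it is not the module's actual script. The sample input mirrors the JSON shape printed by `aws ec2 describe-availability-zones`, the zone names are illustrative, and skipping zones whose State is not "available" is an assumption rather than something the post states.

```python
import json

# Sample of the JSON shape returned by `aws ec2 describe-availability-zones`.
# Zone names and states here are illustrative only.
SAMPLE = json.loads("""
{"AvailabilityZones": [
  {"ZoneName": "us-east-1d", "State": "available"},
  {"ZoneName": "us-east-1a", "State": "available"},
  {"ZoneName": "us-east-1c", "State": "impaired"},
  {"ZoneName": "us-east-1b", "State": "available"}
]}
""")


def pick_zones(describe_output):
    """Return up to three zones, sorted alphabetically, under fixed labels."""
    names = sorted(
        zone["ZoneName"]
        for zone in describe_output["AvailabilityZones"]
        if zone["State"] == "available"  # assumption: ignore unusable zones
    )
    labels = ("primary", "secondary", "tertiary")
    # zip stops at the shorter input, so an account with only two zones
    # simply gets no "tertiary" entry.
    return dict(zip(labels, names))


zones = pick_zones(SAMPLE)
```

Fixed labels rather than a list fit the constraint mentioned above: the module can only export a handful of named values.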
Let’s build some damn infrastructure already!
We’ve now got the pieces we need to put together a VPC, so let’s go ahead and write a module to do that.
This takes a /16 network and builds out a VPC with public (internet IP), private (fixed address) and ephemeral (ASG/ELB) subnets in two availability zones.
Note that we have a lot of outputs here, as we need to expose every piece of infrastructure we want to be able to reference to the module’s users.
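As a rough illustration of carving a /16 into per-tier, per-zone subnets, here is a Python sketch using the standard ipaddress module. The /24 size, the allocation order and the 10.1.0.0/16 range are assumptions chosen for the example; the real VPC module numbers its subnets its own way, and this does not reproduce that layout.

```python
import ipaddress

TIERS = ("public", "private", "ephemeral")


def plan_subnets(vpc_cidr, zones, new_prefix=24):
    """Hand out consecutive subnets: one per (tier, zone) pair."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of /24 blocks
    return {(tier, zone): next(blocks) for tier in TIERS for zone in zones}


# Example range and zone names are assumptions for illustration.
plan = plan_subnets("10.1.0.0/16", ["us-east-1a", "us-east-1b"])

for (tier, zone), subnet in plan.items():
    print(tier, zone, subnet)
```

Each entry in the resulting plan corresponds to something the module would have to expose as an output, which is why the list of outputs grows so quickly.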
But I meant an actual machine!
That’s the next step: we want to launch our first NAT machine and, again, we’ll write a module for it!
We use cloud-config to set up a firewall on the initial machine, so that subsequent machines (without public IP addresses) can NAT externally, and we reset the main routing table (used by the ‘back’ and ‘ephemeral’ subnets created by the VPC module) to be one which points to the new NAT instance. For good measure we also install Docker and Puppet.
Note that we have to use the remote-exec provisioner here to wait for cloud-init to finish, so that we know the firewall rules are in place for NAT before we launch any more machines.
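The waiting logic boils down to polling until a marker appears. The Python sketch below shows that loop in isolation; it is not what the provisioner in the module runs, which the post does not show. cloud-init writes /var/lib/cloud/instance/boot-finished when a boot completes, but whether the module checks that file or something else is an assumption here, as are the timeout and interval values.

```python
import os
import time

# Marker file that cloud-init writes once it has finished a boot.
BOOT_FINISHED = "/var/lib/cloud/instance/boot-finished"


def wait_for_file(path, timeout=300.0, interval=1.0):
    """Poll until `path` exists; return False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # On the instance itself this would block until cloud-init is done.
    print("cloud-init finished:", wait_for_file(BOOT_FINISHED, timeout=5.0))
```

Blocking the provisioner like this is what makes the dependency explicit: nothing that depends on the NAT instance gets created until the loop returns.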
So when do we get to do something I can actually run?
OK, let’s put the VPC and NAT machine modules to work, and build a host inside the private subnets.
You can (and should!) fork and clone this example, which contains just enough stuff to string the modules we’ve written together and bring up a couple of hosts.
You’ll need:
- my fork of Terraform
- An ~/.aws/credentials file with a [demo] section
Let’s go:
The first run of make sets up an SSH key that lets you log in to the created instances.
Now, change directory into the region+account folder (Terraform can currently only handle one region at a time), and run make again. This pulls in all the necessary modules (using terraform get), and then runs their Makefiles by iterating over .terraform/modules to ensure that variables.tf.json is built if needed.
Note that even modules which don’t need to build variables.tf.json are required to have a Makefile (which can do nothing).
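The iteration step can be sketched in Python as follows. This is an illustration of the pattern, not the Makefile from the example repository: the function names are hypothetical, and it assumes every fetched module sits in its own directory directly under .terraform/modules.

```python
import os
import subprocess


def find_module_dirs(modules_root=".terraform/modules"):
    """Return (buildable, missing): module dirs with and without a Makefile."""
    buildable, missing = [], []
    for name in sorted(os.listdir(modules_root)):
        path = os.path.join(modules_root, name)
        if not os.path.isdir(path):
            continue
        if os.path.isfile(os.path.join(path, "Makefile")):
            buildable.append(path)
        else:
            missing.append(path)
    return buildable, missing


def build_modules(modules_root=".terraform/modules"):
    """Run `make` in every module, refusing to continue if one lacks a Makefile."""
    buildable, missing = find_module_dirs(modules_root)
    if missing:
        raise RuntimeError("modules without a Makefile: " + ", ".join(missing))
    for path in buildable:
        # `make -C DIR` runs make inside DIR.
        subprocess.check_call(["make", "-C", path])
```

Failing loudly on a missing Makefile mirrors the requirement above that every module ships one, even if it does nothing.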
You should be able to ssh to your NAT instance:
ssh-add ../id_rsa
make sshnat
and from there, ssh into your back network host:
ssh 10.1.10.4
and you should see Consul running (in Docker of course).
Reusing
As most of the logic is in modules, to create another VPC in another environment or account, the top level directory can just be copied, the terraform.tfstate file removed and the details in terraform.tfvars updated.
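Those copy-and-clean steps could be scripted along these lines. This is a hedged Python sketch, not part of the example repository: the function name and directory names are hypothetical, and editing terraform.tfvars for the new environment remains a manual step.

```python
import shutil


def clone_environment(src, dst):
    """Copy an environment directory, leaving its Terraform state behind."""
    shutil.copytree(
        src,
        dst,
        # Skip the state file so the copy starts with no recorded resources.
        ignore=shutil.ignore_patterns("terraform.tfstate"),
    )
    return dst


# Hypothetical usage:
#   clone_environment("us-east-1-demo", "eu-west-1-demo")
# then edit eu-west-1-demo/terraform.tfvars by hand.
```

Leaving the state file out matters because a copied state would make Terraform treat the old environment's resources as belonging to the new directory.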
Conclusion
Whilst a bunch of stuff in the current demo is more simplistic than a real infrastructure, it shows that Terraform can be used to bootstrap entire VPCs with non-trivial configurations.
I haven’t (yet) installed anything ‘real’ other than a Docker container and some iptables rules, as cloud-init is a terrible config management mechanism! However, the current example should be useful as a platform for jumping off from, either on top of pre-existing infrastructure and puppet masters/chef servers (add VPC peering or VPNs), or with SSH-based config management (ansible, salt).
What’s next?
In the next post (or rather, in my next set of hacking that I may or may not write up ;), I plan to use the infrastructure details of this subnet to test out the remote module feature, try to get a puppetmaster (with EC2 tags as the ENC) up, and explore storing the state in Consul.