t0m's very occasional blog

Sufficiently advanced trolling is indistinguishable from thought leadership

DIY Scalable PAAS With Terraform

I’ve been ranting about Terraform a lot recently, and I’ve shown some pretty neat examples of building a VPC – however before now I haven’t got anything actually useful running.

In this post, we’re gonna change that, and launch ourselves a publicly available and scalable PAAS (Platform As A Service), built around Apache Mesos and Marathon with nginx and Amazon’s ELB for load balancing and Route53 for DNS.

Terraform From the Ground Up

I’ve been playing around with Terraform a bunch recently, and I’m pretty excited about 0.4.0.

However at the moment, the examples leave quite a lot to be desired. So for my own learning and entertainment, I’ve created a set of example Terraform modules, and a simple example that you should be able to clone and run.

More Talks

I’m still rubbish at remembering to (or finding the time to) blog, however here’s some more video links:


Me, in Video Form

I’ve been a massive fail at blogging recently, as I’ve been way too busy doing cool stuff :)

However, I have talked at a whole bunch of conferences and tech meetups in the last 12 months, so here’s a quick run-down of where I’ve been for posterity:

Updates since the original post

Original list

We Have Always Been at War With Eastasia

Rewriting history when doing an svn to git migration is always fun. Especially when you start rewriting the entire contents of files.

Take this ditty for instance:

git filter-branch -f --tree-filter 'for j in $(for i in $(find . -name *.pp | grep -v vendor | grep -v tmp); do cat $i |perl -ne"/(\s+)\S/&&\$e{length(\$1)}++;END {exit 0 if \$e{2}; exit 2}"|| echo $i; done); do cat $j | perl -ne"/^(\s+)/;\$s=length(\$1)/2;\$s=\" \"x\$s;s/^\s+/\$s/;print">"$j.new" && mv "$j.new" $j; done;find . -name *.pp | grep -v vendor | grep -v tmp | xargs puppet-lint --fix >/dev/null ||true' HEAD

This switches my puppet codebase to the canonical 2 space (rather than 4 space) tabs and fixes lint/quoting issues – all the way back through history.
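
To make the core of that one-liner easier to follow, here is a minimal sketch of just the re-indentation step, run against a made-up manifest rather than a real history rewrite. It uses awk instead of the perl in the ditty, and halve_indent is a name invented purely for illustration.

```shell
#!/bin/sh
# Sketch: halve the leading-space indentation of every line on stdin
# (4 spaces -> 2, 8 -> 4). halve_indent is a hypothetical helper name.
halve_indent() {
  awk '{
    match($0, /^ */)                        # RLENGTH = number of leading spaces
    keep = substr($0, 1, int(RLENGTH / 2))  # first half of the indentation
    print keep substr($0, RLENGTH + 1)      # re-attach the rest of the line
  }'
}

# Made-up 4-space manifest, just for demonstration.
printf 'class foo {\n    file { "/tmp/x":\n        ensure => present,\n    }\n}\n' | halve_indent
```

Unlike the real thing, this only looks at leading spaces and does no linting, so treat it as an explanation of the idea rather than a drop-in tree filter.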

I, for one, welcome our newold code layout overlords.

Read Only Bind Mounts and Docker for Unix Domain Sockets

I’ve been playing around with Docker a load recently, and I’ve developed a nice little pattern for sharing sockets that I’m gonna pass on. (By sockets, I specifically mean Unix domain sockets for the purposes of this post.)

A lot of applications can use either a TCP socket or a Unix domain socket to communicate – for example FastCGI and MySQL (http://www.mysql.com/) both allow you to use either mode (and in the case of MySQL, both).

I have a small home server, and I want to run a bunch of applications inside containers (for security and management reasons), and these applications need to speak to a mysql server via a Unix domain socket, which just appears to be a file on the filesystem.

I also want to run the mysql server inside a container – so the mechanics of getting a socket shared between them are a little non-trivial.

Let’s go through a worked example of how I’ve solved this for the dovecot imap and pop3 server talking to mysql. This is an especially fun example, as dovecot runs as root (within its container) so if someone hacked into the server through an exploit in dovecot – they can rm -rf anything I share with that container…

First off, I’ve created some LVM volumes:

mysql vg0 -wi-ao 4.00g

This is the mysql data directory, which will be mounted in the mysql container as /var/lib/mysql

mysql_socket vg0 -wi-ao 4.00m

This is going to get mounted at /socket/mysql inside containers, and will hold the mysql Unix domain socket

vmail vg0 -wi-ao 20.00g

And this is my mail spool for dovecot with all the emails in it.

The immediate problem is that the mysql socket volume is shared between multiple containers (dovecot won’t be the only app using this mysql instance). As dovecot runs as ‘root’, if it gets hacked the attacker can delete the mysql socket, and any other program trying to connect to mysql through that socket will be affected.

We can’t allow that – that would defeat the entire point of using containers for application isolation!

The trick is that a read-only view of the filesystem is enough to connect to a Unix domain socket that some other program has created.

Therefore, we can use bind mounts to fix this: re-binding a read-only (-o ro) view of the file system and giving that to dovecot stops these issues. Easy…

So, our partitions get mounted like this:

/dev/mapper/vg0-vmail on /mnt/volumes/vmail type ext3 (rw,noatime)
/dev/mapper/vg0-mysql on /mnt/volumes/mysql type ext3 (rw,noatime)
/dev/mapper/vg0-mysql_socket on /mnt/volumes/mysql_socket type ext3 (rw,noatime)
/mnt/volumes/mysql_socket on /mnt/volumes_ro/mysql_socket type none (ro,bind)

The last line involves a little trick. When I tried this, to my dismay, it didn’t work.

The /etc/fstab entry associated with it is:

/mnt/volumes/mysql_socket /mnt/volumes_ro/mysql_socket none bind,ro 00 00

But as you can see from the Googles, read only bind mounts are a bit tricky – you have to re-mount them a second time to make them read-only.
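
Done by hand, the two-step dance looks roughly like the sketch below. This is an illustration rather than what my machines actually run: ro_bind_mount and the RUN dry-run hook are made-up names, and the remount line follows the form given in mount(8). With RUN=echo it only prints the commands, so it is safe to try without root.

```shell
#!/bin/sh
# Sketch: bind-mount SRC at DST, then flip the bind mount to read-only
# with a second mount call.
# RUN is a hypothetical dry-run hook: RUN=echo prints the commands instead
# of running them (actually mounting needs root).
ro_bind_mount() {
  src=$1
  dst=$2
  ${RUN:-} mount --bind "$src" "$dst" &&
    ${RUN:-} mount -o remount,bind,ro "$src" "$dst"
}

# Dry run against the paths used in this post.
RUN=echo
ro_bind_mount /mnt/volumes/mysql_socket /mnt/volumes_ro/mysql_socket
```

Unset RUN (and run as root) to perform the mounts for real.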

The trick I use here is twofold. First, I only create them with puppet, mostly using the excellent mounts module (which I only had to patch a little bit). The associated code looks like this:

  mounts { "Mount volume ${name} bind to ro":
    ensure => present,
    source => "/mnt/volumes/${name}",
    dest   => "/mnt/volumes_ro/${name}",
    type   => 'none',
    opts   => 'bind,ro', # Note bind mount ignores ro till you remount, thus the trickery below
  }
  ->
  exec { "/bin/mount -o remount,ro '/mnt/volumes_ro/${name}'":
    onlyif => "/bin/mount -l | /bin/grep '/mnt/volumes/${name} on /mnt/volumes_ro/${name} type none (rw,bind)'"
  }

This could be a notify rather than just ordering – but I like to be paranoid and check these are ok every puppet run.

However, if the machine is freshly rebooted, then the read-only file systems won’t be remounted yet, ergo the second piece of cunning is an upstart script to make sure things are kosher after a reboot:

# /etc/init/remount-ro-bind-mounts.conf - Dirty hack to remount ro bind mounts properly

description "Remount RO bind mount filesystems on boot"

start on local-filesystems

console output

script
  # Take every fstab mount point whose options end in "bind,ro" and
  # remount any that are currently mounted read-write.
  for fs in $(awk '{ print $2 " " $4 }' /etc/fstab | grep 'bind,ro$' | cut -d' ' -f1); do
    if mount | grep rw,bind | awk '{ print $3 }' | grep "$fs" >/dev/null; then
      mount -o remount,ro "$fs"
    fi
  done
end script
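
The first stage of that loop, picking the candidate mount points out of fstab, can be tried in isolation against a sample file. The sketch below is a slight variation rather than the exact pipeline: ro_bind_targets is a made-up name, the sample entries are for demonstration only, and it matches the options field exactly instead of grepping for a suffix.

```shell
#!/bin/sh
# Sketch: list the mount points of fstab entries whose options field is
# exactly "bind,ro". ro_bind_targets is a hypothetical helper name.
ro_bind_targets() {
  # $1 = device, $2 = mount point, $4 = options; skip comment lines.
  awk '$1 !~ /^#/ && $4 == "bind,ro" { print $2 }' "$1"
}

# Throwaway sample fstab, so nothing on the real system is touched.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# <device> <mount point> <type> <options> <dump> <pass>
/dev/mapper/vg0-mysql_socket /mnt/volumes/mysql_socket ext3 rw,noatime 0 0
/mnt/volumes/mysql_socket /mnt/volumes_ro/mysql_socket none bind,ro 0 0
EOF
ro_bind_targets "$sample"
rm -f "$sample"
```

Pointing it at /etc/fstab instead of the sample shows which mount points the boot-time script would consider.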

And there we go – all done(-ish). I just rsync the vmail spool and mysql data from another machine, and start it all up!

The containers look like this:

$ sudo docker ps
ID                  IMAGE                   COMMAND             CREATED             STATUS              PORTS
04a9da050ba8        t0m/dovecot:latest      /start run          25 minutes ago      Up 25 minutes       110->110, 143->143, 993->993, 995->995
20b3a7a65966        t0m/mysql:latest        /start run          About an hour ago   Up About an hour

And they get run with the upstart script from Garth’s excellent docker puppet module:

$ ps aux | grep 'docker run' | grep -v grep
root     16874  0.0  0.0 273744  4944 ?        Ssl  Oct06   0:00 docker run -u mysql -v /mnt/volumes/mysql/data:/var/lib/mysql -v /mnt/volumes/mysql_socket:/socket -m 0 t0m/mysql:latest
root     28498  0.0  0.0 272336  4944 ?        Ssl  Oct06   0:00 docker run -v /mnt/volumes/vmail:/var/vmail -v /mnt/volumes_ro/mysql_socket/run:/socket/mysql -m 0 t0m/dovecot:latest

And last but not least, here’s it working:

$ openssl s_client -quiet -connect localhost:995
depth=0 C = GB, ST = Greater London, L = London, O = bobtfish.net, CN = mail.bobtfish.net
verify error:num=18:self signed certificate
verify return:1
depth=0 C = GB, ST = Greater London, L = London, O = bobtfish.net, CN = mail.bobtfish.net
verify return:1
+OK Dovecot ready.
USER bobtfish@bobtfish.net
+OK
PASS xxxxxx
+OK Logged in.
STAT
+OK 488 15003687

Next up, doing the same trickery for postfix, and then I’ll have a working mail infrastructure that’s entirely containerised.

The Starwars Methodology

This seemingly isn’t a well known term, and today I’m irritated by people thinking I perform ‘magic’ by using grep, so here’s a rant about what I call the “Star Wars” ™ methodology of problem solving:

To put it quite simply:

Use the Source Luke

Oftentimes grepping through the source code of the thing that’s giving you trouble (even if it’s written in a language you don’t speak) will turn up gold.

Given a basic model of OO languages, it’s easily possible to infer what’s going on (or what may be going on) most of the time given basic observations and a few print statements, or even a file and line number.
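
For anyone who hasn’t tried it, the whole technique is about this complicated. The sketch below builds a throwaway source tree (the file name and error message are made up for the demo) and greps it for the message, which hands back the file and line number to start reading from.

```shell
#!/bin/sh
# Sketch: find where an error message comes from by grepping a source tree.
# The tree, file name and message below are made up for the demo.
src=$(mktemp -d)
mkdir -p "$src/lib"
cat > "$src/lib/widget.rb" <<'EOF'
def frob(x)
  raise "cannot frob a nil widget" if x.nil?
  x
end
EOF

# -r: recurse, -n: print line numbers, -F: treat the message as a fixed
# string rather than a regex.
hits=$(cd "$src" && grep -rnF 'cannot frob a nil widget' .)
echo "$hits"
rm -rf "$src"
```

In real life, swap the temporary directory for wherever the misbehaving thing’s source lives and paste in the exact text of the error.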

Getting a full backtrace out of most dynamic languages is real easy too, even in extreme cases.

Even if you don’t know for sure, some source diving will at least help give you more concrete ideas about how the code is built, which will allow you to frame the problem better when you do present it to someone with the relevant language and/or domain knowledge.

Even if your assumptions were totally wrong, you at least took more initiative (and tried to go deeper) than your average Joe, so the expert is more likely to want to help you, as you’ve already demonstrated a provable desire to be taught to fish, rather than just be thrown one.

</rant>

New Blog.

Finally got around to creating a personal blog in which to ramble about stuff I’m thinking about and/or working on.

Expect perl, ruby, puppet, mcollective, sysadmin nonsense & arduinos if you’re lucky.