Speeding up Postgres integration tests in Go

Written 10th of June 2021

I've found a really nice way to speed up Postgres database integration tests considerably. It gave me a 3x speedup. Before, I ran database migrations setup and teardown before and after every integration test, using a migrate tool and a long-running Docker database container.

The solution involves using the Postgres template database functionality. The code examples in this post are going to be in Go, but this method applies to any other language where you're using integration tests with Postgres.

Before, it was slow

In a large project I'm currently working on, we used to use golang-migrate to setup and teardown database tables before and after every integration test. This makes it really easy to make sure that you have a clean database state before every test run, to ensure test isolation.

However, it wasn't free: the setup of our 30+ table migration files took around 100 ms, and about the same for the teardown, adding about a fifth of a second for every integration test. This quickly adds up, and we ended in a situation where testing our main storage package on our beefed up development machines took 30 seconds, and much more on the loaded CI machines. This doesn't sound like much, but when you like to test early and often, it quickly becomes a nuisance.

It's just very nice to do things like testing quickly, especially when you're used to having the rest of the Go toolchain be so fast about everything. There's just something about tool speed that's satisfying.

Then, it got faster

So I researched other approaches on doing this. Perhaps it would be faster to somehow merge the migrations? Run the database from RAM? Turn off fsync for Postgres or some other obscure startup flag? Some of these helped a little, but not as much as I had hoped and anticipated. Especially running the database from a RAMdisk didn't speed up the tests as much as I had thought.

But then I stumbled upon the Postgres feature of template databases. Basically, every new Postgres database is implicitly copied from a template database (called template1). Any changes you make to template1 before creating a new database is then copied to that database. And importantly, this copy is a cheap filesystem operation. Sounds perfect!

The approach

I usually have a testing helper function to setup and teardown my databases. It's called CreateDatabase, and it does something like this:

  1. Only the first time it runs: Connect to the Postgres database template1 and run the migrations with golang-migrate.
  2. Drop the testing database (let's call it testdb) if it exists, and then recreate it. This last query implicitly copies from template1.
  3. Return a database connection to the new database, along with a cleanup function.

In code, it looks something like this (with some details left out for brevity):

package integrationtest

import (
	"database/sql"
	"sync"
	"time"

	_ "github.com/jackc/pgx/v4/stdlib"
)

var initDatabaseOnce sync.Once

// CreateDatabase for testing.
// Usage:
// db, cleanup := integrationtest.CreateDatabase()
// defer cleanup()
func CreateDatabase() (*sql.DB, func()) {
	initDatabaseOnce.Do(initDatabase)

	db, cleanup := connect("postgres")
	defer cleanup()

	dropConnections(db, "template1")
	dropConnections(db, "testdb")

	if _, err := db.Exec(`drop database if exists testdb; create database testdb`); err != nil {
		panic(err)
	}

	return connect("testdb")
}

func initDatabase() {
	db, cleanup := connect("template1")
	defer cleanup()

	for _, err := db.Exec(`select 1`); err != nil; {
		time.Sleep(100 * time.Millisecond)
	}

	// Do test database migrations here
}

func connect(name string) (*sql.DB, func()) {
	db, err := sql.Open("pgx", "postgresql://test:123@localhost:5432/"+name+"?sslmode=disable")
	if err != nil {
		panic(err)
	}
	return db, func() {
		if err := db.Close(); err != nil {
			panic(err)
		}
	}
}

func dropConnections(db *sql.DB, name string) {
	query := `
		select pg_terminate_backend(pg_stat_activity.pid)
		from pg_stat_activity
		where pg_stat_activity.datname = $1 and pid <> pg_backend_pid()`
	_, err := db.Exec(query, name)
	if err != nil {
		panic(err)
	}
}

So what's going on?

First, we do the initialization using a sync.Once. You could have done this in TestMain instead, but I like having it here in a single file, so it's easy to figure out what's going on. initDatabase just connects to the template1 database and runs the migrations.

Next, we reconnect to the database, now to the default postgres database. This is so we can drop the existing testdb (from failed tests perhaps) and recreate it. We do that using drop database if exists testdb; create database testdb. You could have written create database testdb template template1 with the exact same result, and this is where we use the templating trick. The template has all the migrations set up in initDatabase, so now testdb has them, too!

As you can see, then there's a small hack in there to close existing connections that might be hanging, because otherwise we can't create a new database from the template. I'm not sure if there's a better way to do this (there probably is), but this seems to be working for me.

Finally, we connect to testdb and return the connection, along with a cleanup function that closes the connection.

The accompanying Docker compose configuration looks like this:

version: '3.8'
services:
  db:
    image: postgres:13
    environment:
      POSTGRES_USER: test
      POSTGRES_PASSWORD: 123
      POSTGRES_DB: template1

It's just a detail that I'm using Docker compose to spin up the test container. This would work equally well without Docker.

Now, it's fast

The results speak for themselves. Before:

$ go test ./storage
ok  	project/storage	33.304s

After:

go test ./storage
ok  	project/storage	9.880s

Overall, a 3x speedup, that made running our integration test suite much nicer again. An hour of investigation well spent. 😎

About me

A picture of Markus.

I'm Markus, a professional software consultant and developer. 🤓✨ You can reach me at markus@maragu.dk or @markusrgw.

I'm currently building Go courses over at golang.dk.