We have a collection of nearly 700 integration tests (and growing!) in over 130 suites that we run against a live MySQL database. These tests are mostly performing GraphQL mutations, and the tests basically just assert that the mutation does what it’s meant to do.

As time goes on and we add more and more tests, the test suite was becoming slower and slower…

Resetting the DB between tests

To ensure tests aren’t too fragile, early on we added a beforeAll routine that reset the database between each suite.

To reset the database, we were basically doing this:

# reset.sh
mysql -e 'DROP DATABASE test;'
mysql -e 'CREATE DATABASE test;'
mysql test < fixtures.sql

This is super simple and just works, but it’s a pretty heavy operation. Our product has over 400 tables, and our set of fixture data is quite large because the product is a data-intensive app and it’s just a big product. Over time, as one might expect, the tests became slower and slower, and the load on our CI server just kept climbing.

So at the point where it was taking ~20 mins to run the suite over 4 workers, I started thinking about ways to improve performance.

Ideas

There’s lots of writing on this you can find via Google and we had some of our own ideas flying around too. Here’s a list of a few things we thought about:

  • Building a complete docker container once per run, database and all, and then just spin up the container once for each suite.
  • Attempt to find a way to run tests within transactions, so they could just be rolled-back at the end. (This is pretty common practice, but in our case, was impossible.)
  • Just… don’t reset after every suite. Or maybe group tests in a way where they won’t interfere with eachother. (We thought this would be too error prone and fragile.)
  • Write tests that set up their own fixtures. i.e. write them in such a way where it doesn’t matter what was run before them. (We thought this would be tedious; we already put a lot of effort in our dev seed dataset.)
  • And probably a few more that I’ve forgotten to mention.

All of these have their pros/cons. Ultimately though I was searching for a bit of a golden ticket. I wanted something that I could whip up fast (didn’t require loads of effort/refactoring to existing tests), and something I could deploy fast (didn’t require infra changes or big changes to CI scripts).

Using triggers

I came up with the idea that we could use triggers to record which tables changed during any given test, and then on the run of the next test, only reset the changed tables. In most cases only one or two tables change so resetting just two tables is significantly faster than re-creating the entire database.

Here’s what I came up with:

  • The initial database is created and seeded with fixtures as usual
  • Create a copy of all tables to act as “seed tables” - they contain the pristine seed data
  • Create a trigger on each table that records when any INSERT/UPDATE/DELETE is performed
  • Tests run as usual
  • Between tests, interrogate the record of changes to find which tables were changed, and re-seed those tables from the copies.

In Practice

I created a Laravel command to “install test helpers”. This reads the schema and creates the necessary triggers on each table, then creates the necessary copy, and then also creates a special stored procedure to handle resetting the tables after each test.

Here’s what it looks like:

<?php

namespace App\Console\Commands\Test;

use Illuminate\Console\Command;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Str;

class InstallHelpersCommand extends Command
{
    protected $signature = '
        test:install-helpers
    ';

    public function handle()
    {
        $this->getOutput()->write('Create _test_tbl_changed ... ');
        DB::unprepared('DROP TABLE IF EXISTS `_test_tbl_changed`');
        DB::unprepared('
            CREATE TABLE `_test_tbl_changed` (
              `id` bigint NOT NULL AUTO_INCREMENT,
              `tbl` varchar(255) NOT NULL,
              PRIMARY KEY (`id`)
            ) ENGINE=MEMORY DEFAULT CHARSET=utf8mb4
        ');
        $this->getOutput()->writeln('OK');

        $this->getOutput()->write('Create _testRefillFromSeed procedure ... ');
        DB::unprepared("DROP PROCEDURE IF EXISTS _testRefillFromSeed");
        DB::unprepared("
            CREATE PROCEDURE _testRefillFromSeed()
            BEGIN
                DECLARE done INT DEFAULT 0;
                DECLARE numReset INT DEFAULT 0;
                DECLARE tblName VARCHAR(255) DEFAULT NULL;

                DECLARE cur CURSOR FOR
                    SELECT tbl FROM _test_tbl_changed GROUP BY tbl;
                DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

                SET @dp_is_refilling=1;

                OPEN cur;
                SET FOREIGN_KEY_CHECKS = 0;

                REPEAT
                    FETCH cur INTO tblName;

                    IF tblName IS NOT NULL THEN
                        SET @TruncateQuery = CONCAT(
                            'TRUNCATE ', tblName
                        );
                        PREPARE stmt FROM @TruncateQuery;
                        EXECUTE stmt;

                        SET @InsertQuery = CONCAT(
                            'INSERT ', tblName, ' ',
                            'SELECT * FROM `_test_seed_', tblName, '`'
                        );
                        PREPARE stmt FROM @InsertQuery;
                        EXECUTE stmt;

                        SET numReset = numReset + 1;
                    END IF;
                UNTIL done = 1 END REPEAT;
                CLOSE cur;

                SET FOREIGN_KEY_CHECKS = 1;
                TRUNCATE TABLE _test_tbl_changed;
                SET @dp_is_refilling=0;

                SELECT numReset AS 'Number of tables reset';
            END
        ");
        $this->getOutput()->writeln('OK');

        $tableNames = collect(DB::select("SHOW TABLES"))->map(fn ($r) => reset($r));

        $this->getOutput()->write('Creating change watch triggers ');
        foreach ($tableNames as $tbl) {
            if ($tbl === '_test_tbl_changed') {
                continue;
            }

            foreach (['INSERT', 'UPDATE', 'DELETE'] as $event) {
                $this->getOutput()->write('.');
                $id = md5("change_{$tbl}_{$event}");
                DB::statement("DROP TRIGGER IF EXISTS _test_{$id}");
                DB::statement("
                    CREATE TRIGGER _test_{$id}
                    AFTER {$event} ON `$tbl` FOR EACH ROW
                    BEGIN
                        IF @dp_is_refilling IS NULL OR @dp_is_refilling != 1 THEN
                            INSERT INTO _test_tbl_changed (tbl) VALUES ('$tbl');
                        END IF;
                    END
                ");
            }
        }
        $this->getOutput()->writeln(' OK');

        $this->getOutput()->write('Copying seed tables ');
        foreach ($tableNames as $tbl) {
            $newTbl = '_test_seed_' . $tbl;

            if (Str::startsWith($tbl, '_test_')) {
                continue;
            }

            $this->getOutput()->write('.');
            DB::statement("DROP TABLE IF EXISTS `$newTbl`");
            DB::statement("CREATE TABLE `$newTbl` LIKE `$tbl`");
            DB::unprepared("
                INSERT `$newTbl`
                SELECT * FROM `$tbl`
            ");
        }
        $this->getOutput()->writeln(' OK');
    }
}

All you need to do is have the test runner execute the special _testRefillFromSeed stored procedure after ever test. In our case, we had an env var that contained the reset command and it was just a matter of setting it:

DB_RESET_CMD="mysql test -N -B -e 'CALL _testRefillFromSeed();'"

And from the Jest side looks a little like

import { spawnSync } from 'child_process';
beforeAll(() => {
  const options = {
    shell: true,
    cwd: __dirname,
    env: {...process.env, ...env},
    stdio: [0, 1, 2]
  };
  
  const status = spawnSync(process.env.DB_RESET_CMD, null, options).status;
  if (status) {
    console.error('Failed to reset db with command: ' + process.env.DB_RESET_CMD);
    process.exit(status);
  }
});

Easy peasy. Hit all the goals: No changes to tests needed, no changes to infra, and everything is faster with less load on the CI server.