At Even, we run a lot of tests (~4,200) every time an engineer opens a pull request. At first, running all of those tests took a pile of complicated support code and a lot of time. But since I joined in October 2018, we have totally revamped the way we run tests. This post tells the story of how we made testing faster and easier on our backend services. If you're interested in learning more, or if you want to help us improve testing for our React Native mobile apps, we're hiring.
Testing at Even is as simple as:
bazel test //...
But that is not where we started, and that simple command hides a lot of complexity.
Our backend code is largely made up of the following languages and components (a simplified list of the most complex parts of our system):
- Go services that communicate over gRPC
- Python components tested with tox
- PostgreSQL for persistence
- docker-compose for local orchestration
- AWS services such as S3
We have multiple services that use gRPC for communication. To mimic our production system, we use docker-compose to stitch together containers and support services like PostgreSQL during local testing and development.
Our "unit tests" could be considered integration tests, as most of them exercise code end-to-end (e.g. reading from and writing to PostgreSQL). Furthermore, we rarely mock out interactions with external services, including AWS. When I started at Even, some of our tests relied on things like S3 buckets in our production account!
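To make that concrete, here is a minimal sketch of what one of these tests looks like; the users table, package name, and DATABASE_URL convention are hypothetical, not our actual schema:

package users_test

import (
    "database/sql"
    "os"
    "testing"

    _ "github.com/lib/pq" // PostgreSQL driver
)

// TestCreateUser hits a real PostgreSQL instance instead of a mock.
// DATABASE_URL is assumed to point at the docker-compose database.
func TestCreateUser(t *testing.T) {
    db, err := sql.Open("postgres", os.Getenv("DATABASE_URL"))
    if err != nil {
        t.Fatal(err)
    }
    defer db.Close()

    // Write through the real driver...
    var id int
    err = db.QueryRow(
        `INSERT INTO users (email) VALUES ($1) RETURNING id`,
        "test@example.com",
    ).Scan(&id)
    if err != nil {
        t.Fatal(err)
    }

    // ...and read it back, exercising the same path production uses.
    var email string
    if err := db.QueryRow(`SELECT email FROM users WHERE id = $1`, id).Scan(&email); err != nil {
        t.Fatal(err)
    }
    if email != "test@example.com" {
        t.Fatalf("got %q, want test@example.com", email)
    }
}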
Developers (and our CI system) ran tests using a script that essentially called docker-compose run <service>.test. That test service was a container defined in docker-compose.yml for each component in our backend, and each one duplicated the same basic setup: go test <packages> (for Go) or tox (for Python). This seems simple enough, but it came with a lot of problems and hacks:
- Packages whose tests mutated shared database state had to run serially and in isolation, so the suite could never run fully in parallel.
- Every component duplicated the same docker-compose test-service boilerplate.
- Some tests depended on external resources, like those production S3 buckets.
- Nothing was cached between runs, so every run re-tested everything and the full suite took a long time.
In addition to the testing issues listed above, we had no way to guarantee consistent build tools between developers and CI.
Bazel is a fantastic build tool for monorepos. This post isn't really about why we chose Bazel or what it can do; we wanted reproducible builds and tighter control over our tooling, and Bazel was a good fit.
The first step towards bazel test //... was to get bazel build //... working. Luckily, there is a great tool called gazelle that integrates with bazel to produce BUILD.bazel files for Go projects. We rely heavily on it to update our BUILD.bazel files automatically.
Since we wanted tight control over tooling, bazel is actually a script in our repository that ensures that every developer and our CI system gets the exact same version of the "real" bazel. The script automatically installs the desired version and handles some other complex setup (more on that later).
bazel test //... almost worked out of the box. We ended up hitting the serial/parallel issue mentioned above: bazel wants to run as many actions as possible in parallel, but we required packages with serial tests to run in isolation. My first approach was to run the parallel tests first and then all serial tests with -j 1, which tells bazel to run a single job at a time. The challenge with this approach is that bazel expects tests to be run with the same arguments and environment; if either changes, the cache is invalidated.
For example, suppose we have a test like so:
func TestComplex(t *testing.T) {
    if os.Getenv("COMPLEX_ENABLED") == "true" {
        Complex()
    }
}
Running bazel test //... will skip this test (COMPLEX_ENABLED is empty). Running bazel test //... --test_env COMPLEX_ENABLED=true will run the test again since the environment changed. That is good for running the complex test, but it also invalidates the cache for the first run: a plain bazel test //... will run the original test again even though the code did not change. So a script like the following (the flags are illustrative):
bazel test --parallel //...
bazel test --serial //...
...will always re-run the tests even if nothing has changed. So I decided to remove the package isolation requirement instead.
In order to have bazel test everything in parallel, we needed to remove the shared resource: PostgreSQL. Our serial tests were marked that way because they (a) truncated a table or (b) relied on a pre-existing database state. Since we couldn't give every service a dedicated database container, we decided to use PostgreSQL templates to isolate each test package on the same database server.
CREATE DATABASE foo WITH TEMPLATE bar is a PostgreSQL command that creates a database named foo that is identical to the database bar. Our migration step runs all migrations on our core databases and then makes a copy into a database named <database>_template. Each package that interacts with the database automatically runs CREATE DATABASE source_my_package WITH TEMPLATE source_template and reconnects to the new database. This isolates the package to a copy of the database it needs so that it does not conflict with other tests.
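Concretely, a package's setup might look something like this minimal sketch in a TestMain (the database names and connection strings are illustrative, not our exact implementation):

package mypackage_test

import (
    "database/sql"
    "log"
    "os"
    "testing"

    _ "github.com/lib/pq" // PostgreSQL driver
)

// testDB is this package's private copy of the migrated database.
const testDB = "source_my_package"

func TestMain(m *testing.M) {
    // Connect to the server's maintenance database to issue CREATE DATABASE.
    admin, err := sql.Open("postgres", "postgres://localhost/postgres?sslmode=disable")
    if err != nil {
        log.Fatal(err)
    }

    // Identifiers cannot be bound as query parameters; both names here
    // are trusted constants, so string concatenation is safe.
    if _, err := admin.Exec("DROP DATABASE IF EXISTS " + testDB); err != nil {
        log.Fatal(err)
    }
    if _, err := admin.Exec("CREATE DATABASE " + testDB + " WITH TEMPLATE source_template"); err != nil {
        log.Fatal(err)
    }
    admin.Close()

    // Tests in this package now connect to testDB, where they can
    // truncate tables or seed state without affecting any other package.
    os.Exit(m.Run())
}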
Our tests interact with other services that need to talk to the database as well. Our gRPC layer forwards the new database name to remote services (running under docker) so the remote service will use the same copy.
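One way that forwarding could be implemented is with gRPC metadata; the sketch below is an assumption about the mechanism, not our exact code, and the metadata key and helper names are made up:

package dbforward

import (
    "context"

    "google.golang.org/grpc"
    "google.golang.org/grpc/metadata"
)

// mdKey is a hypothetical metadata key carrying the per-test database name.
const mdKey = "x-test-database"

// UnaryClientInterceptor attaches the caller's database name to every
// outgoing RPC so remote services use the same per-test copy.
func UnaryClientInterceptor(dbName string) grpc.UnaryClientInterceptor {
    return func(ctx context.Context, method string, req, reply interface{},
        cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
        ctx = metadata.AppendToOutgoingContext(ctx, mdKey, dbName)
        return invoker(ctx, method, req, reply, cc, opts...)
    }
}

// DatabaseFromContext lets a server pick up the forwarded name,
// falling back to its default database when none was sent.
func DatabaseFromContext(ctx context.Context, fallback string) string {
    if md, ok := metadata.FromIncomingContext(ctx); ok {
        if names := md.Get(mdKey); len(names) > 0 {
            return names[0]
        }
    }
    return fallback
}

A test client would then dial with grpc.WithUnaryInterceptor(UnaryClientInterceptor(testDB)), and every downstream service would transparently operate on the same isolated copy.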
bazel is a powerful tool, but that power comes at the cost of complexity, especially for the engineers on our team who had never used Bazel before. That’s why we integrated it into the test scripts our developers already knew.
In the old system, all developers needed to do was run a single command, run-tests foo, which took care of docker-compose, database migrations, and testing the targeted package. bazel is not a tool for managing long-running applications like docker. I mentioned earlier in the post that our bazel is a script that wraps the real bazel binary. This script does a lot to ease the transition for developers to a new command:
- docker-compose setup
- passing configuration to bazel via --test_env arguments
- running gazelle
- translating Go-style package patterns into bazel targets (./... -> //...)

When a developer runs bazel test //..., they will first see a run of gazelle to automatically update BUILD.bazel files. Next, bazel will run a dedicated docker-compose service that ensures all dependencies are up and available. Then, bazel runs all migrations. Finally, bazel tests the desired packages.
We added flags to bazel to control all of the above behaviors in order to iterate much faster. And we now have a command, bazel watch, which runs ibazel under the hood to automatically test the desired packages when changes are made. Using bazel watch is very similar to automatic build+test in an IDE.
I mentioned Python earlier as one of our development languages. bazel's Python support is not as advanced as its support for other languages, but we still wanted bazel test //... to work with our Python code.
sh_test is a bazel rule that runs a shell script as a test target. We do not test our Python code directly with bazel, because managing those dependencies within bazel is incredibly complex and requires native libraries that are difficult to manage across developer machines and CI. Instead, we continue to use the docker container for running tox. Our sh_test includes all dependencies (Dockerfile, docker-compose.yml, etc.) and Python sources in its data argument, so bazel only needs to run these tests when one of those files changes. The script invokes docker-compose to run tox within the container. The nice part is that we have tightened up the dependency set so that these tests are easy to cache.
The last big component of our testing infrastructure is localstack. We use localstack to mock out AWS services required by tests. Localstack does a great job of duplicating the behavior of most AWS services. This allows us to write tests that utilize the same code path as production — we just point the test to our localstack instance.
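For example, a test can build its S3 client against localstack instead of real AWS; this sketch uses aws-sdk-go, and the endpoint, region, and credentials are assumptions about a typical local setup:

package storage_test

import (
    "testing"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func TestUpload(t *testing.T) {
    // Point the SDK at localstack. The code under test makes the same
    // S3 API calls it would make in production; only the endpoint differs.
    sess, err := session.NewSession(&aws.Config{
        Region:           aws.String("us-east-1"),
        Endpoint:         aws.String("http://localhost:4566"), // assumed localstack edge port
        Credentials:      credentials.NewStaticCredentials("test", "test", ""),
        S3ForcePathStyle: aws.Bool(true), // localstack serves buckets path-style
    })
    if err != nil {
        t.Fatal(err)
    }
    client := s3.New(sess)

    // Create the bucket the code under test expects, then exercise it.
    if _, err := client.CreateBucket(&s3.CreateBucketInput{
        Bucket: aws.String("test-bucket"),
    }); err != nil {
        t.Fatal(err)
    }
    // ... run the code under test against client ...
}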
At this point we are heavily invested in bazel. We use it for linting, docker build and push, code generation, and infrastructure. We have made incredible progress, but we have a lot more to do. We still have several docker images that exist outside of bazel, and our Python code is not properly integrated. Our client codebase has just gotten started with bazel and has an entirely unique set of challenges. If you want to help us do more with Bazel across our entire codebase, we’re hiring.
Patrick is a Staff Software Engineer working remotely from North Carolina. He’s always looking for ways to make the right thing also be the easy thing, especially when it comes to the tools and internals in our codebase.