Making Database Testing More Reproducible With Testcontainers
Meet Richard North, the creator of Testcontainers, a promising new tool for reproducible database and user-interface testing.
In this series, we interview someone we find exciting in our industry from a jOOQ perspective. This includes people who work with SQL, Java, open source, and a variety of other related topics.
Q: You work at Skyscanner, which means you have tons of travel data to work with. What’s the most exciting thing about working with this data?
A: Skyscanner is a really data-led company, using data at all levels in decision-making. The volume of data that we gather and process really helps us understand what travelers want and allows us to serve them better. For example, our destination recommender system helps people discover new and interesting places to go based on a vast amount of data, algorithms, and experiments. But it’s interesting how this varies from recommendations in internet media companies. There are far fewer possible destinations than books, songs, and movies, yet users’ reasons for traveling and tastes can be more nuanced and varied.
There’s yet more data-oriented work under the surface, too. For example, the infrastructure needed to gather and analyze such large amounts of data, and the running of experiments to help us improve. There are a lot of smart people working hard to make this all happen, and it’s an exciting place to be!
Q: You’ve created an increasingly popular testing framework, Testcontainers. What made you do it? What itch does it scratch?
A: Well, I think it’s something that scratches several itches at the same time — things that previously we only had isolated solutions to. The common element, though, is reproducibility of test environments. Of all my time developing and writing tests for JVM-based systems, it’s always been the non-JVM dependencies that caused the most complexity, unreliability, and maintenance overhead.
I remember my first day as a developer, years ago. I was given a desktop machine and two days' worth of step-by-step instructions that I needed to follow, just so that I'd be able to develop and run tests with all dependencies in place. A few months later, I had to repeat the same task many times over when building new CI servers.
A lot has changed since then in terms of how we deploy and manage our production infrastructure, and thankfully, Docker has done a lot to further bring prod-parity to developers’ machines.
Testcontainers started out as my effort to bring the full power of Docker to integrated testing on the JVM, in the two areas where I've experienced the most pain: testing against a clean, representative database, and making browser-based Selenium testing more reproducible, both for developers and on CI.
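As a rough illustration of the database half of that, a JUnit 5 test can declare a throwaway PostgreSQL container and talk to it over plain JDBC. This is only a sketch: it assumes the Testcontainers core, JDBC, and JUnit 5 artifacts are on the test classpath and that a Docker daemon is available; the test name and image tag are illustrative.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import org.junit.jupiter.api.Test;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

import static org.junit.jupiter.api.Assertions.assertTrue;

// A test against a real, throwaway PostgreSQL instance rather than an
// in-memory emulation. Testcontainers starts the Docker container before
// the tests run and tears it down afterwards.
@Testcontainers
class CleanDatabaseTest {

    // One fresh container for the test class; the image tag is an example.
    @Container
    static final PostgreSQLContainer<?> postgres =
            new PostgreSQLContainer<>("postgres:15-alpine");

    @Test
    void connectsToARealDatabase() throws Exception {
        try (Connection conn = DriverManager.getConnection(
                postgres.getJdbcUrl(),
                postgres.getUsername(),
                postgres.getPassword());
             ResultSet rs = conn.createStatement()
                     .executeQuery("SELECT version()")) {
            assertTrue(rs.next());
            // A real PostgreSQL server, so version() reports "PostgreSQL ..."
            assertTrue(rs.getString(1).startsWith("PostgreSQL"));
        }
    }
}
```

Because the container is genuinely fresh for each test run, there is no shared state to clean up between runs, which is exactly the reproducibility the tool is after.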
Q: We're mostly curious about database testing here. Your documentation mentions Testcontainers as an alternative to using H2 as a test database. What are the disadvantages of emulating a database with something like H2? Do you have any personal experience with that?
A: Yes, definitely. It was one of the tipping-point factors that triggered me to create Testcontainers. I do think H2 is a fantastic piece of work in what it manages to deliver, and it's something I've used on a number of projects to good effect.
However, compatibility with real databases has often been a sticking point. Back in 2015, before I started Testcontainers, we were struggling with a few MySQL features that didn’t have equivalents in H2. We were facing the unpleasant prospect of having to constrain our implementation to what H2 would allow us to test against. It became fairly obvious that there was a gap in the market for an H2-like tool that was actually a facade to a Docker-based database container — and Testcontainers was born.
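That facade ended up as Testcontainers' JDBC URL support: swapping an H2 connection string for a `jdbc:tc:` one makes the driver spin up a real database container transparently. A sketch of what the swap looks like in a typical datasource configuration (the property names are illustrative; the `tc:` URL scheme and the `ContainerDatabaseDriver` class are Testcontainers features):

```properties
# Before: in-memory H2 standing in for MySQL
# datasource.url=jdbc:h2:mem:test;MODE=MySQL

# After: a real MySQL 8 instance in a throwaway Docker container
datasource.url=jdbc:tc:mysql:8.0:///test
datasource.driver-class-name=org.testcontainers.jdbc.ContainerDatabaseDriver
```

The test code itself is unchanged; only the connection string differs, which is what makes it a drop-in alternative to H2.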
Q: What do you think of mocking the database at the various layers: the DAO layer, the service layer, and so on?
A: I’m all in favor of keeping tests small, light, and layered, and using mocks to accomplish this. This might sound strange coming from somebody who has developed an integrated testing tool, but it’s true!
Still, I feel that we need to be pragmatic about how we approach automated tests and how we make sure we’re testing the right thing — especially when crossing boundaries. Are we testing how this code behaves against reality or are we testing against our own (potentially false) understanding of how external components work?
My feeling is that it’s quite straightforward to mock layers of your system that you yourself wrote, or where you can easily jump into the source code, a spec, or documentation. With an external component, you can still produce a mock that behaves how you expect or how you witness the real thing behaving. But does that mock continue to represent the real thing, especially after accretion of other features, or the additional perils of the state that a database entails — schema changes and actual data?
My ideal is to mock the data access layer for consumption by higher layers but to be quite careful about what the data access layer itself talks to in my tests. It should probably be a real database. Hopefully, Testcontainers is one tool that helps make this particular thing a little less painful so that when you find yourself needing to do this, there’s a way to do it easily.
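One way to read that advice in code: stub the data access layer when testing the layers above it, and reserve the real database for tests of the data access layer itself. The interfaces below are invented for illustration, and a hand-rolled stub stands in for a mocking framework to keep the sketch self-contained.

```java
import java.util.Optional;

// Hypothetical DAO interface; higher layers depend only on this.
interface TripDao {
    Optional<String> findDestination(long tripId);
}

// Hypothetical service built on top of the DAO.
class TripService {
    private final TripDao dao;

    TripService(TripDao dao) {
        this.dao = dao;
    }

    String describe(long tripId) {
        return dao.findDestination(tripId).orElse("unknown destination");
    }
}

class LayeredTests {
    public static void main(String[] args) {
        // Service-layer test: the DAO is stubbed, no database involved.
        TripDao stub = id -> Optional.of("Barcelona");
        if (!new TripService(stub).describe(42L).equals("Barcelona")) {
            throw new AssertionError("service should pass through the DAO result");
        }

        // The real TripDao implementation would be tested separately,
        // against a real database (e.g. one started by Testcontainers).
        System.out.println("service-layer test passed");
    }
}
```

The split keeps the fast, mocked tests fast, while the slower container-backed tests are confined to the one layer that actually crosses the boundary into the database.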
Q: What’s the biggest challenge you’ve faced when testing databases, or other things?
A: It’s not databases, but I’d say that by far, the hardest testing challenge I’ve faced as a developer is mobile apps — especially iOS. I’ve always enjoyed mobile development as a whole, but when switching from a Java server-side/web project to mobile, it really feels like you’re going back in time. Some of the challenges are harder, such as asynchronicity and platform APIs that make it harder to structure software in a testable way. But it also feels like the tooling is much further behind, and until quite recently it received far less attention. I feel the net result has been that developers have been discouraged from investing in automated tests, which is sad given that we know how valuable they can be.
Things are getting better, but I do greatly prefer the testing aspects of working on server-side JVM projects. For all its difficulties, we are actually quite lucky!
Published at DZone with permission of Lukas Eder, DZone MVB. See the original article here.