Unit Testing Large Codebases: Principles, Practices, and C++ Examples
This guide shares best practices for writing scalable unit tests in large C++ codebases, covering dependency injection, Google Test, and Google Mock.
Unit tests are often overlooked in the software development process, but writing them has many useful side effects. After more than a decade of writing production code for planet-scale applications serving billions of users, I can confidently say that unit tests hold a critical place in the software development lifecycle.
Despite their importance, many engineers skip unit tests because of timeline constraints or overreliance on manual testing. There is also a misconception that unit tests slow down software development, which is not necessarily true. In fact, one study found that test-driven development (TDD) may have a positive impact on software development productivity. In the long run, unit tests make iterating on code easier and faster.
In my experience, unit tests also have secondary benefits beyond developer productivity:
- Unit tests reduce code review time by making the code easier to understand and giving reviewers confidence through clear, reliable test coverage.
- Unit tests lower onboarding time for new engineers, because writing them forces the code to be better structured and more modular.
- Unit tests make it possible to refactor confidently, because changes can be validated quickly by running the tests.
- Unit tests improve the reliability of the system, since a codebase with high unit test coverage exercises many edge cases. These tests help catch hard-to-debug production issues at the development stage.
- Unit tests improve the structure of the code, because you have to follow principles like dependency injection to write them effectively. If the code is well structured, you can also use LLM-based tools like GitHub Copilot to generate test cases for it.
What Is a Good Unit Test?
It is important to understand the properties of a good unit test and to avoid bad practices when writing one.
A good unit test has the following properties:
Is Deterministic
A good unit test is deterministic: if you run it multiple times, it always produces the same result. Running a test many times (stress testing it) is one way to validate this. To understand why unit tests become non-deterministic, we can look at some anti-patterns and bad practices:
Bad Practice 1: The unit test depends on an external environment, and when that environment changes, the test becomes flaky. Examples include depending on a certain command-line parameter, on a certain config value being set, or on a certain response from an RPC call made in the test.
- Solution: A unit test should be self-contained and set up its dependencies internally or through the test suite framework. It should not make assumptions about external dependencies that it doesn't set up.
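As a minimal sketch of the solution above, the test below builds the settings it needs instead of reading a flag or environment variable. The names Settings, shouldRetry, and the test function are hypothetical and only illustrate the idea. To stay dependency-free, the test returns a bool rather than using a test framework.

```cpp
#include <cassert>

// Hypothetical settings object: passed in explicitly instead of being read
// from a command-line flag, environment variable, or global config.
struct Settings {
    int maxRetries = 3;
};

// Hypothetical code under test: returns true while another attempt is allowed.
bool shouldRetry(int attemptsSoFar, const Settings& settings) {
    return attemptsSoFar < settings.maxRetries;
}

// The test creates the exact Settings it relies on, so it behaves the same
// on every machine and in every environment.
bool testShouldRetryStopsAtConfiguredLimit() {
    // Arrange: build the dependency inside the test.
    Settings settings;
    settings.maxRetries = 2;

    // Act
    const bool allowedBelowLimit = shouldRetry(1, settings);
    const bool allowedAtLimit = shouldRetry(2, settings);

    // Assert
    return allowedBelowLimit && !allowedAtLimit;
}
```

Because nothing is read from outside the test, changing the machine's configuration cannot change the result.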
Bad Practice 2: The unit test assumes things about the code or hardware environment that are non-deterministic. Unit tests should be environment agnostic. A few examples:
- The unit test assumes that a certain complex function takes more than 100 ms to execute and validates that the elapsed time is always greater than 100 ms. Execution time depends on the environment, so unit tests should not make this assumption.
- The unit test assumes that int is 32 bits wide. This is not hardware agnostic in C++; use fixed-width integer types like int32_t and int64_t when the size matters.
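One possible way to avoid timing assumptions is to inject a clock, so the test controls time instead of measuring it. The sketch below assumes a hypothetical ClockIf interface, a FakeClock test double, and an isExpired function; none of these come from a real library. The final static_assert shows a size check that holds wherever the fixed-width type exists.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical clock interface so code under test never reads real time directly.
class ClockIf {
public:
    virtual ~ClockIf() = default;
    virtual std::chrono::milliseconds now() = 0;
};

// Test double: time only moves when the test advances it.
class FakeClock : public ClockIf {
public:
    std::chrono::milliseconds now() override { return now_; }
    void advance(std::chrono::milliseconds delta) { now_ += delta; }
private:
    std::chrono::milliseconds now_{0};
};

// Hypothetical code under test: reports whether the deadline has been reached.
bool isExpired(ClockIf& timeSource, std::chrono::milliseconds deadline) {
    return timeSource.now() >= deadline;
}

bool testIsExpiredOnlyAfterDeadline() {
    // Arrange
    FakeClock fakeClock;
    const std::chrono::milliseconds deadline{100};

    // Act: check before and after moving the fake time forward.
    const bool expiredBefore = isExpired(fakeClock, deadline);
    fakeClock.advance(std::chrono::milliseconds{150});
    const bool expiredAfter = isExpired(fakeClock, deadline);

    // Assert: no real sleeping or wall-clock measurement is involved.
    return !expiredBefore && expiredAfter;
}

// int32_t has exactly 31 value bits plus a sign bit on every platform that
// provides it, unlike plain int, whose width is implementation-defined.
static_assert(std::numeric_limits<std::int32_t>::digits == 31,
              "int32_t is exactly 32 bits wide");
```

The test finishes instantly and gives the same answer on a slow CI machine and a fast workstation.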
Bad Practice 3: The unit test assumes something about the output of a library function that the library does not guarantee. Developers should read the specification of a library function before relying on such behavior. A few examples:
- Assuming that a map iterates in insertion order. A standard hash map in C++ like std::unordered_map does not give this guarantee. If you insert values in the order {1,2,3}, iteration may yield {1,2,3}, {2,1,3}, or something else. If the test assumes that iteration order matches insertion order, it may be flaky.
- Assuming that the JSON serialization of a dynamic object in C++ is always the same string. The Folly C++ library supports dynamic objects and can serialize them to JSON. However, a serialization call like folly::toJson(obj) is not guaranteed to return the same string every time, because the order of keys in the JSON output is not guaranteed. A test that compares against a fixed string may therefore fail.
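For the hash map case, a test can stay deterministic by comparing contents without relying on iteration order. This is a small sketch using only the standard library; keysOf is a hypothetical helper standing in for the code under test.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical code under test: collects the keys of a hash map in
// whatever order the map happens to iterate.
std::vector<int> keysOf(const std::unordered_map<int, int>& values) {
    std::vector<int> keys;
    keys.reserve(values.size());
    for (const auto& entry : values) {
        keys.push_back(entry.first);
    }
    return keys;
}

bool testKeysOfReturnsAllKeys() {
    // Arrange
    const std::unordered_map<int, int> values{{1, 10}, {2, 20}, {3, 30}};

    // Act
    std::vector<int> keys = keysOf(values);

    // Assert: sort first, so the check does not depend on iteration order.
    std::sort(keys.begin(), keys.end());
    return keys == std::vector<int>{1, 2, 3};
}
```

With Google Mock, the UnorderedElementsAre matcher expresses the same intent without the explicit sort.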
How do you stress test your unit tests for flakiness to make sure they are deterministic?
- A well-written unit test should pass deterministically when run many times in parallel. For unit tests written with the Google Test framework (GTest), you can use a tool like gtest-parallel, which runs tests in parallel to help detect flakiness.
- Example:
  $ ./gtest-parallel out/{binary1,binary2,binary3} --repeat=1000 --workers=128
Has Single Purpose
- A good unit test has a single purpose and covers only one case. The name of the unit test should indicate that case.
- When a unit test fails, it should be easy to tell from its name which part of the underlying code is failing.
- This keeps unit tests small, readable, and maintainable in the long run.
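To illustrate, here is a sketch with one test per behavior, each named after the case it covers. The function clampToRange and the test names are hypothetical; the tests return bool to keep the example free of framework dependencies.

```cpp
#include <cassert>

// Hypothetical function under test: limits value to the range [low, high].
int clampToRange(int value, int low, int high) {
    if (value < low) return low;
    if (value > high) return high;
    return value;
}

// One case per test; a failure points directly at the broken behavior.
bool testClampToRangeReturnsLowWhenValueIsBelowRange() {
    return clampToRange(-5, 0, 10) == 0;
}

bool testClampToRangeReturnsHighWhenValueIsAboveRange() {
    return clampToRange(15, 0, 10) == 10;
}

bool testClampToRangeReturnsValueWhenInsideRange() {
    return clampToRange(7, 0, 10) == 7;
}
```

If only the second test fails, the name alone says that the upper bound handling is broken.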
Avoids Duplicate Setup
- Avoiding duplicate code is a general good practice and is not specific to unit tests.
- However, it is common for unit tests to copy and paste code across test cases with minor changes, often data preparation or setup code. This makes unit tests harder to maintain. To avoid it:
  - Extract the logic that sets up the test environment into separate utility functions.
  - Extract the logic that asserts on results into utility functions when similar validation is used in multiple tests.
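The two recommendations above might look like the following sketch, where one helper prepares shared test data and another performs a shared check. All names here (User, filterAdults, makeSampleUsers, hasNames) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct User {
    std::string name;
    int age;
};

// Hypothetical code under test: keeps users whose age is at least minAge.
std::vector<User> filterAdults(const std::vector<User>& users, int minAge) {
    std::vector<User> result;
    for (const User& user : users) {
        if (user.age >= minAge) result.push_back(user);
    }
    return result;
}

// Shared setup helper: every test gets its own fresh copy of the data.
std::vector<User> makeSampleUsers() {
    return {{"Ana", 17}, {"Ben", 18}, {"Cy", 42}};
}

// Shared assertion helper: checks that the names match in order.
bool hasNames(const std::vector<User>& users,
              const std::vector<std::string>& expected) {
    if (users.size() != expected.size()) return false;
    for (std::size_t i = 0; i < users.size(); ++i) {
        if (users[i].name != expected[i]) return false;
    }
    return true;
}

bool testFilterAdultsKeepsUsersAtOrAboveMinAge() {
    return hasNames(filterAdults(makeSampleUsers(), 18), {"Ben", "Cy"});
}

bool testFilterAdultsReturnsEmptyWhenNoUserQualifies() {
    return hasNames(filterAdults(makeSampleUsers(), 100), {});
}
```

In Google Test, a test fixture class can play the role of the setup helper, but plain functions are often enough.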
Is Independent/Isolated
- Unit tests should be independent and isolated. There should be no dependency between two tests.
- The point above about avoiding duplicate setup also helps keep tests independent, because the setup of one test shouldn't affect other tests.
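A simple way to get this isolation is for each test to construct its own objects instead of sharing a global instance. The sketch below uses a hypothetical Counter class; because no state is shared, the tests pass in any order.

```cpp
#include <cassert>

// Hypothetical class under test.
class Counter {
public:
    void increment() { ++count_; }
    int count() const { return count_; }
private:
    int count_ = 0;
};

// Each test creates its own Counter, so neither test depends on whether
// the other one has already run.
bool testCounterStartsAtZero() {
    Counter counter;
    return counter.count() == 0;
}

bool testCounterIncrementAddsOne() {
    Counter counter;
    counter.increment();
    return counter.count() == 1;
}
```

Had both tests shared one global Counter, testCounterStartsAtZero would fail whenever it ran after testCounterIncrementAddsOne.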
Structure of a Good Unit Test - An Example
A good way to structure a unit test is to remember the three As (A-A-A): Arrange, Act, Assert. These are described below:
Arrange:
- This is the setup step for the unit test. Work out which settings and objects the test needs and set them up before calling the code you want to test.
Act (action):
- This is the call to the function you are testing. A good unit test should generally have only one act per test case. An example of an "act" is calling a function that reads a value from the config and multiplies it by another parameter.
Assert:
- This validates that the action was performed correctly. A unit test that doesn't assert anything is incomplete: it may add code coverage but is not very meaningful. In general, avoid multiple assertions with different goals in one unit test.
Example of A-A-A for a Unit Test
#include <iostream>

// Interface for the config dependency, so a test can substitute its own implementation.
class ConfigIf {
public:
    virtual ~ConfigIf() = default;
    virtual int getValue() = 0;
};

// Production implementation of the config.
class Config : public ConfigIf {
public:
    int getValue() override {
        return 7;
    }
};

// Hand-written mock whose return value is controlled by the test.
class MockConfig : public ConfigIf {
public:
    int getValue() override {
        return val_;
    }
    void overrideVal(int val) {
        val_ = val;
    }
private:
    int val_ = 5;
};

// Function under test: receives its config dependency through the interface.
int multiplyBasedOnConfig(int num, ConfigIf& config) {
    return num * config.getValue();
}

void testMultiplyBasedOnConfig() {
    // Arrange: set up a mock config object with a specific value
    MockConfig mockConfig;
    mockConfig.overrideVal(10);
    int expectedAnswer = 50;

    // Act: call multiplyBasedOnConfig to get a value
    int answer = multiplyBasedOnConfig(5, mockConfig);

    // Assert: check that the answer matches the expected answer
    // A real test would use a framework's assertions; we cover that later in the guide.
    if (answer == expectedAnswer) {
        std::cout << "Test passed\n";
    } else {
        std::cout << "Test failed\n";
    }
}

int main() {
    testMultiplyBasedOnConfig();
}
In the above code:
- The example follows the arrange-act-assert pattern: it sets up a config object to control the output of the function and then asserts on that output.
- The code uses the dependency injection principle: ConfigIf defines the interface for the config, which is a dependency of the function multiplyBasedOnConfig. Injecting dependencies through an interface allows a unit test to substitute mock dependencies.
- The goal of the example is to explain how mocking works, so it intentionally relies only on the standard C++ library instead of a framework like GMock. In the next section, we will rewrite the test using the GMock and GTest frameworks to see how a test framework helps.
Unit Test Using Google Test and Google Mock Framework
In general, a unit test framework like Google Test and Google Mock makes writing unit tests easier by providing assertion tools, test case management, and easy mocking of dependencies. In short, you do not need a test framework to write unit tests, but it makes the job a lot easier. We wrote the unit test in the previous section without a test framework; now we will rewrite it using Google Test and Google Mock.
Example of the Above Test Using the GMock Framework
// Header for Google Mock
#include <gmock/gmock.h>

// Interface for the config dependency.
class ConfigIf {
public:
    virtual ~ConfigIf() = default;
    virtual int getValue() = 0;
};

class MockConfig : public ConfigIf {
public:
    // MOCK_METHOD generates the mock implementation of getValue(),
    // so we do not have to write it by hand.
    MOCK_METHOD(int, getValue, (), (override));
};

// Function under test: receives its config dependency through the interface.
int multiplyBasedOnConfig(int num, ConfigIf& config) {
    return num * config.getValue();
}

TEST(ExampleTest, multiplyBasedOnConfigTest) {
    // Arrange: set up a mock config object with a specific value
    MockConfig mockConfig;
    // EXPECT_CALL sets an expectation on the mocked method:
    // getValue() is expected to be called once and returns 7.
    EXPECT_CALL(mockConfig, getValue()).WillOnce(::testing::Return(7));

    // Act: call multiplyBasedOnConfig to get a value
    auto value = multiplyBasedOnConfig(5, mockConfig);

    // Assert: check that the answer matches the expected answer
    ASSERT_EQ(value, 35);
}

int main(int argc, char** argv) {
    // The following line must be executed to initialize Google Mock
    // (and Google Test) before running the tests.
    ::testing::InitGoogleMock(&argc, argv);
    return RUN_ALL_TESTS();
}
In the above code:
- We included the GMock header, gmock/gmock.h, which provides macros like MOCK_METHOD and EXPECT_CALL that make it easier to write unit tests.
- We rewrote the test using the Google Test framework. Writing mock methods and unit tests is easy when the underlying code is well structured around dependency injection.
- The unit test still follows the A-A-A principle but uses EXPECT_CALL from the Google Mock framework to control the return value of the mock method.
- The main function provides the code needed to initialize GMock and run the tests.
Conclusion
To recap, unit tests make code more modular, improve reliability and developer productivity, and encourage good design principles like dependency injection. Moreover, testing and mocking frameworks like Google Test and Google Mock make unit tests easier to write and manage.