When I joined Shutterstock, I was impressed by the amount of automated test coverage the company had. Almost every piece of functionality on the site had test coverage in the form of Selenium end-to-end tests. Shutterstock had a development workflow in place through Jenkins that would block a deployment to production if the Selenium tests failed. I liked that; it meant no one could release anything into production unless all the tests passed.
But soon after, I realized that a company that had been releasing multiple times a day was now blocked from releasing for days at a time, mainly because of failing Selenium tests. Most often, the tests failed not because the product was broken, but because the tests themselves were fragile.
At our core, we had a QA organization that had not scaled with the rest of the organization. While they had the skills to automate everything, they lacked the core skills necessary to build a scalable test framework. Because of this gap, they were unable to influence the rest of the organization to think of quality as something that was owned by all, rather than just the QA team. To close this gap, we had to rethink our approach to QA as a whole.
I wanted to accomplish two goals: First, to rebuild Shutterstock’s test infrastructure/frameworks to be more stable, and second, to change the engineering culture at Shutterstock to be one where quality was not just owned by the Test Engineering team, but by everyone.
We knew our largest problem was fragile tests, so we built a tool called Sagacity to record each test’s pass/fail data. We had all our tests push data into Sagacity each time they ran as a part of our Jenkins workflow. We then built a website on top of this database to make it easy to mine the data. We were now able to monitor pass rates for jobs, pass rates for individual tests, the most commonly occurring failure messages, the longest-running tests, and more. Armed with this data, we could hold ourselves, and others, more accountable. One of the core teams most affected by failing tests realized that their usual pass rate was just 20%. (Imagine how often the software factory came to a halt because of this roadblock.) Using Sagacity, they were able to quickly isolate the tests that had the lowest pass rates and see the common failure messages in those tests. The team then made simple fixes to the test scripts to improve their reliability.
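To make the kind of reporting described above more concrete, here is a minimal Python sketch. It is not Sagacity's actual code: the `RunRecord` fields, the function names, and the sample test names and failure messages are all hypothetical, and a real tool would read these records from a database rather than a list in memory.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunRecord:
    """One execution of one test (hypothetical schema)."""
    test_name: str
    passed: bool
    duration_s: float
    failure_message: Optional[str] = None


def pass_rates(runs):
    """Return {test_name: fraction of runs that passed}."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for run in runs:
        totals[run.test_name] += 1
        if run.passed:
            passes[run.test_name] += 1
    return {name: passes[name] / totals[name] for name in totals}


def lowest_pass_rates(runs, n=3):
    """Return the n tests with the lowest pass rate, worst first."""
    return sorted(pass_rates(runs).items(), key=lambda kv: kv[1])[:n]


def common_failures(runs, n=3):
    """Return the n most frequent failure messages with their counts."""
    messages = (
        run.failure_message
        for run in runs
        if not run.passed and run.failure_message
    )
    return Counter(messages).most_common(n)


def slowest_tests(runs, n=3):
    """Return the n tests with the highest average duration, slowest first."""
    durations = defaultdict(list)
    for run in runs:
        durations[run.test_name].append(run.duration_s)
    averages = {name: sum(v) / len(v) for name, v in durations.items()}
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Made-up sample data, only to show how the helpers are called.
sample_runs = [
    RunRecord("test_login", True, 12.0),
    RunRecord("test_login", False, 30.0, "TimeoutException: element not found"),
    RunRecord("test_search", True, 8.0),
    RunRecord("test_search", True, 9.0),
    RunRecord("test_checkout", False, 45.0, "TimeoutException: element not found"),
    RunRecord("test_checkout", False, 50.0, "StaleElementReferenceException"),
]

print(lowest_pass_rates(sample_runs, n=2))
print(common_failures(sample_runs, n=1))
print(slowest_tests(sample_runs, n=1))
```

Sorting tests by pass rate and grouping failures by message is roughly the view that lets a team see which few tests, and which recurring error, account for most of the red builds.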
The launch of Sagacity, coupled with the right set of test engineers championing our new testing culture across their teams, led to an almost immediate uptick in the weekly pass rates for our automation jobs, from 20% to 80% in some cases.
Sagacity allowed for a lot of quick wins because it gave developers access to actionable data on their test suite. But we still had one lingering problem: We were too dependent on a large number of end-to-end Selenium-based UI tests as the only quality gate to release. We had a larger problem to solve, and that was how to build quality checks further upstream into the development process.
At Shutterstock, for an end-to-end UI test job to run and give a result, a developer had to make a code change, then merge it to a branch, which kicked off a build and deploy followed by acceptance tests. Given the large number of tests, the automated test run could take anywhere between 30 and 60 minutes before the developer got any signal on whether he or she had inadvertently introduced a bug into the system. Multiply this by the 100+ developers at Shutterstock and you realize we were spending a lot of time waiting for tests to run. And if the tests failed, the developer had to go back and repeat the whole process.
We knew giving developers more immediate feedback was a key step in speeding up our software factory. To do that, we needed to build quality into each step of the development workflow rather than as a step at the end. To accomplish that, we did the following:
We integrated SonarQube with our development process so that each build would push unit test coverage data onto SonarQube. SonarQube helped us get a dashboard of unit test coverage across all the repos at Shutterstock. We put this dashboard up on the monitors in different areas to make everyone aware of their team’s unit test coverage. We saw teams that had less than 10% coverage respond quickly to bring their coverage numbers up so that they could compete with other teams.
We introduced the concept of mid-level tests, which ran on a sandboxed version of our service/application with all the external dependencies mocked out. Using technologies such as Docker, we were able to run these tests without requiring a deploy. In addition, we could run the same Selenium-based UI tests on the sandboxed version of the app, but in one-tenth of the time. As such, developers could kick off mid-level tests and get pass/fail results back almost immediately.
We used Drone and Docker to build a number of quality checks directly into every pull request. As soon as a pull request was created, the developer and the reviewer got immediate feedback on code coverage numbers, results of mid-level tests, and results of unit tests. We armed our developers with data about the quality of their code before any of their code would be merged.
We reintroduced people across the engineering organization to the test pyramid. As a part of each feature, we embraced a healthy discussion about how it should be tested, where the right set of tests should live, and anything else that came to mind.
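For readers who have not written a mid-level test before, the following is a minimal Python sketch of the idea of testing a service with its external dependencies mocked out. It is an illustration under stated assumptions, not Shutterstock's framework: `SearchService`, its `index_client` dependency, and the `query` method are hypothetical names, and the Docker sandboxing described above is not shown.

```python
import unittest
from unittest.mock import Mock


class SearchService:
    """Hypothetical application service that depends on an external index."""

    def __init__(self, index_client):
        self.index_client = index_client

    def search(self, query):
        # Normalize the query before calling the external dependency.
        query = query.strip().lower()
        if not query:
            return []
        hits = self.index_client.query(query)
        # Highest-scoring hits first.
        return sorted(hits, key=lambda hit: hit["score"], reverse=True)


class SearchServiceMidLevelTest(unittest.TestCase):
    def setUp(self):
        # The external dependency is replaced by a mock, so no network
        # call and no deployed environment is needed.
        self.index_client = Mock()
        self.service = SearchService(self.index_client)

    def test_results_are_ranked_by_score(self):
        self.index_client.query.return_value = [
            {"id": 1, "score": 0.2},
            {"id": 2, "score": 0.9},
        ]
        results = self.service.search("  Sunset ")
        self.assertEqual([hit["id"] for hit in results], [2, 1])
        self.index_client.query.assert_called_once_with("sunset")

    def test_blank_query_skips_external_call(self):
        self.assertEqual(self.service.search("   "), [])
        self.index_client.query.assert_not_called()


def run_suite():
    """Run the mid-level tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        SearchServiceMidLevelTest
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` executes both tests in process. Because nothing outside the process is contacted, a test like this sits in the middle of the test pyramid: it exercises more real logic than a unit test of a single function, but it returns a result far sooner than an end-to-end UI run.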
No matter how your company structures cadences for code releases, you need to emphasize quality in every step of your development process. Companies spend large amounts of time working out how their software will be architected, but not enough time thinking about how quality will be built into the software; instead, they blindly automate everything. Your development workflow and how you build quality into each step of it is crucial in determining how fast you can iterate on your product.
At Shutterstock, the ability to release daily improvements to our product is a key differentiator between us and our competitors. We are always looking to improve our approach to software releases. By asking ourselves some difficult questions and assessing our behavior, we improved our workflow and built an engineering culture where everyone owns quality. Since making this change, we have done more than simply speed things up: We’ve seen firsthand how new ideas and innovations can radically change company culture for the better.