CircleCI and Rainforest QA on Continuous Deployment
Advice on performance testing, branching, and metrics from two experts in continuous deployment.
If you tuned in for our “Getting to Continuous Deployment” Webinar, you already know that we had more questions than we had time to answer! Fortunately, both Kevin Bell of CircleCI and Fred Stevens-Smith of our own Rainforest QA had time to sit down and answer a few more questions that our webinar viewers sent in.
Read on to find out more of what Kevin and Fred have to say about how organizations can get to continuous deployment, and what advice they have on everything from integration testing to QA metrics.
Not sure if you are going to talk about it, but the most challenging thing about CD is database schema changes. Any insight?
Kevin: Yes! Rainforest QA actually did a great blog post about exactly this. This is a good example of an area where doing CD does involve some engineering overhead, but that overhead is worth it for the overall productivity gain of constant deployment.
What tools do you recommend for benchmark testing?
Kevin: I’m assuming this means performance testing. I’m afraid this will vary greatly from case to case. In some cases, time myprogram > /dev/null might be enough, but in others you may need an HTTP API client or test harness to bang on some web service. I would emphasize controlling as many variables as possible and keeping the benchmark realistic.
For example, at CircleCI we run new build container images against a bunch of open source repos before deploying them. I would also definitely recommend production performance monitoring, as unexpected things will happen in production.
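As a rough illustration of the simple end of that spectrum, here is a minimal timing harness in Python. It is only a sketch: the workload function is a hypothetical stand-in for whatever code is being measured, and the harness repeats the run several times so a single noisy measurement does not dominate.

```python
import statistics
import time


def benchmark(func, repeats=5):
    """Run func several times and return the wall-clock timings in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def workload():
    # Hypothetical stand-in for the code under test.
    sum(i * i for i in range(100_000))


timings = benchmark(workload)
print(f"median: {statistics.median(timings):.4f}s, worst: {max(timings):.4f}s")
```

Reporting the median alongside the worst case is one way to keep an outlier from hiding or exaggerating a regression.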
How often do you recommend running integration test suites?
Fred: As often as possible. Ideally every commit. If this is unrealistic then at least on every pull request. The more often the better.
How long should an integration test suite run? (Obviously as quickly as possible, but how fast are yours?)
Fred: Our entire integration suite (Rainforest + cucumber) takes about 25 minutes.
Do you have a full environment with all microservices for your integration tests?
Fred: Yes. We don’t use microservices, but we do use external APIs (payment processing etc), which I assume are equivalent for the purpose of your question. We mock the external APIs for our unit tests, and use sandboxed versions where possible.
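To make the mocking half of that concrete, here is a small sketch using Python's standard unittest.mock. This is not Rainforest QA's actual code: PaymentClient and checkout are hypothetical names invented for the example.

```python
from unittest import mock


class PaymentClient:
    """Hypothetical wrapper around an external payment API."""

    def charge(self, amount_cents):
        raise RuntimeError("would call the real payment provider")


def checkout(client, amount_cents):
    """Charge the customer and report whether the payment went through."""
    response = client.charge(amount_cents)
    return response["status"] == "succeeded"


def test_checkout_succeeds():
    # The autospec mock mirrors PaymentClient's interface without any network call.
    client = mock.create_autospec(PaymentClient, instance=True)
    client.charge.return_value = {"status": "succeeded"}
    assert checkout(client, 1999) is True
    client.charge.assert_called_once_with(1999)


test_checkout_succeeds()
```

In an integration suite, the same checkout function would instead receive a client pointed at the provider's sandbox.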
For Kevin: how does CircleCI perform its functional integration tests to fit in with its CI/CD process? Manually? Automated? Ad-hoc?
Kevin: CircleCI provides an environment for you to run whatever tests you want. There are a lot of added niceties–for example a lot of common tools are pre-installed and there are special directories to store dependency caches, build artifacts, and test output for later access–but it is ultimately just a place to run your own tests. In practice these are usually unit tests (run with a tool like junit, nosetests, rspec, etc) or browser tests (using something like selenium webdriver, cucumber, capybara, etc).
CD Branching Strategy
Do you think there is a particular Git repository/branching strategy that is best suited for CD?
Fred: Yes absolutely. Feature branch -> Develop (backed by staging) -> Master (backed by production). Movement from one branch to another requires a pull request.
What do you recommend for mobile monitoring? We use New Relic but our mobile developers hate it.
Kevin: I’m not personally aware of mobile monitoring tools other than New Relic and AppDynamics. Depending on your needs, you might also look into more beta-testing related tools like HockeyApp, TestFlight, Fabric, or TestFairy, though I don’t think those offer any production monitoring.
I am looking to implement TDD and CI and increase testing in the development environment. With all this said, are there best practices to show how we moved the needle? Or what types of metrics do we capture to show that? (E.g. defect detection times)
Kevin: This is another question that is hard to answer in broad terms, but here are just a handful of thoughts:
- Some measure of velocity (e.g. scrum-style points/sprint or number of features/week)
- Defect rate (e.g. number of bug reports/month)
- Frequency of deployments: This metric has limits–it probably isn’t much better to deploy once per minute than once every 10 minutes. However, deploying once/day or more is a good sign that the team can ship easily and quickly and respond promptly to unexpected issues.
- Risk/stability of deployments (e.g. how often do deployments cause downtime or generate urgent issues)
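As one example of turning the list above into a number, deployment frequency can be computed from a list of deployment timestamps. The sketch below uses made-up timestamps; where they come from in practice (a deploy log, a CI API) is left as an assumption.

```python
from datetime import datetime

# Made-up deployment timestamps; in practice these might come from a deploy log.
DEPLOY_TIMES = [
    "2015-06-01T09:15:00",
    "2015-06-01T14:40:00",
    "2015-06-02T11:05:00",
    "2015-06-04T16:30:00",
]


def deploys_per_day(timestamps):
    """Average number of deployments per calendar day over the observed span."""
    days = sorted(datetime.fromisoformat(ts).date() for ts in timestamps)
    span_days = (days[-1] - days[0]).days + 1
    return len(days) / span_days


print(f"{deploys_per_day(DEPLOY_TIMES):.2f} deploys/day")
```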
I’ll also add a note about one of the most popular and misunderstood test-related metrics: code coverage. Code coverage is definitely a useful tool, but remember the best it can do is give you bad news. That is, “50% coverage” means there is definitely a lot of code that isn’t tested, but “95% coverage” does not mean “everything’s fine, all cases covered!” It is entirely possible to have 100% coverage and many possible program states untested.
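A tiny, contrived Python example shows how this can happen. The function and test are hypothetical; the point is only that one test can execute every line while still missing an input that breaks the code.

```python
def average(values):
    """Return the mean of values (breaks on an empty list)."""
    return sum(values) / len(values)


def test_average():
    # This single test executes every line of average(): 100% line coverage.
    assert average([2, 4, 6]) == 4


test_average()

# The empty-list state was never exercised, so the bug went unnoticed.
try:
    average([])
except ZeroDivisionError:
    print("average([]) raises ZeroDivisionError despite full coverage")
```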
How do you measure feature quality / bug incidence in your team?
Fred: First a caveat: don’t read into this too much. Lots of bugs found by testing can mean you’re testing more rigorously or your app is more buggy or both. So penalizing for bugs is likely to incentivize bad behavior. It’s actually great that you find bugs! Same goes for bugs in production. As your product gets more mature and customers use it more and more, the bugs within unexplored areas of your product start appearing. Again, this isn’t anyone’s ‘fault’.
What can be useful though is measuring test failures over time, especially broken down by browser, OS etc. This gives you nice insight into quality trends which are more meaningful than just bugs. Also, monitoring exceptions in production is very helpful to understand which parts of your app are unstable and have trouble.
QA Strategies & Technical Debt
How can you control inherited bugs in upcoming versions?
Kevin: If you mean “how do I make sure known bugs don’t come back?” then I’d say to make sure to create test cases for those bugs so that developers learn quickly if they come back.
If you mean “how do I prioritize fixing bugs vs. developing new features?”, then I’m afraid that’s an age-old product prioritization question that has to be handled on a case-by-case basis.
How do you handle cases where the QA tests are blocking other work because of the amount of time it is taking to resolve an issue?
Fred: If we’re talking mission-critical bugs in backend code in production then your entire applicable dev team should be focused on fixing that. However, your other teams should continue working on new stuff, and assuming that you’re doing CD this should not be a problem.
How do you recommend catching up on missing tests as a result of tests not being an important part of the first few months of the company, when these features already work anyway, don’t often change, and there are business deadlines to meet unrelated to these features?
Fred: I’m assuming that you’re asking about the first few months of existence of the company. I think there’s an idealistic response and a pragmatic response. You already know what the idealistic response is, so I won’t bore you with that. We actually faced a pretty similar situation to this. In practice you’ll want to have broad integration / UI test coverage on your key flows so you can have confidence that you’re not breaking things as you release changes. Then over time what we did is backfill unit test coverage in areas of the app as we were refactoring.
What are some strategies for speeding up a several-thousand-test suite without spending exorbitant amounts of money on more parallelization and containers? :)
Kevin: This process is very similar to any performance optimization problem. First, sort all your tests by execution time and see if there’s any “low hanging fruit” among the slowest tests. Other times, there may be large numbers of tests all suffering from the same performance issue. For example, one CircleCI user had dozens of tests that all took 5 seconds because a misconfiguration caused DNS lookups to take 5 seconds.
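The first step, sorting tests by execution time, might look like the following Python sketch. It assumes the test runner can emit a JUnit-style XML report with a time attribute on each testcase element; the report contents here are made up.

```python
import xml.etree.ElementTree as ET

# Made-up JUnit-style report; a real one would be read from a file.
SAMPLE_REPORT = """
<testsuite name="example">
  <testcase classname="tests.test_api" name="test_login" time="5.02"/>
  <testcase classname="tests.test_api" name="test_signup" time="5.01"/>
  <testcase classname="tests.test_utils" name="test_parse" time="0.03"/>
  <testcase classname="tests.test_views" name="test_render" time="0.40"/>
</testsuite>
"""


def slowest_tests(report_xml, limit=3):
    """Return (seconds, test name) pairs for the slowest tests, slowest first."""
    root = ET.fromstring(report_xml.strip())
    cases = [
        (float(case.get("time", 0)), f'{case.get("classname")}.{case.get("name")}')
        for case in root.iter("testcase")
    ]
    return sorted(cases, reverse=True)[:limit]


for seconds, name in slowest_tests(SAMPLE_REPORT):
    print(f"{seconds:6.2f}s  {name}")
```

A cluster of tests with nearly identical durations, like the two 5-second ones here, is a hint that they may share a single underlying cause.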
Also, make sure that you’re using the right type of test for the job. For example, it shouldn’t take a black-box browser test to make sure that a form controller throws an error for a date in the future (that can be done with a unit test). Unfortunately there’s no one-size-fits-all solution to make tests faster. Ultimately you will have to dig in and see what’s taking time in your own test suite.
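For the future-date example mentioned above, a unit-level check could be as small as the Python sketch below. The validate_date function is a hypothetical stand-in for the form controller's validation logic; passing today in as an argument keeps the test deterministic.

```python
from datetime import date


def validate_date(value, today):
    """Reject dates later than today (hypothetical form-controller rule)."""
    if value > today:
        raise ValueError("date cannot be in the future")
    return value


def test_future_date_is_rejected():
    today = date(2015, 6, 1)
    try:
        validate_date(date(2015, 6, 2), today)
    except ValueError:
        return
    raise AssertionError("expected a ValueError for a future date")


test_future_date_is_rejected()
```

A check like this runs in well under a millisecond, whereas the equivalent browser test has to start a browser, load the page, and submit the form.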
Did you miss the live webinar with Kevin and Fred? Don’t worry – you can download the on-demand recording and catch what you missed about getting to continuous deployment!
Published at DZone with permission of Ashley Dotterweich, DZone MVB. See the original article here.