DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workkloads.

Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Building Enterprise-Ready Landing Zones: Beyond the Initial Setup
  • Cookies Revisited: A Networking Solution for Third-Party Cookies
  • It’s Not About Control — It’s About Collaboration Between Architecture and Security
  • Hybrid Cloud vs Multi-Cloud: Choosing the Right Strategy for AI Scalability and Security

Trending

  • Mastering Fluent Bit: Installing and Configuring Fluent Bit on Kubernetes (Part 3)
  • Optimize Deployment Pipelines for Speed, Security and Seamless Automation
  • Problems With Angular Migration
  • My LLM Journey as a Software Engineer Exploring a New Domain
  1. DZone
  2. Software Design and Architecture
  3. Security
  4. Can Static Analysis replace Code Reviews?

Can Static Analysis replace Code Reviews?

By 
Jim Bird user avatar
Jim Bird
·
Sep. 29, 14 · Interview
Likes (0)
Comment
Save
Tweet
Share
14.5K Views

Join the DZone community and get the full member experience.

Join For Free

In my last post, I explained how to do code reviews properly. I recommended taking advantage of static analysis tools like Findbugs, PMD, Klocwork or Fortify to check for common mistakes and bad code before passing the code on to a reviewer, to make the reviewer’s job easier and reviews more effective.

Some readers asked whether static analysis tools can be used instead of manual code reviews. Manual code reviews add delays and costs to development, while static analysis tools keep getting better, faster, and more accurate. So can you automate code reviews, in the same way that many teams automate functional testing? Do you need to do manual reviews too, or can you rely on technology to do the job for you?

Let’s start by understanding what static analysis bug checking tools are good at, and what they aren’t.

What static analysis tools can do – and what they can’t do

In this article, Paul Anderson at GrammaTech does a good job of explaining how static analysis bug finding works, the trade-offs between recall (finding all of the real problems), precision (minimizing false positives) and speed, and the practical limitations of using static analysis tools for finding bugs.

Static analysis tools are very good at catching certain kinds of mistakes, including memory corruption and buffer overflows (for C/C++), memory leaks, illegal and unsafe operations, null pointers, infinite loops, incomplete code, redundant code and dead code.

A static analysis tool knows if you are calling a library incorrectly (as long as it recognizes the function), if you are using the language incorrectly (things that a compiler could find but doesn’t) or inconsistently (indicating that the programmer may have misunderstood something).

And static analysis tools can identify code with maintainability problems, code that doesn't follow good practice or standards, is complex or badly structured and a good candidate for refactoring.

But these tools can’t tell you when you have got the requirements wrong, or when you have forgotten something or missed something important – because the tool doesn't know what the code is supposed to do. A tool can find common off-by-one mistakes and some endless loops, but it won’t catch application logic mistakes like sorting in descending order instead of ascending order, or dividing when you meant to multiply, referring to buyer when it should have been seller, or lessee instead of lessor. These are mistakes that aren't going to be caught in unit testing either, since the same person who wrote the code wrote the tests, and will make the same mistakes.

Tools can’t find missing functions or unimplemented features or checks that should have been made but weren't. They can’t find mistakes or holes in workflows. Or oversights in auditing or logging. Or debugging code left in by accident.

Static analysis tools may be able to find some backdoors or trapdoors – simple ones at least. And they might find some concurrency problems – deadlocks, races and mistakes or inconsistencies in locking. But they will miss a lot of them too.

Static analysis tools like Findbugs can do security checks for you: unsafe calls and operations, use of weak encryption algorithms and weak random numbers, using hard-coded passwords, and at least some cases of XSS, CSRF, and simple SQL injection. More advanced commercial tools that do inter-procedural and data flow analysis (looking at the sources, sinks and paths between) can find other bugs including injection problems that are difficult and time-consuming to trace by hand.

But a tool can’t tell you that you forgot to encrypt an important piece of data, or that you shouldn't be storing some data in the first place. It can’t find logic bugs in critical security features, if sensitive information could be leaked, when you got an access control check wrong, or if the code could fail open instead of closed.

And using one static analysis tool on its own to check code may not be enough. Evaluations of static analysis tools, such as NIST's SAMATE project (a series of comparative studies, where many tools are run against the same code), show almost no overlap between the problems found by different tools (outside of a few common areas like buffer errors) even when the tools are supposed to be doing the same kinds of checks. Which means that to get the most out of static analysis, you will need to run two or more tools against the same code (which is what SonarQube, for example, which integrates its own static analysis results with other tools, including popular free tools, does for you). If you’re paying for commercial tools, this could get very expensive fast.

Tools vs. Manual Reviews

Tools can find cases of bad coding or bad typing – but not bad thinking. These are problems that you will have to find through manual reviews.

A 2005 study Comparing Bug Finding Tools with Reviews and Tests used Open Source bug finding tools (including Findbugs and PMD) on 5 different code bases, comparing what the tools found to what was found through code reviews and functional testing. Static analysis tools found only a small subset of the bugs found in manual reviews, although the tools were more consistent – manual reviewers missed a few cases that the tools picked up.

Just like manual reviews, the tools found more problems with maintainability than real defects (this is partly because one of the tools evaluated – PMD – focuses on code structure and best practices). Testing (black box – including equivalence and boundary testing – and white box functional testing and unit testing) found fewer bugs than reviews. But different bugs. There was no overlap at all between bugs found in testing and the bugs found by the static analysis tools.

Finding problems that could happen - or do happen

Static analysis tools are good at finding problems that “could happen”, but not necessarily problems that “do happen”.

Researchers at Colorado State University ran static analysis tools against several releases of different Open Source projects, and compared what the tools found against the changes and fixes that developers actually made over a period of a few years – to see whether the tools could correctly predict the fixes that needed to be made and what code needed to be refactored.

The tools reported hundreds of problems in the code, but found very few of the serious problems that developers ended up fixing. One simple tool (Jlint) did not find anything that was actually fixed or cleaned up by developers. Of 112 serious bugs that were fixed in one project, only 3 were also found by static analysis tools. In another project, only 4 of 136 bugs that were actually reported and fixed were found by the tools. Many of the bugs that developers did fix were problems like null pointers and incorrect string operations – problems that static analysis tools should be good at catching, but didn’t.

The tools did a much better job of predicting what code should be refactored: developers ended up refactoring and cleaning up more than 70% of the code structure and code clarity issues that the tools reported (PMD, a free code checking tool, was especially good for this).

Ericsson evaluated different commercial static analysis tools against large, well-tested, mature applications. On one C application, a commercial tool found 40 defects – nothing that could cause a crash, but still problems that needed to be fixed. On another large C code base, 1% of the tool’s findings turned out to be bugs serious enough to fix. On the third project, they ran 2 commercial tools against an old version of a C system with known memory leaks. One tool found 32 bugs, another 16: only 3 of the bugs were found by both tools. Surprisingly, neither tool found the already known memory leaks – all of the bugs found were new ones. And on a Java system with known bugs they tried 3 different tools. None of the tools found any of the known bugs, but one of the tools found 19 new bugs that the team agreed to fix.

Ericsson’s experience is that static analysis tools find bugs that are extremely difficult to find otherwise. But it’s rare to find stop-the-world bugs – especially in production code – using static analysis.

This is backed up by another study on the use of static analysis (Findbugs) at Google and on the Sun JDK 1.6.0. Using the tool, engineers found a lot of bugs that were real, but not worth the cost of fixing: deliberate errors, masked errors, infeasible situations, code that was already doomed, errors in test code or logging code, errors in old code that was “going away soon” or other relatively unimportant cases. Only around 10% of medium and high priority correctness errors found by the tool were real bugs that absolutely needed to be fixed.

The Case for Security

So far we've mostly looked at static analysis checking for run-time correctness and general code quality, not security.

Although security builds on code quality – vulnerabilities are just bugs that hackers look for and exploit – checking code for correctness and clarity isn’t enough for a secure app. A lot of investment in static analysis technology over the past 5-10 years has been in finding security problems in code, such as common problems listed in OWASP’s Top 10 or the SANS/CWE Top 25 Most Dangerous Software Errors.

A couple of studies have looked at the effectiveness of static analysis tools compared to manual reviews in finding security vulnerabilities. The first study was on a large application that had 15 known security vulnerabilities found through a structured manual assessment done by security experts. Two different commercial static analysis tools were run across the code. The tools together found less than half of the known security bugs – only the simplest ones, the bugs that didn't require a deep understanding of the code or the design.

And of course the tools reported thousands of other issues that needed to be reviewed and qualified or thrown away as false positives. These other issues including some run-time correctness problems, null pointers and resource leaks, and code quality findings (dead code, unused variables), but no other real security vulnerabilities beyond those already found by the manual security review.

But this assumes that you have a security expert around to review the code. To find security vulnerabilities, a reviewer needs to understand the code (the language and the frameworks), and they also need to understand what kind of security problems to look for.

Another study shows how difficult this is. Thirty developers were hired to do independent security code reviews of a small web app (some security experts, others web developers). They were not allowed to use static analysis tools. The app had 6 known vulnerabilities. 20% of the reviewers did not find any of the known bugs. None of the reviewers found all of the known bugs, although several found a new XSS vulnerability that the researchers hadn’t known about. On average, 10 reviewers would have had only an 80% chance of finding all of the security bugs.

And, not Or

Static analysis tools are especially useful for developers working in unsafe languages like C/C++ (where there is a wide choice of tools to find common mistakes) or dynamically typed scripting languages like Javascript or PHP (where unfortunately the tools aren't that good), and for teams starting off learning a new language and framework. Using static analysis is (or should be) a requirement in highly regulated, safety critical environments like medical devices and avionics. And until more developers get more training and understand more about how to write secure software, we will all need to lean on static analysis (and dynamic analysis) security testing tools to catch vulnerabilities.

But static analysis isn't a substitute for code reviews. Yes, code reviews take extra time and add costs to development, even if you are smart about how you do them – and being smart includes running static analysis checks before you do reviews. If you want to move fast and write good, high-quality and secure code, you still have to do reviews.You can’t rely on static analysis alone.

security

Published at DZone with permission of Jim Bird, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Building Enterprise-Ready Landing Zones: Beyond the Initial Setup
  • Cookies Revisited: A Networking Solution for Third-Party Cookies
  • It’s Not About Control — It’s About Collaboration Between Architecture and Security
  • Hybrid Cloud vs Multi-Cloud: Choosing the Right Strategy for AI Scalability and Security

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!