
How Static Analysis Can Help Code Comprehension

You might expect most of a developer's time to be taken up by writing or modifying code, fixing bugs, and similar tasks. However, the surprising truth is that developers spend most of their time trying to understand code.

Test your skill

Even a very small code snippet may take longer to understand than we might assume, and studying it may still lead to a misconception. Complete understanding of a large body of code is practically impossible, yet without it developers inevitably make mistakes. To demonstrate this, try to solve the following program comprehension task and measure the time you need. The code is tricky and unrealistic, but it shows that even very short code can be difficult to understand. Take your time to work out the value printed in the last line of the code below. The solution and our developers' results are provided in the final section.

class A {

	B b = null;

	public void f1(B b){
		b.x = this.b.y;
	}	
}

class B {

	int x = 1;
	int y = 9;
}

public class Test {

	void example() {
		A a1 = new A();		
		a1.b = new B();
		a1.b.x = 3;		
		B b = new B();
		b.x = 29;		
		A a2 = new A();		
		a2.b = new B();			
		a2.b.x = 77;		
		b = a2.b;		
		a1.f1(b);
		a1.b = a2.b;
	
		System.out.println(a1.b.x); //the value of a1.b.x?
	}
...

Why is code comprehension so difficult?

Whether you are writing new code or modifying old code, you ideally need to know all other code that impacts or depends on the code you are writing. However, code that is new today is a little bit outdated tomorrow, and even older a week or a month later. When you modify a section of code to introduce new features or fix bugs, you need to ensure that related entities in the code are updated to be consistent with the changes. After the modification, the changed part may no longer fit with other code fragments, either because it no longer provides what those fragments require or because it now requires different interactions from the parts it depends on. Likewise, when new code is inserted, the correct dependencies need to be established with the related code parts. This is not a simple task and requires deep knowledge of those parts.

The first problem is that we don't know which parts of the code are related. Researchers have proposed various approaches to tackle this problem by reducing the number of interconnections between program parts and preventing hidden dependencies. These goals are achieved in part by the information hiding and encapsulation implemented in OO languages. By applying setters and getters, you assign and then access the fields of an object by invoking special methods. However, the key problem remains: whatever way data is propagated in the code, you need to know about it. Whichever methods or languages you use, data will always be propagated throughout the entire code.

This is self-evident, because software should always reflect the real process it attempts to simulate. If that process is sophisticated and contains complex data flows between its parts, the implemented code will contain them too. Therefore, if you set a field and want to use it later on, you should validate that the data is properly assigned and arrives exactly the way you need it. However, if the code is large enough, there may be several ways in which your assumptions can fail:

  • a subsequent setting redefines the required value of that field
  • the invocation of the setter is different from the one expected
  • there is no control flow path from the setter to the getter
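The three failure cases above can be illustrated with a minimal sketch. The classes `Account` and `CappedAccount` and all method names are hypothetical, invented only for this illustration, and the second case reflects one possible reading of "a different setter than expected" (an overriding setter in a subclass).

```java
class SetterGetterDemo {

    // Hypothetical classes, used only for illustration.
    static class Account {
        protected int limit;
        void setLimit(int limit) { this.limit = limit; }
        int getLimit() { return limit; }
    }

    // A subclass whose setter silently caps the value.
    static class CappedAccount extends Account {
        @Override
        void setLimit(int limit) { this.limit = Math.min(limit, 50); }
    }

    // Case 1: a subsequent setting redefines the value before it is read.
    static int redefined() {
        Account account = new Account();
        account.setLimit(100);      // the value we expect to read later
        account.setLimit(250);      // a later call overwrites it
        return account.getLimit();
    }

    // Case 2: the setter that actually runs is not the one we expected.
    static int unexpectedSetter() {
        Account account = new CappedAccount();  // static type Account, runtime type CappedAccount
        account.setLimit(100);                  // dispatches to the capping setter
        return account.getLimit();
    }

    // Case 3: on some executions no path leads from the setter to the getter.
    static int noPath(boolean configure) {
        Account account = new Account();
        if (configure) {
            account.setLimit(100);
        }
        return account.getLimit();  // still the default when configure is false
    }

    public static void main(String[] args) {
        System.out.println(redefined());         // prints 250
        System.out.println(unexpectedSetter());  // prints 50
        System.out.println(noPath(false));       // prints 0
    }
}
```

In all three methods the line `account.setLimit(100)` looks identical, yet the getter never returns 100. In a large code base the setter and getter would typically sit in different files, which is what makes these cases hard to spot.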

There are many other cases in which deep knowledge of the code is required. For example, when calling a function f() as in result = a.b.f(q), we need to know:

  • the possible type(s) of a and b
  • where the objects a and b were created
  • where q got its original value
  • where the value of the variable result ends up being used
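A minimal sketch of why the call site alone does not answer these questions. The types `B`, `Doubler`, `Negator` and `A` and the factory `create` are hypothetical names chosen for this example; they are not taken from any real code base.

```java
class CallSiteDemo {

    // Hypothetical types: B is an interface, so a.b may refer to either implementation.
    interface B { int f(int q); }

    static class Doubler implements B {
        public int f(int q) { return 2 * q; }
    }

    static class Negator implements B {
        public int f(int q) { return -q; }
    }

    static class A { B b; }

    // Where a and a.b are created: the flag decides the runtime type of a.b.
    static A create(boolean doubling) {
        A a = new A();
        a.b = doubling ? new Doubler() : new Negator();
        return a;
    }

    // The call site from the text. Nothing on this line reveals which f() runs.
    static int call(A a, int q) {
        int result = a.b.f(q);
        return result;  // destination of result: whoever called call()
    }

    public static void main(String[] args) {
        int q = Integer.parseInt("21");               // origin of q: parsed input
        System.out.println(call(create(true), q));    // prints 42
        System.out.println(call(create(false), q));   // prints -21
    }
}
```

The same statement `a.b.f(q)` produces different results depending on code that may be far away from it, which is why each of the four questions has to be answered before the call can be understood.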

We have seen that code comprehension is the key to both code development and bug fixing, and that it is not an easy task. But how much of total development time is taken up by code comprehension? Various assumptions and case studies exist, but there are no precise measurements to date. The case studies produce estimates of between 47% and 78%. Of course, this ratio strongly depends on factors such as the programmer, the programming language, the development method and the project being implemented. In reality, I believe the average to be around 50%, a big chunk of the day!

There are several methods to support code comprehension. Most IDEs have a search engine that is frequently used by developers. However, these relatively simple search methods are not enough to fully understand our code. Knowing the type hierarchy, the declarations, where selected fields are assigned or accessed, the static call graph, and so on does not amount to deep and sufficient comprehension of the code.

The reason for this is that deep knowledge of the code requires information about program execution without actually executing the code. The tools that provide such information are known as static code analyzers. However, we often need this information immediately after a code modification, i.e. just in time. The deeper the information a static analyzer provides, the more time is needed to produce it. A system-wide analysis of a large code base that is precise enough to produce a high percentage of the necessary results with few false positives may take minutes or hours to complete. A less precise analysis may take seconds, but the information it yields may be weak or contain too many false alarms. The nature of code comprehension, however, requires instant and precise results.

4D Ariadne for code comprehension

The Ariadne code tracker is a unique tool that computes just-in-time (JIT) information to facilitate the understanding of code. By applying it, developers are able to track code forward and backward along all possible dependent code parts. It becomes possible to follow the chain of dependent values or statements from their origin (where the first value was assigned) to the final destination (where the last dependent value is read) and vice versa. It pinpoints the path, including method calls and entry and exit points, from value assignment to value access. This is similar to a stack trace when the code is executed.
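To give a feel for what following a value back to its origin means, here is a deliberately tiny sketch. It is not Ariadne's algorithm: it assumes a hypothetical straight-line program made only of copy statements `target = source`, with no branches, method calls or aliasing, and the names `traceBack` and `isLiteral` are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class DefUseTraceDemo {

    // A source counts as a literal (an origin) if it starts with a digit.
    static boolean isLiteral(String source) {
        return Character.isDigit(source.charAt(0));
    }

    // Walks backward through the statements, collecting the chain of
    // assignments that the final value of 'variable' depends on.
    // Each statement is a pair {target, source}.
    static List<String> traceBack(String[][] program, String variable) {
        List<String> chain = new ArrayList<>();
        String current = variable;
        for (int i = program.length - 1; i >= 0; i--) {
            String target = program[i][0];
            String source = program[i][1];
            if (!target.equals(current)) {
                continue;               // statement does not define the tracked value
            }
            chain.add(i + ": " + target + " = " + source);
            if (isLiteral(source)) {
                break;                  // reached the origin of the value
            }
            current = source;           // keep tracking the variable we copied from
        }
        return chain;
    }

    public static void main(String[] args) {
        String[][] program = {
            {"a", "5"},   // 0
            {"b", "a"},   // 1
            {"a", "7"},   // 2: redefines a, but only after b was copied from it
            {"c", "b"},   // 3
            {"d", "a"},   // 4
        };
        System.out.println(traceBack(program, "c"));  // prints [3: c = b, 1: b = a, 0: a = 5]
        System.out.println(traceBack(program, "d"));  // prints [4: d = a, 2: a = 7]
    }
}
```

Even in this toy setting the order of statements matters: c ends up with 5 and d with 7, although both ultimately read from a. Real code adds branches, calls and object references, which is where tool support becomes necessary.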

Ariadne also displays the precise calling context among the methods in the code, i.e. the dynamic call graph containing only those method calls that would be executed if the code were run. The entire structure of dependent code parts can also be viewed as graphs at different levels of granularity, i.e. at statement, function and class level. 4D Ariadne also provides metrics on the cost required to modify or fix a method or part of a method. These metrics are calculated based on the entire existing source code, while other methods contain ad-hoc elements and heuristics. More information on Ariadne can be found at: http://4dsoft.eu/solutions/4d-ariadne

Compare your result with our team's

Here are the results of our small case study. The correct value is 9; compare it with your own answer.

developer    time to find solution (min:sec)    result
d1           3:05                               failed
d2           1:24                               failed
d3           2:39                               passed
d4           1:58                               failed
d5           4:21                               failed
d6           3:22                               passed
d7           2:42                               failed
d8           2:40                               passed
d9           0:57                               failed
d10          1:54                               passed
d11          4:13                               passed
d12          2:41                               failed

Average time:              2:40
Average time for solvers:  2:58
Success rate:              5/12 (41.7%)

The conclusion we can draw is that code comprehension is extremely difficult. The majority of our experienced Java developers failed to solve the problem. Solvers required around three minutes to completely understand this short piece of code.
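For readers who want to verify the value by hand, here is the puzzle again with each statement annotated. The logic is unchanged; the only liberties taken are wrapping A and B as nested classes, returning the value from example(), and giving the three B objects the labels B1, B2 and B3 in the comments.

```java
class PuzzleTrace {

    static class B {
        int x = 1;
        int y = 9;
    }

    static class A {
        B b = null;

        // The parameter b shadows the field: the write goes to the argument,
        // the read comes from this object's own field.
        void f1(B b) {
            b.x = this.b.y;
        }
    }

    static int example() {
        A a1 = new A();
        a1.b = new B();     // B1: x = 1, y = 9
        a1.b.x = 3;         // B1.x = 3
        B b = new B();      // B2: x = 1, y = 9
        b.x = 29;           // B2.x = 29 (B2 is never read again)
        A a2 = new A();
        a2.b = new B();     // B3: x = 1, y = 9
        a2.b.x = 77;        // B3.x = 77
        b = a2.b;           // b now refers to B3
        a1.f1(b);           // B3.x = B1.y = 9
        a1.b = a2.b;        // a1.b now refers to B3
        return a1.b.x;      // B3.x
    }

    public static void main(String[] args) {
        System.out.println(example());  // prints 9
    }
}
```

The values 3, 29 and 77 are all distractions: the only write that survives to the end is the one inside f1, which copies y from the object a1.b referred to at the time of the call.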

And finally, how long does it take to solve the problem using Ariadne? Naturally, it depends on the user's level of experience with Ariadne. An experienced user will find the correct value of 9 in around 10 seconds. To get a first impression of using Ariadne, take a look at our short video demonstration of how the tool tackles the same problem.


Resources:

Download Ariadne

Visit Ariadne site

Video

