How to Efficiently Resolve a Bug
Software is never bug-proof.
In an ideal world, software engineers deliver software free of all kinds of bugs and defects to their clients. Unfortunately, such a world does not exist.
Zero-bug software is a myth. Bugs are very common in today's software, and a large part of the software lifecycle consists of resolving them. Why? Because to err is human. It is as simple as that. Of course, some factors increase the number of bugs in software, such as lack of experience or neglect of best practices. However, listing them is not the subject of this article. Here, I assume that the harm is already done, and I present a method for resolving it.
The Method
Resolving a bug efficiently requires two main phases: identifying the bug and correcting it. Programmers often jump straight to the second phase and skip the first, which is a bad reflex to avoid. The first phase is more important than the second: once the root cause of the bug is identified, the correction is usually not very difficult to implement, unless there are serious conceptual or architectural issues.
Phase 1: Identification of the Bug
Before resolving a bug, let's define what a bug is. A bug occurs when the behavior you get is not the behavior you expect. It is the gap between the specification and the implementation. To identify the root cause of a bug, you have to observe it and understand it by following the steps below.
Step 0: Don't Panic!
Even if the bug is in production and it occurs on Friday just before leaving the office, do NOT panic!
Step 1: Replicate the Bug
To be more precise, this step is about replicating the conditions that led to the bug. Sometimes the bug is quick and easy to replicate; sometimes it is not. In the latter case, you must collect as much information as possible about the state of the system when the bug occurred. The logs are your best friend here, as they keep track of what went wrong. If, God forbid, you find yourself without logs, you can also ask the person who detected the bug.
Another good reflex is to look for correlations. For example, check whether the bug is periodic, whether it coincides with another event, or whether it happens only when the software runs under a particular configuration. This first step helps you understand the bug, and it is also useful later when evaluating whether the bug has been corrected: you only have to replicate the conditions that led to the initial bug and verify that it no longer occurs.
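Looking for a periodic pattern in the logs can be sketched in a few lines of Python. This is only an illustration: the log format (`<ISO timestamp> <LEVEL> <message>`), the sample lines, and the function name `errors_per_hour` are all assumptions made up for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines; a real application would read them from its log file.
SAMPLE_LOG = [
    "2024-03-01T02:00:13 ERROR payment job failed",
    "2024-03-01T09:15:42 INFO user logged in",
    "2024-03-02T02:00:09 ERROR payment job failed",
    "2024-03-02T14:30:00 INFO report generated",
    "2024-03-03T02:00:11 ERROR payment job failed",
]

def errors_per_hour(lines):
    """Count ERROR entries by hour of day to reveal periodic failures."""
    counts = Counter()
    for line in lines:
        # Assumed format: timestamp, level, then free-text message.
        timestamp, level, _message = line.split(" ", 2)
        if level == "ERROR":
            counts[datetime.fromisoformat(timestamp).hour] += 1
    return counts

hourly = errors_per_hour(SAMPLE_LOG)
# The most frequent hour comes first in the result.
print(hourly.most_common(1))
```

If the errors cluster around the same hour, as in this made-up sample, a scheduled job or a nightly batch becomes a reasonable first suspect.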
Step 2: Understand the Bug
This step is closely tied to the previous one. The mechanisms you used to replicate the bug can give you a clear understanding of the issue; in some cases, however, they do not.
Replicating a bug does not always mean you have fully understood its root cause. For example, if a defect appears every time you click the submit button of a web application, clicking that button again replicates the issue but does not explain it. Generally, replicating a bug from a graphical interface is not very helpful. You need to dig deeper, and at this point it is very important to avoid assumptions that may lead you away from your target. Base your analysis only on facts.
Step 3: Understand the Expected Behavior
With the previous steps, you get an answer to the following question: why did the bug occur? In this step, you have to answer a different one — what is supposed to happen if the bug does not occur?
To answer this, you have to go back and collect information. You can refer to the documentation, the specification, the functional expert, or the project manager. Unless the developer who wrote that part of the application has left without leaving any documentation, the information should be easy to find. You should also validate the expected behavior with your project manager, both to confirm it and to make sure the bug is enough of a priority (blocking or major) to spend time fixing. In some cases, you will discover that it is not even a bug.
Step 4: Delimit the Problem
The purpose of this step is to locate the exact part of the code that causes the bug. It is an iterative process: you start by identifying the module and continue your investigation until you find the naughty function with the nasty line of code causing the issue.
The following techniques can help you delimit the problem:
Use the logs: if you are lucky, logs will point directly to the right part of code causing the issue
Add logs if they do not exist
Refactor the code to understand it easily
Eliminate the hardware correlation: check whether the bug still occurs on a different machine
Add unit tests to verify the implementation of some parts of the code
Use mocking to simulate the result of big parts of the code
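Two of the techniques above, a targeted unit test and mocking, can be combined as in the following sketch. The function `compute_invoice_total`, the repository object, and its `fetch_items` method are hypothetical names invented for this example; only `unittest.mock.Mock` comes from the Python standard library.

```python
from unittest.mock import Mock

def compute_invoice_total(order_id, repository, tax_rate):
    """Hypothetical function under suspicion: sums item prices and applies tax."""
    items = repository.fetch_items(order_id)
    subtotal = sum(item["price"] * item["quantity"] for item in items)
    return round(subtotal * (1 + tax_rate), 2)

def check_total_with_mocked_repository():
    # Replace the (assumed) database-backed repository with a mock
    # that returns fixed, known data.
    repository = Mock()
    repository.fetch_items.return_value = [
        {"price": 10.0, "quantity": 2},
        {"price": 5.0, "quantity": 1},
    ]
    total = compute_invoice_total(42, repository, tax_rate=0.2)
    # Confirm the function asked the repository for the right order.
    repository.fetch_items.assert_called_once_with(42)
    return total

print(check_total_with_mocked_repository())
```

If the total is correct with known input, the arithmetic is probably sound, and the investigation can move on to the data the real repository returns.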
If you don't manage to delimit the problem, then it is very likely that your application has some serious conceptual or architectural issues; the kind of issues that will consume too much time to resolve.
If that is the case, you have the right to panic!
Sometimes, the line of code raising the exception is innocent. That is why you need to audit the blocks of code near that line and the ones that interact with it, which takes us to the next step.
Step 5: Audit the Code
Once you identify the block of code raising the issue, you need to inspect the other parts of the code (functions, modules, etc.) that interact with it, for the following reasons:
These parts may be the source of the issue; your bug may be due to bad data passed from another function or module to the block in question
The cause of the bug in the identified block may affect these parts in the same way, since they interact with it
The correction you intend to implement in phase two may have side effects on these parts. It is better to detect them at this point and avoid them.
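The first reason, bad data arriving from another function, can be illustrated with a small sketch. Both functions here are hypothetical; the point is only that a guard at the boundary makes the faulty caller visible instead of letting the bad value produce a wrong result further down.

```python
def parse_quantity(raw_value):
    """Hypothetical upstream function: converts form input to a quantity."""
    return int(raw_value)

def apply_discount(price, quantity):
    """Hypothetical block where the bug surfaced."""
    # Guard added during the audit: reject values the caller should never
    # send, so bad data is reported where it enters this block.
    if quantity <= 0:
        raise ValueError(f"quantity must be positive, got {quantity!r}")
    discount = 0.1 if quantity >= 10 else 0.0
    return price * quantity * (1 - discount)

try:
    apply_discount(20.0, parse_quantity("-3"))
except ValueError as error:
    print(f"caller passed bad data: {error}")
```

In this made-up case the audit would point at `parse_quantity`, which accepts negative input, and not at the discount calculation where the symptom appeared.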
Phase 2: The Correction of the Bug
After taking enough time to understand and identify the bug, you can now tackle this phase.
Step 6: Implement the Correction
For this step, I am sorry, but I don't have as much to say.
You are a programmer, so do your job and fix the bug that you have happily identified in the previous phase. Good luck!
Step 7: Test the Correction
If a bug matters enough to the user that they reported it, some important test cases are probably missing. So, yes, this is the perfect time to add them. These new test cases have two main purposes:
Validate the correction you made: the idea is to replicate, once again, the conditions that led to the bug (identified in step one) and confirm that this time you get the expected behavior you came to understand in step three, rather than the bug.
Validate that no regressions resulted from your correction. The last thing we want is to create new bugs while correcting one. Use the results of the code audit in step five here to write the proper tests.
Choose the kinds of tests that fit the scope of your bug: unit tests, integration tests, mutation tests, etc. These are all welcome to join the party!
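The two purposes above can be sketched as a pair of small tests. The function `average_rating` and its past bug (a crash on an empty list) are hypothetical, and returning `0.0` for an empty list is an assumed expected behavior, the kind of decision that step three is meant to settle.

```python
def average_rating(ratings):
    """Hypothetical corrected function: previously crashed on an empty list."""
    if not ratings:
        return 0.0  # assumed expected behavior, as agreed in step three
    return sum(ratings) / len(ratings)

def test_empty_ratings_no_longer_crash():
    # Replicates the condition that triggered the original bug (step one).
    assert average_rating([]) == 0.0

def test_existing_behavior_is_preserved():
    # Guards against regressions in the cases that already worked.
    assert average_rating([4, 5]) == 4.5
    assert average_rating([3]) == 3.0

test_empty_ratings_no_longer_crash()
test_existing_behavior_is_preserved()
print("all checks passed")
```

The tests are plain functions so that they can be called directly, as here, or picked up by whatever test runner the project already uses.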
Clean and Report
This part of the resolution runs alongside phases one and two. It is not always required, but it is important and a best practice.
Step 8: Clean the Code
The idea is to clean up and improve the quality of the existing code while you resolve the bug. While digging into a particular part or functionality of the application, you become something of an expert in it, so why not take the opportunity to correct weaknesses that may cause problems in the future?
For example, you can clean the code by adding logs where they are missing or by refactoring any duplicated parts you detect. Even just formatting the code to improve its readability is a great help for the next programmer who works on that part.
Note that this cleaning step starts from step four when we delimit the issue.
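As one possible illustration of adding logs where none existed, here is a minimal sketch using Python's standard `logging` module. The function `reserve_stock`, the logger name `orders`, and the message wording are assumptions invented for the example.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name

def reserve_stock(stock, item, quantity):
    """Hypothetical function that previously failed without leaving a trace."""
    available = stock.get(item, 0)
    logger.debug("reserve_stock item=%s requested=%d available=%d",
                 item, quantity, available)
    if quantity > available:
        # Record the failure so the next investigation starts with facts.
        logger.warning("insufficient stock for %s: %d > %d",
                       item, quantity, available)
        return False
    stock[item] = available - quantity
    return True

logging.basicConfig(level=logging.DEBUG)
reserve_stock({"widget": 3}, "widget", 5)
```

Logging the inputs at debug level and the failure at warning level keeps normal output quiet while still leaving a trace when something goes wrong.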
Step 9: Report the Bug
Use a bug tracking system: record the bug in a bug report that contains at least the description of the bug, the causes behind it, the expected behavior, and the resolution.
These systems are very useful for your team's support activities: if the bug occurs again, it will be easy to track. They also let you delegate part of the resolution to other team members without losing control of the changes made or losing track of how the resolution is progressing.
Jira is one of the most widely used tracking systems, and it allows easy communication between the reporter of the bug and the person responsible for its resolution.
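The minimum content of a bug report listed in this step can be captured as a small record. The sketch below is a generic illustration, not the data model of Jira or any other tracking system; the class name `BugReport` and the sample values are made up.

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class BugReport:
    """Minimal bug report with the four fields listed in this step."""
    description: str
    causes: str
    expected_behavior: str
    resolution: str

# Invented sample values, for illustration only.
report = BugReport(
    description="Submitting the order form returns an error page.",
    causes="The quantity field accepted negative values.",
    expected_behavior="The form rejects invalid quantities with a message.",
    resolution="Added input validation and a regression test.",
)
print(json.dumps(asdict(report), indent=2))
```

Making all four fields required means a report cannot be created with the cause or the resolution left out.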
Conclusion
There is no standardized process for resolving a bug. However, this does not mean you should do it on the fly. The main thing to remember is to think twice and act once: spend the necessary time on the identification phase, and do not attack the second phase blindly.