4 More Techniques for Writing Better Java
When it comes to optimizing your Java code, here are four ways to improve the performance and readability of your projects.
The majority of our day-to-day programming tasks consist of applying the same suite of techniques, and for most cases, these go-to techniques suffice to accomplish our goals. There are times, however, when we need to go beyond the normal techniques and reach into the toolbox to find the strategy that solves the problem by the simplest means possible. In the previous article in this series, we examined four particular techniques that can be used in a pinch to create better Java software; in this article, we will cover some general design strategies and targeted implementation techniques that help in solving both common and nuanced problems, namely:
- Only perform targeted optimizations
- Favor enums over constants
- Define an equals() method
- Favor polymorphism over conditionals
It is important to note that not every technique described in this article will be applicable in all situations. We are not in the business of creating always or never rules. Instead, each of these techniques requires mature and experienced judgment to discern when and where to apply them.
1. Only Perform Targeted Optimizations
Performance concerns are at the core of almost every large-scale software system. While we desire to produce the most efficient code we can, many times it is difficult to find the places in our code where we can perform optimizations that actually make a difference. For example, is the following code a performance bottleneck?
public void processIntegers(List<Integer> integers) {
    for (Integer value : integers) {
        for (int i = integers.size() - 1; i >= 0; i--) {
            value += integers.get(i);
        }
    }
}
It depends. By inspection, we can see that our processing algorithm is O(n²) (using Big-O notation), where n is the size of the integers list. If n is only 5, there is no real issue, as only 25 logical iterations are performed. If, on the other hand, n is 100,000, then we may have a problem on our hands. Notice that we still do not know for sure whether we have a problem. Although this method would require 10,000,000,000 logical iterations, that still does not guarantee that we will have a performance issue.
For example, if the client that calls our method above executes it within its own thread and asynchronously waits for the computation to complete, it may very well run in an acceptable amount of time. Likewise, if no client actually calls this method when the system is deployed in a production environment, then any optimization we make is unnecessary and contributes nothing to the overall performance of our system. In fact, we will have made the system more complicated through our performance optimization without actually increasing its performance.
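To make that asynchronous scenario concrete, here is a minimal sketch of such a client, assuming the processIntegers method above lives in a hypothetical IntegerProcessor class (ReportClient is likewise just an illustrative name):

import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ReportClient {

    // Hypothetical client that owns an instance of the (equally hypothetical)
    // IntegerProcessor class containing the processIntegers method above
    private final IntegerProcessor processor = new IntegerProcessor();

    public CompletableFuture<Void> processInBackground(List<Integer> integers) {
        // Run the potentially expensive computation on a worker thread; the caller
        // continues immediately and reacts only when the future completes
        return CompletableFuture.runAsync(() -> processor.processIntegers(integers));
    }
}

If the caller never blocks on this future along a user-facing path, even the quadratic implementation may never surface as a real problem, which is exactly why we measure before we optimize.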
It is important to note that very few optimizations come for free. Instead, we usually implement techniques such as caching, loop unrolling, or pre-computing values (for example, building a look-up table) that add to the complexity of our system and commonly reduce the readability of our code. If our optimization actually increases the performance of our system, then this added complexity may pay off, but in order to make an educated decision, we must first know two pieces of information:
- Our performance requirements
- Where the performance bottlenecks are located
In the first case, we need to know the performance envelope that our system is required to operate within. If we are within that envelope and there are no complaints from the end users of the system, there is likely no need to make a performance optimization. There may come a point down the road, though, when new functionality is added or the data size of our system increases and we are forced to make such optimizations.
When that point comes, we should not rely on our gut or even on inspection, because even the best software engineers are prone to targeting the wrong optimizations in a system. Developers as experienced as Martin Fowler are susceptible to these mistakes, as explained in Refactoring (p. 70):
The interesting thing about performance is that if you analyze most programs, you find that they waste most of their time in a small fraction of code. If you optimize all the code equally, you end up with 90 percent of the optimizations wasted, because you are optimizing code that isn't run much. The time spent making the program fast, the time lost because of lack of clarity, is all wasted time.
That is a striking bit of wisdom, and one that we as mature developers should take seriously. Not only will most of our first guesses at optimization fail to improve the performance of our system, roughly 90% of them will be a sheer waste of development time. Instead, we should profile the system as we execute common use cases in a production environment (or an environment that sufficiently mimics the production environment) and find where the bulk of the system resources are used during execution. If, for example, a majority of the execution time is spent in 10% of the code, then optimizing the other 90% of the code is simply a waste.
Using that knowledge, we should start with the worst culprit identified by the profiling results. This ensures that we actually improve the performance of our system in a meaningful way. After each optimization, the profiling step should be repeated. This allows us not only to confirm that we have actually improved the performance of our system, but also to see where the performance bottlenecks lie after we have improved a portion of the system (once one bottleneck is removed, others may consume a larger share of the system's overall resources). Note that the percentage of time spent in the remaining bottlenecks will likely increase: the absolute time they consume stays roughly the same, while the overall execution time drops with the removal of the targeted bottleneck.
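As a hypothetical illustration, suppose the profiling results showed that the processIntegers method from earlier really is the worst culprit. Its inner loop computes the same total on every pass of the outer loop, so one targeted fix is to hoist that sum out, reducing the work from quadratic to linear. This is a sketch of that single, measured change (inside the illustrative IntegerProcessor class), not a general prescription:

import java.util.List;

public class IntegerProcessor {

    // Targeted rewrite of the processIntegers method shown earlier: the inner loop
    // always produced the same total, so we compute that sum once and reuse it,
    // turning O(n^2) work into O(n)
    public void processIntegers(List<Integer> integers) {
        int sum = 0;
        for (int i = integers.size() - 1; i >= 0; i--) {
            sum += integers.get(i);
        }
        for (Integer value : integers) {
            value += sum; // same per-element result as the original nested loops
        }
    }
}

Re-running the profiler after this change is what tells us whether the rewrite actually moved the needle.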
Although a full examination of profiling in a Java system requires volumes, there are some very common tools that can aid in discovering the performance hot-spots of a system, including JMeter, AppDynamics, and YourKit. Also, see DZone's own Guide to Performance and Monitoring for more general information on performance and optimizations for Java programs.
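When a full profiler is not immediately available, a crude first-pass timing can at least confirm that a suspected hotspot deserves attention. The sketch below uses only System.nanoTime and reuses the hypothetical IntegerProcessor class from above; it is a rough sanity check, not a replacement for the tools just mentioned:

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class RoughTiming {

    public static void main(String[] args) {
        List<Integer> integers = IntStream.range(0, 10_000)
                .boxed()
                .collect(Collectors.toList());

        IntegerProcessor processor = new IntegerProcessor();

        // Warm up the JIT compiler so the timed run is not dominated by compilation
        for (int i = 0; i < 10; i++) {
            processor.processIntegers(integers);
        }

        long start = System.nanoTime();
        processor.processIntegers(integers);
        long elapsed = System.nanoTime() - start;
        System.out.println("processIntegers took " + (elapsed / 1_000_000) + " ms");
    }
}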
While performance is a very important part of any large-scale software system and should be part of the automated suite of tests included in the delivery pipeline of a product, optimizations should not be blind or untargeted. Instead, optimizations should be made to specific portions of code that are known to be performance bottlenecks. Not only does this keep us from adding complexity to our systems for minimal payoff, it also keeps us from wasting valuable development time.
2. Favor Enums Over Constants
There are many cases where we want to list a set of predefined or constant values, such as the HTTP response codes we may encounter in a web application. One of the most common implementation techniques is to create a class with a series of static final values, each with some descriptive name:
public class HttpResponseCodes {
    public static final int OK = 200;
    public static final int NOT_FOUND = 404;
    public static final int FORBIDDEN = 403;
}

if (getHttpResponse().getStatusCode() == HttpResponseCodes.OK) {
    // Do something if the response code is OK
}
While this suffices, it has some serious drawbacks:
- We can pass any int where a response code is expected
- We cannot call methods on our status codes since they are primitive values
In the first case, we are simply creating specific constants to represent special integer values, but this does not restrict a method or variable to use only the status codes we have defined. For example:
public class HttpResponseHandler {
    public static void printMessage(int statusCode) {
        System.out.println("Received status of " + statusCode);
    }
}

HttpResponseHandler.printMessage(15000);
Although 15000 is not a valid HTTP response code, we have not restricted clients from supplying any arbitrary integer. In the second case, we have no way of defining methods on our status codes. For example, if we want to check whether a given status code is a success code (a 200-level status code), we have to define a separate function:
public class HttpResponseCodes {

    public static final int OK = 200;
    public static final int NOT_FOUND = 404;
    public static final int FORBIDDEN = 403;

    public static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }
}

if (HttpResponseCodes.isSuccess(getHttpResponse().getStatusCode())) {
    // Do something if the response code is a success code
}
In order to resolve these issues, we need to change our constant type from a primitive to a custom class and allow only specific instances of that class. This is exactly the purpose of Java enumerations (enums). Using an enum, we can solve both of these issues with a single stroke:
public enum HttpResponseCodes {

    OK(200),
    FORBIDDEN(403),
    NOT_FOUND(404);

    private final int code;

    HttpResponseCodes(int code) {
        this.code = code;
    }

    public int getCode() {
        return code;
    }

    public boolean isSuccess() {
        return code >= 200 && code < 300;
    }
}

if (getHttpResponse().getStatusCode().isSuccess()) {
    // Do something if the response code is a success code
}
Likewise, we can now restrict callers to only supply valid status codes in a method call:
public class HttpResponseHandler {
    public static void printMessage(HttpResponseCodes statusCode) {
        System.out.println("Received status of " + statusCode.getCode());
    }
}

HttpResponseHandler.printMessage(HttpResponseCodes.OK);
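Since a real HTTP client library will typically hand us the status code as a raw integer, a lookup helper on the enum is a common bridge between the two. The fromCode method below is a hypothetical addition to the enum shown above, not something the technique requires:

public enum HttpResponseCodes {

    OK(200),
    FORBIDDEN(403),
    NOT_FOUND(404);

    private final int code;

    HttpResponseCodes(int code) {
        this.code = code;
    }

    public int getCode() {
        return code;
    }

    public boolean isSuccess() {
        return code >= 200 && code < 300;
    }

    // Hypothetical helper: translate a raw integer status into its enum constant,
    // rejecting any value we have not explicitly defined
    public static HttpResponseCodes fromCode(int code) {
        for (HttpResponseCodes candidate : values()) {
            if (candidate.code == code) {
                return candidate;
            }
        }
        throw new IllegalArgumentException("Unknown HTTP status code: " + code);
    }
}

// Usage: convert the raw integer from an HTTP client into the restricted type
HttpResponseHandler.printMessage(HttpResponseCodes.fromCode(200));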
It is important to note that this technique suggests that we should favor enumerations over constants, not use enumerations indiscriminately. In some cases, we may wish to use a constant to represent a special value, but allow other values to be supplied. For example, if we have a well-known numeric value, we can capture that value (and reuse it) with a constant:
public class NumericConstants {
    public static final double PI = 3.14;
    public static final double UNIT_CIRCLE_AREA = PI; // area of a circle with a radius of 1
}

public class Rug {

    private final double area;

    public Rug(double area) {
        this.area = area;
    }

    public double getCost() {
        return area * 2;
    }
}

// Create a carpet that is 4 feet in diameter (radius of 2 feet): area = radius * radius * PI
Rug fourFootRug = new Rug(2 * 2 * NumericConstants.UNIT_CIRCLE_AREA);
The rule for using enumerations over constants can, therefore, be distilled into:
Use enumerations when all possible discrete values are known a priori
In the case of our HTTP response codes, we know all possible values for HTTP status codes (found in RFC 7231, which defines the HTTP 1.1 protocol). Therefore, we used an enumeration. In the case of our area calculation, we do not know all possible values of the area that can be supplied (any possible double is valid), but at the same time, we wished to create a constant for circular rugs that made the calculation easier (and more readable); therefore, we defined a series of constants.
If we did not know all possible values ahead of time but wished to include fields or methods for each value, we could simply create a new class to represent our data. Although there is no always or never rule for using enumerations, the key to knowing when and when not to use enumerations is being aware of all values ahead of time and forbidding the use of any others.
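As a brief, hypothetical sketch of that alternative: if we needed to carry arbitrary HTTP headers, whose names and values cannot be enumerated ahead of time, a small class gives us fields and behavior without artificially restricting the values:

public class HttpHeader {

    private final String name;
    private final String value;

    public HttpHeader(String name, String value) {
        this.name = name;
        this.value = value;
    }

    public String getName() {
        return name;
    }

    public String getValue() {
        return value;
    }

    // Behavior attached to the value, much like the isSuccess() method on our enum
    public boolean isCustomHeader() {
        return name.startsWith("X-");
    }
}

// Any header name and value can be represented, not just a fixed set
HttpHeader requestId = new HttpHeader("X-Request-Id", "12345");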
3. Define an equals() Method
Identity can be a difficult issue to solve: Are two objects the same if they occupy the same location in memory? Are they the same if their IDs are the same? Or are they the same if all of their fields are equal? Although each class may have its own identity logic, there is a tendency to proliferate identity checks throughout different places in a system. For example, if we have the following class for a purchase order...
public class Purchase {

    private long id;

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }
}
...there is a common tendency to have conditionals, such as the following, repeated in multiple places throughout the code:
Purchase originalPurchase = new Purchase();
Purchase updatedPurchase = new Purchase();

if (originalPurchase.getId() == updatedPurchase.getId()) {
    // Execute some logic for equal purchases
}
The more we repeat these checks (and, in turn, violate the DRY principle), the more we spread the knowledge of Purchase's identity. If, for some reason, we change the identity logic for our Purchase class (for example, by changing the type of our identifier), we need to update every location where that identity logic lives.
Instead of spreading the identity logic for our Purchase class throughout our system, we should internalize this logic within the class itself. At first glance, we could create a new method, such as isSame, that accepts a Purchase object and compares the IDs of the two objects to see if they are the same:
public class Purchase {

    private long id;

    // Getter and setter for id omitted for brevity

    public boolean isSame(Purchase other) {
        return getId() == other.getId();
    }
}
While this is a valid solution, we are ignoring the built-in functionality of Java: the equals method. Every class, by virtue of implicitly extending the Object class, inherits the equals method. By default, this method checks for object identity (the same object in memory) against the supplied object, as illustrated in the following snippet from the Object class definition in the JDK (version 1.8.0_131):
public boolean equals(Object obj) {
    return (this == obj);
}
This equals method serves as a natural place to inject our identity logic (by overriding the default equals implementation):
public class Purchase {

    private long id;

    public long getId() {
        return id;
    }

    public void setId(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        else if (!(other instanceof Purchase)) {
            return false;
        }
        else {
            return ((Purchase) other).getId() == getId();
        }
    }
}
Although this equals method may appear complicated, since it only accepts arguments of type Object, there are only three cases that we must account for:
- The other object is the current object (i.e. originalPurchase.equals(originalPurchase)), which is by definition the same object, and thus we return true
- The other object is not a Purchase object, in which case we cannot compare the IDs of the purchases, and therefore the two objects are not equal
- The other object is not the same object but is a Purchase, so the equality depends on the IDs of the current Purchase and the other Purchase being equal (we must downcast the other object to a Purchase, but this is a safe cast since we know that the other object is in fact a Purchase)
We can now refactor our previous conditional to the following:
Purchase originalPurchase = new Purchase();
Purchase updatedPurchase = new Purchase();

if (originalPurchase.equals(updatedPurchase)) {
    // Execute some logic for equal purchases
}
Apart from reducing the duplication in our system, there are also some extra advantages to overriding the default equals method. For example, if we construct a list of Purchase objects and check whether the list contains another Purchase object with the same ID (a different object in memory), we receive a value of true, since the two are considered equal:
List<Purchase> purchases = new ArrayList<>();
purchases.add(originalPurchase);
purchases.contains(updatedPurchase); // True
In general, anywhere that the Java language wishes to test the equality of two objects, it will use the equals method that we have overridden. If we wish to use the default object identity check that we have hidden with our overridden implementation, we can still do so by using the == operator as follows:
if (originalPurchase == updatedPurchase) {
    // The two references point to the same object in memory
}
It is also important to note that when the equals method is overridden, the hashCode method should also be overridden. For more information on the relationship between these two methods and how to properly define the hashCode method, see this thread.
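A minimal sketch of a matching hashCode override for our Purchase class might look like the following; it derives the hash from the same id field that equals compares:

@Override
public int hashCode() {
    // Use the same field as equals() so that equal Purchase objects
    // always produce the same hash code
    return Long.hashCode(getId());
}

With both methods overridden consistently, hash-based collections such as HashSet and HashMap will treat two Purchase objects with the same ID as the same entry.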
As we have seen, overriding the equals method not only internalizes the identity logic for a class and reduces the proliferation of that logic throughout our system, but it also allows the Java language to make educated decisions about our classes.
4. Favor Polymorphism Over Conditionals
Conditionals are a ubiquitous part of any programming language, and for good reason: their inclusion allows us to change the behavior of our system based on the current state of a given value or object. For example, if we were to determine the interest rates for various bank accounts, we could develop the following:
public enum BankAccountType {
    CHECKING,
    SAVINGS,
    CERTIFICATE_OF_DEPOSIT;
}

public class BankAccount {

    private final BankAccountType type;

    public BankAccount(BankAccountType type) {
        this.type = type;
    }

    public double getInterestRate() {
        switch (type) {
            case CHECKING:
                return 0.03; // 3%
            case SAVINGS:
                return 0.04; // 4%
            case CERTIFICATE_OF_DEPOSIT:
                return 0.05; // 5%
            default:
                throw new UnsupportedOperationException();
        }
    }

    public boolean supportsDeposits() {
        switch (type) {
            case CHECKING:
                return true;
            case SAVINGS:
                return true;
            case CERTIFICATE_OF_DEPOSIT:
                return false;
            default:
                throw new UnsupportedOperationException();
        }
    }
}
While this means of calculating earned interest and determining whether an account supports deposits is sufficient for a simple program, it betrays a noticeable flaw: we are deciding on the behavior of our system based on the type of a given account. Not only does this require that we check the type each time we wish to make a decision, it also requires that we repeat this logic every time we need to make such a decision. For example, in the above design, we have to make this check in both methods. This can become unwieldy, especially when we receive a requirement to add a new account type.
Instead of using the account type as the differentiator, we can use polymorphism to implicitly make that decision. In order to do this, we transform the BankAccount concrete class into an interface and push the decision-making into a series of concrete classes that represent each type of bank account:
public interface BankAccount {
    public double getInterestRate();
    public boolean supportsDeposits();
}

public class CheckingAccount implements BankAccount {

    @Override
    public double getInterestRate() {
        return 0.03;
    }

    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

public class SavingsAccount implements BankAccount {

    @Override
    public double getInterestRate() {
        return 0.04;
    }

    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

public class CertificateOfDepositAccount implements BankAccount {

    @Override
    public double getInterestRate() {
        return 0.05;
    }

    @Override
    public boolean supportsDeposits() {
        return false;
    }
}
Not only does this decentralize the knowledge of each account into its own class, but it allows our design to vary in two important ways. First, if we want to add a new bank account type, we simply create a new concrete class that implements the BankAccount interface and supply implementations for both interface methods. In the conditional design, we would have had to add a new value to our enumeration, add new case statements in both methods, and insert the logic for the new account under each case statement.
Second, if we wish to add a new method to the BankAccount interface, we simply add the new method in each of the concrete classes. In the conditional design, we would have to duplicate the existing switch statement and add it to our new method. Furthermore, we would then have to add the logic for each account type within each case statement.
Mathematically, when we create a new method or add a new type, we have to make the same number of logical changes in both the polymorphic and conditional designs. For example, if we add a new method, in the polymorphic design we have to add the new method to all n bank account concrete classes, while in the conditional design we have to add n new case statements to our new method. If we add a new account type, in the polymorphic design we must implement all m methods of the BankAccount interface, while in the conditional design we must add a new case statement to each of the m existing methods.
While the number of changes we must make is equal, the nature of the changes is very different. In the polymorphic design, if we add a new account type and forget to include a method, the compiler will throw an error, since we did not implement all of the methods in our BankAccount interface. In the conditional design, there is no such check to ensure that we have a case statement for each of the types. We can simply forget to update one of the switch statements when a new type is added, and the problem is exacerbated the more we duplicate our switch statements. We are human, and we are prone to making mistakes; any time we can rely on the compiler to alert us to our mistakes, we would be wise to do so.
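For example, adding a hypothetical money market account to the polymorphic design is just one new class, and the compiler refuses to accept it until both BankAccount methods are implemented (the 2% rate below is purely illustrative):

public class MoneyMarketAccount implements BankAccount {

    @Override
    public double getInterestRate() {
        return 0.02; // illustrative rate, not taken from the article's examples
    }

    // If we forgot to implement this method, the class would not compile,
    // whereas a forgotten case statement in a switch fails only at runtime
    @Override
    public boolean supportsDeposits() {
        return true;
    }
}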
A second important note about these two designs is that they are externally equivalent. For example, if we wanted to check the interest rate for a checking account, the conditional design would resemble the following:
BankAccount checkingAccount = new BankAccount(BankAccountType.CHECKING);
System.out.println(checkingAccount.getInterestRate()); // Output: 0.03
Conversely, the polymorphic design would resemble the following:
BankAccount checkingAccount = new CheckingAccount();
System.out.println(checkingAccount.getInterestRate()); // Output: 0.03
From an outside perspective, we are simply calling getInterestRate() on a BankAccount object. This becomes even more obvious if we abstract the creation process into a factory class:
public class ConditionalAccountFactory {
    public static BankAccount createCheckingAccount() {
        return new BankAccount(BankAccountType.CHECKING);
    }
}

public class PolymorphicAccountFactory {
    public static BankAccount createCheckingAccount() {
        return new CheckingAccount();
    }
}

// In both cases, we create the accounts using a factory
BankAccount conditionalCheckingAccount = ConditionalAccountFactory.createCheckingAccount();
BankAccount polymorphicCheckingAccount = PolymorphicAccountFactory.createCheckingAccount();

// In both cases, the call to obtain the interest rate is the same
System.out.println(conditionalCheckingAccount.getInterestRate()); // Output: 0.03
System.out.println(polymorphicCheckingAccount.getInterestRate()); // Output: 0.03
The replacement of conditional logic with polymorphic classes is so common that there are published procedures for refactoring conditional statements into polymorphic classes. A simple example can be found here. In addition, p. 255 of Refactoring, by Martin Fowler, describes a detailed process for conducting this refactor.
Just as with the other techniques in this article, there is no hard-and-fast rule for when to convert conditional logic into polymorphic classes. As a matter of fact, it is not suggested in all cases. In Test-Driven Development: By Example, Kent Beck designs a simple currency system with the intent of using polymorphic classes but, finding that this over-complicates the design, refactors it to a non-polymorphic style. Experience and sound judgment will dictate when it is the right time to make the switch from conditional code to polymorphic code.
Conclusion
Although the normal, daily techniques that we use as programmers suffice for most of the tasks laid before us, there are times when we need to leave the beaten path and delve into the toolbox. At times, this may mean making a sound performance-optimization decision or deciding between a polymorphic hierarchy and conditionals. In other cases, it may mean placing restrictions on our constants or explicitly defining the identity of an object using the equals method. Whatever the case, expanding the breadth and depth of our knowledge not only helps us make more educated decisions, it also makes us wiser developers.