{{announcement.body}}
{{announcement.title}}

PVS-Studio for Java Hits the Road. Next Stop Is Elasticsearch.

DZone 's Guide to

PVS-Studio for Java Hits the Road. Next Stop Is Elasticsearch.

It's time to hit the road, PVS-Studio.

· Java Zone ·
Free Resource

The PVS-Studio team has been keeping the blog about the checks of open-source projects by the same-name static code analyzer for many years. To date, more than 300 projects have been checked, the base of errors contains more than 12000 cases. Initially, the analyzer was implemented for checking C and C++ code, support of C# was added later. Therefore, from all checked projects the majority (> 80 percent) accounts for C and C++. Quite recently, Java was added to the list of supported languages, which means that there is now a whole new open world for PVS-Studio, so it's time to complement the base with errors from Java projects.

The Java world is vast and varied, so one doesn't even know where to look first when choosing a project to test the new analyzer. Ultimately, the choice fell on the full-text search and analytical engine Elasticsearch. It is quite a successful project, and it's even especially pleasant to find errors in significant projects. So, what defects did PVS-Studio for Java manage to detect? Further talk will be right about the results of the check.

Picture 1

Briefly About Elasticsearch

Elasticsearch is a distributed, RESTful search, and analytics engine with open-source code and is capable of solving a growing number of use cases. It enables you to store large amounts of data, carry out a quick search, and analytics (almost in real-time mode). Typically, it is used as the underlying mechanism/technology, which provides applications with complex functions and search requirements.

Among the major sites using Elasticsearch, there are Wikimedia, StumbleUpon, Quora, Foursquare, SoundCloud, GitHub, Netflix, Amazon, IBM, and Qbox.

Fine, enough with the introduction.

The Whole Story of How Things Were

There were no problems with the check itself. The sequence of actions is rather simple and didn't take much time:

  • Downloaded Elasticsearch from GitHub
  • Followed the instructions on how to run the Java analyzer and run the analysis
  • Received the analyzer's report, delved into it, and pointed out interesting cases.

Now, let's move on to the main point.

Watch out! Possible NullPointerException

V6008: Null dereference of line.

GoogleCloudStorageFixture.java(451):

private static PathTrie<RequestHandler> defaultHandlers(....) { 
  ....
  handlers.insert("POST /batch/storage/v1", (request) -> { 
    .... 
    // Reads the body 
    line = reader.readLine();
    byte[] batchedBody = new byte[0];
    if ((line != null) ||
        (line.startsWith("--" + boundary) == false)) // <=
    { 
      batchedBody = line.getBytes(StandardCharsets.UTF_8);
    } 
    .... 
  });
 ....
}


The error in this code fragment is that if the string from the buffer wasn't read, the call of the startsWith method in the condition of the if statement will result in throwing the NullPointerException exception. Most likely, this is a typo, and when writing a condition, developers meant the && operator instead of||.

V6008: Potential null dereference of followIndexMetadata.

TransportResumeFollowAction.java(171), TransportResumeFollowAction.java(170), TransportResumeFollowAction.java(194):

void start(
           ResumeFollowAction.Request request,
           String clusterNameAlias,
           IndexMetaData leaderIndexMetadata,
           IndexMetaData followIndexMetadata,
           ....) throws IOException
{ 
  MapperService mapperService = followIndexMetadata != null // <= 
                                ? ....
                                : null;
  validate(request,
           leaderIndexMetadata,
           followIndexMetadata, // <=
           leaderIndexHistoryUUIDs,
           mapperService);
  ....
}


Another warning from the V6008 diagnostic. The object followIndexMetadata kindled my interest. The start method accepts several arguments as input, our suspect is right among them. After that, based on checking our object for null, a new object is created, which is involved in further method logic. Check for null shows us that followIndexMetadata can still come from the outside as a null object. Well, let's look further.

Then, multiple arguments are pushed to the validate method (again, there is our considered object among them). If we look at the implementation of the validation method, it all falls into place. Our potential null object is passed to the validate method as a third argument, where it unconditionally gets dereferenced. This gives us a potential NullPointerException as a result.

static void validate(
      final ResumeFollowAction.Request request,
      final IndexMetaData leaderIndex,
      final IndexMetaData followIndex, // <=
      ....)
{
  ....
  Map<String, String> ccrIndexMetadata = followIndex.getCustomData(....); // <=
  if (ccrIndexMetadata == null) {
    throw new IllegalArgumentException(....);
  }
  ....
}


We don't know for sure with what arguments the start method is called. It is quite possible that all arguments are checked somewhere before calling the method and no null object dereference will happen. Anyway, we should admit that such code implementation still looks unreliable and deserves attention.

V6060: The node reference was utilized before it was verified against null.

RestTasksAction.java(152), RestTasksAction.java(151):

private void buildRow(Table table, boolean fullId,
                      boolean detailed, DiscoveryNodes discoveryNodes,
                      TaskInfo taskInfo) { 
  ....
  DiscoveryNode node = discoveryNodes.get(nodeId);
  .... 
  // Node information. Note that the node may be null because it has
  // left the cluster between when we got this response and now.
  table.addCell(fullId ? nodeId : Strings.substring(nodeId, 0, 4));
  table.addCell(node == null ? "-" : node.getHostAddress());
  table.addCell(node.getAddress().address().getPort());
  table.addCell(node == null ? "-" : node.getName());
  table.addCell(node == null ? "-" : node.getVersion().toString());
  ....
}


Another diagnostic rule with the same problem triggered here — NullPointerException. The rule cries out for developers: "Guys, what are you doing? How could you do that? Oh, it is awful! Why do you first use the object and check if for null in the next line? Here is how null object dereference happens. Alas, even a developer's comment didn't help.

V6060: The cause reference was utilized before it was verified against null.

StartupException.java(76), StartupException.java(73):

private void printStackTrace(Consumer<String> consumer) {
  Throwable originalCause = getCause();
  Throwable cause = originalCause;
  if (cause instanceof CreationException) {
    cause = getFirstGuiceCause((CreationException)cause);
  }
  String message = cause.toString(); // <=
  consumer.accept(message);
  if (cause != null) { // <=
    // walk to the root cause
    while (cause.getCause() != null) {
      cause = cause.getCause();
    } 
    ....
  }
  ....
}


In this case, we should take into account that the getCause method of the Throwable class might return null. The above problem repeats further, so its explanation is needless.

Meaningless Conditions

V6007 Expression 's.charAt(i) != '\t'' is always true. Cron.java(1223):

private static int findNextWhiteSpace(int i, String s) {
  for (; i < s.length() && (s.charAt(i) != ' ' || s.charAt(i) != '\t'); i++)
  {
    // intentionally empty
  } 
  return i;
}


The considered function returns the index of the first space character, starting from the i index. What's wrong? We have the analyzer warning that s.charAt(i) != '\t' is always true, which means the expression (s.charAt(i) != ' ' || s.charAt(i) != '\t') will always be true as well. Is this true? I think you can easily make sure of this by substituting any character.

As a result, this method will always return the index, equal to  s.length(), which is wrong. I would venture to suggest that the above method is to blame here:

private static int skipWhiteSpace(int i, String s) {
  for (; i < s.length() && (s.charAt(i) == ' ' || s.charAt(i) == '\t'); i++)
  {
    // intentionally empty
  }
  return i;
}


Developers implemented the method, then copied and got our erroneous method findNextWhiteSpace, having made some edits. They kept fixing and fixing the method but haven't fixed it up. To get this right, one should use the  && operator instead of ||.

V6007: Expression 'remaining == 0' is always false.

PemUtils.java(439):

private static byte[]
generateOpenSslKey(char[] password, byte[] salt, int keyLength)
{
  ....
  int copied = 0;
  int remaining;
  while (copied < keyLength)
  {
    remaining = keyLength - copied;
    ....
    copied += bytesToCopy;
    if (remaining == 0) { // <=
      break;
    }
    ....
  }
  ....
}


From the condition of the copied < keyLength loop, we can note that copied will always be less than keyLength. Hence, it is pointless to compare the remaining variable with 0, and it will always be false, at which point the loop won't exit by a condition. Remove the code or reconsider the behavior logic? I think that only developers will be able to put all the dots over the i.

V6007: Expression 'healthCheckDn.indexOf('=') > 0' is always false.

ActiveDirectorySessionFactory.java(73):

 ActiveDirectorySessionFactory(RealmConfig config,
                               SSLService sslService,
                               ThreadPool threadPool)
                               throws LDAPException
{
  super(....,
        () -> {
          if (....) {
            final String healthCheckDn = ....;
            if (healthCheckDn.isEmpty() &&
                healthCheckDn.indexOf('=') > 0)
            { 
              return healthCheckDn;
            }
          }
          return ....;
        },
        ....);
  ....
}


Meaningless expression again. According to the condition, the healthCheckDn string has to be both empty and contain the character '=' not in the first position, so that lambda expression would return the string variable healthCheckDn. Phew, that's all then! As you probably understood, it is impossible. We're not going to go deep in the code, let's leave it to developers' discretion.

I cited only some of the erroneous examples, but beyond that, there was plenty of the V6007 diagnostic triggerings, which should be regarded one by one, making relevant conclusions.

A Little Method Can Go a Long Way

private static byte char64(char x) {
  if ((int)x < 0 || (int)x > index_64.length)
    return -1;
  return index_64[(int)x];
}


So, here, we have a teeny-tiny method of several lines. But bugs are on the watch! Analysis of this method gave the following result:

  • V6007 Expression '(int)x < 0' is always false. BCrypt.java(429)
  • V6025 Possibly index '(int) x' is out of bounds. BCrypt.java(431)

Issue N1. The expression (int)x < 0 is always false (Yes, V6007 again). The x variable cannot be negative, as it is of the char type. The char type is an unsigned integer. It cannot be called a real error, but, nonetheless, the check is redundant and can be removed.

Issue N2. Possible array index out of bounds, resulting in the ArrayIndexOutOfBoundsException exception. Then, it begs the question: "Wait, how about the index check?"

So, we have a fixed-size array of 128 elements:

private static final byte index_64[] = {
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
  -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
  -1, -1, -1, -1, -1, -1, 0, 1, 54, 55,
  56, 57, 58, 59, 60, 61, 62, 63, -1, -1,
  -1, -1, -1, -1, -1, 2, 3, 4, 5, 6,
  7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
  17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
  -1, -1, -1, -1, -1, -1, 28, 29, 30,
  31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
  41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
  51, 52, 53, -1, -1, -1, -1, -1
};


When the char64 method receives the x variable, the index validity gets checked. Where is the flaw? Why is array index out of bounds still possible?

The check (int)x > index_64.length is not quite correct. If the char64 method receives x with the value 128, the check won't protect against ArrayIndexOutOfBoundsException. Maybe this never happens in reality. However, the check is written incorrectly, and one has to change"greater than" operator (>) with "greater than or equal to (> =).

Comparisons, Which Did Their Best

V6013: Numbers displaySize and that.displaySize are compared by reference. Possibly an equality comparison was intended.

ColumnInfo.java(122):

....
private final String table;
private final String name;
private final String esType;
private final Integer displaySize;
....
@Override
public boolean equals(Object o)
{
  if (this == o) {
    return true;
  } 
  if (o == null || getClass() != o.getClass()) {
    return false;
  } 
  ColumnInfo that = (ColumnInfo) o;
  return displaySize == that.displaySize && // <=
         Objects.equals(table, that.table) &&
         Objects.equals(name, that.name) &&
         Objects.equals(esType, that.esType);
}


What is incorrect here is that displaySize objects of the Integer type are compared using the == operator, that is, by reference. It's quite possible that ColumnInfo objects, whose displaySize fields have different references, but the same content will be compared. In this case, the comparison will give us a negative result when we expected to get true.

I would venture to guess that such a comparison could be the result of a failed refactoring and initially the displaySize field was of the int type.

V6058: The equals function compares objects of incompatible types: IntegerTimeValue.

DatafeedUpdate.java(375):

....
private final TimeValue queryDelay;
private final TimeValue frequency;
....
private final Integer scrollSize;
....
boolean isNoop(DatafeedConfig datafeed)
{
  return (frequency == null
          || Objects.equals(frequency, datafeed.getFrequency()))
    && (queryDelay == null
        || Objects.equals(queryDelay, datafeed.getQueryDelay()))
    && (scrollSize == null
        || Objects.equals(scrollSize, datafeed.getQueryDelay())) // <=
    && ....)
}


Incorrect objects comparison again. This time, objects with incompatible types are compared (Integerand TimeValue). The result of this comparison is obvious, and it is always false. You can see that class fields are compared with each other, only the names of the fields have to be changed. Here is the point — a developer decided to speed up the process by using the copy-paste and got a bug into the bargain. The class implements a getter for the scrollSize field, so to fix the error one should use the method  datafeed .getScrollSize().

Let's look at a couple of erroneous examples without any explanation. Problems are quite obvious.

V6001: There are identical sub-expressions tookInMillis to the left and to the right of the == operator.

TermVectorsResponse.java(152):

@Override public boolean equals(Object obj) {
  ....
  return index.equals(other.index)
      && type.equals(other.type)
      && Objects.equals(id, other.id)
      && docVersion == other.docVersion
      && found == other.found
      && tookInMillis == tookInMillis // <=
      && Objects.equals(termVectorList, other.termVectorList);
}


V6009: Function equals receives an odd argument. An object shardId.getIndexName() is used as an argument to its own method.

SnapshotShardFailure.java(208):

@Override public boolean equals(Object o) {
  ....
  return shardId.id() == that.shardId.id() &&
         shardId.getIndexName().equals(shardId.getIndexName()) && // <=
         Objects.equals(reason, that.reason) &&
         Objects.equals(nodeId, that.nodeId) &&
         status.getStatus() == that.status.getStatus();
}


Miscellaneous

V6006: The object was created but it is not being used. The throw keyword could be missing.

JdbcConnection.java(88):

@Override public void setAutoCommit(boolean autoCommit) throws SQLException {
  checkOpen();
  if (!autoCommit) {
    new SQLFeatureNotSupportedException(....); 
  }
}


The bug is obvious and requires no explanation. A developer created an exception but didn't throw it anywhere else. Such an anonymous exception is created successfully, as well as, most importantly, will be removed seamlessly. The reason is the lack of the throw operator.

V6003: The use of 'if (A) {....} else if (A) {....}' pattern was detected. There is a probability of logical error presence.

MockScriptEngine.java(94), MockScriptEngine.java(105):

@Override public <T> T compile(....) {
  ....
  if (context.instanceClazz.equals(FieldScript.class)) {
    ....
  } else if (context.instanceClazz.equals(FieldScript.class)) {
    ....
  } else if(context.instanceClazz.equals(TermsSetQueryScript.class)) {
    ....
  } else if (context.instanceClazz.equals(NumberSortScript.class))
    ....
}


In multiple if-else one of the conditions is repeated twice, the situation requires competent code review.

V6039: There are two  if statements with identical conditional expressions. The first if statement contains a method return. This means that the second if statement is senseless.

SearchAfterBuilder.java(94), SearchAfterBuilder.java(93):

public SearchAfterBuilder setSortValues(Object[] values) {
  ....
  for (int i = 0; i < values.length; i++) {
    if (values[i] == null) continue;
    if (values[i] instanceof String) continue;
    if (values[i] instanceof Text) continue;
    if (values[i] instanceof Long) continue;
    if (values[i] instanceof Integer) continue;
    if (values[i] instanceof Short) continue;
    if (values[i] instanceof Byte) continue;
    if (values[i] instanceof Double) continue;
    if (values[i] instanceof Float) continue;
    if (values[i] instanceof Boolean) continue; // <=
    if (values[i] instanceof Boolean) continue; // <=
    throw new IllegalArgumentException(....);
  }
  ....
}


The same condition is used twice in a row. Is the second condition superfluous or should another type be used instead of Boolean?

V6009: Function substring receives an odd arguments. The queryStringIndex + 1 argument should not be greater than queryStringLength.

LoggingAuditTrail.java(660):

LogEntryBuilder withRestUriAndMethod(RestRequest request) {
  final int queryStringIndex = request.uri().indexOf('?');
  int queryStringLength = request.uri().indexOf('#');
  if (queryStringLength < 0) {
    queryStringLength = request.uri().length();
  }
  if (queryStringIndex < 0) {
    logEntry.with(....);
  } else {
    logEntry.with(....);
  }
  if (queryStringIndex > -1) {
    logEntry.with(....,
                  request.uri().substring(queryStringIndex + 1, // <=
                                          queryStringLength)); // <=
  }
  ....
}


Let's consider the erroneous scenario which may cause the exception  StringIndexOutOfBoundsException. The exception will occur when request.uri() returns a string which contains the character  # before?. There are no checks in the method, so in case if it happens, the trouble is brewing. Perhaps, this will never happen due to various checks of the object outside the method, but setting hopes on this is not the best idea.

Conclusion

For many years, PVS-Studio has been helping to find defects in code of commercial and free open-source projects. Just recently, Java has joined the list of supported languages for analysis. Elasticsearch became one of the first tests for our newcomer. We hope that this check will be useful for the project and interesting for readers.

PVS-Studio for Java needs new challenges, new users, active feedback and clients in order to quickly adapt to the new world! So, I invite you to download it and try our analyzer on your work project right away!

Topics:
java ,elasticseach ,open source ,static code analysis ,static code analysis tool ,static code analyzer ,pvs-studio

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}