Briefly about Elasticsearch

The whole story of how things were

Downloaded Elasticsearch from GitHub;

Followed the instructions how to run the Java analyzer and ran the analysis;

Received the analyzer's report, delved into it and pointed out interesting cases.

Watch out! Possible NullPointerException

private static PathTrie<RequestHandler> defaultHandlers(....) { .... handlers.insert("POST /batch/storage/v1", (request) -> { .... // Reads the body line = reader.readLine(); byte[] batchedBody = new byte[0]; if ((line != null) || (line.startsWith("--" + boundary) == false)) // <= { batchedBody = line.getBytes(StandardCharsets.UTF_8); } .... }); .... }

void start( ResumeFollowAction.Request request, String clusterNameAlias, IndexMetaData leaderIndexMetadata, IndexMetaData followIndexMetadata, ....) throws IOException { MapperService mapperService = followIndexMetadata != null // <= ? .... : null; validate(request, leaderIndexMetadata, followIndexMetadata, // <= leaderIndexHistoryUUIDs, mapperService); .... }

static void validate( final ResumeFollowAction.Request request, final IndexMetaData leaderIndex, final IndexMetaData followIndex, // <= ....) { .... Map<String, String> ccrIndexMetadata = followIndex.getCustomData(....); // <= if (ccrIndexMetadata == null) { throw new IllegalArgumentException(....); } .... }}

private void buildRow(Table table, boolean fullId, boolean detailed, DiscoveryNodes discoveryNodes, TaskInfo taskInfo) { .... DiscoveryNode node = discoveryNodes.get(nodeId); .... // Node information. Note that the node may be null because it has // left the cluster between when we got this response and now. table.addCell(fullId ? nodeId : Strings.substring(nodeId, 0, 4)); table.addCell(node == null ? "-" : node.getHostAddress()); table.addCell(node.getAddress().address().getPort()); table.addCell(node == null ? "-" : node.getName()); table.addCell(node == null ? "-" : node.getVersion().toString()); .... }

private void printStackTrace(Consumer<String> consumer) { Throwable originalCause = getCause(); Throwable cause = originalCause; if (cause instanceof CreationException) { cause = getFirstGuiceCause((CreationException)cause); } String message = cause.toString(); // <= consumer.accept(message); if (cause != null) { // <= // walk to the root cause while (cause.getCause() != null) { cause = cause.getCause(); } .... } .... }

Meaningless conditions

private static int findNextWhiteSpace(int i, String s) { for (; i < s.length() && (s.charAt(i) != ' ' || s.charAt(i) != '\t'); i++) { // intentionally empty } return i; }

private static int skipWhiteSpace(int i, String s) { for (; i < s.length() && (s.charAt(i) == ' ' || s.charAt(i) == '\t'); i++) { // intentionally empty } return i; }

private static byte[] generateOpenSslKey(char[] password, byte[] salt, int keyLength) { .... int copied = 0; int remaining; while (copied < keyLength) { remaining = keyLength - copied; .... copied += bytesToCopy; if (remaining == 0) { // <= break; } .... } .... }

ActiveDirectorySessionFactory(RealmConfig config, SSLService sslService, ThreadPool threadPool) throws LDAPException { super(...., () -> { if (....) { final String healthCheckDn = ....; if (healthCheckDn.isEmpty() && healthCheckDn.indexOf('=') > 0) { return healthCheckDn; } } return ....; }, ....); .... }

Little method can go a long way

private static byte char64(char x) { if ((int)x < 0 || (int)x > index_64.length) return -1; return index_64[(int)x]; }

V6007 Expression '(int)x < 0' is always false. BCrypt.java(429) V6025 Possibly index '(int) x' is out of bounds. BCrypt.java(431)

private static final byte index_64[] = { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, -1, -1, -1, -1, -1, -1, -1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, -1, -1, -1, -1, -1, -1, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, -1, -1, -1, -1, -1 };

Comparisons, which did their best

.... private final String table; private final String name; private final String esType; private final Integer displaySize; .... @Override public boolean equals(Object o) { if (this == o) { return true; } if (o == null || getClass() != o.getClass()) { return false; } ColumnInfo that = (ColumnInfo) o; return displaySize == that.displaySize && // <= Objects.equals(table, that.table) && Objects.equals(name, that.name) && Objects.equals(esType, that.esType); }

.... private final TimeValue queryDelay; private final TimeValue frequency; .... private final Integer scrollSize; .... boolean isNoop(DatafeedConfig datafeed) { return (frequency == null || Objects.equals(frequency, datafeed.getFrequency())) && (queryDelay == null || Objects.equals(queryDelay, datafeed.getQueryDelay())) && (scrollSize == null || Objects.equals(scrollSize, datafeed.getQueryDelay())) // <= && ....) }

@Override public boolean equals(Object obj) { .... return index.equals(other.index) && type.equals(other.type) && Objects.equals(id, other.id) && docVersion == other.docVersion && found == other.found && tookInMillis == tookInMillis // <= && Objects.equals(termVectorList, other.termVectorList); }

@Override public boolean equals(Object o) { .... return shardId.id() == that.shardId.id() && shardId.getIndexName().equals(shardId.getIndexName()) && // <= Objects.equals(reason, that.reason) && Objects.equals(nodeId, that.nodeId) && status.getStatus() == that.status.getStatus(); }

Miscellaneous

@Override public void setAutoCommit(boolean autoCommit) throws SQLException { checkOpen(); if (!autoCommit) { new SQLFeatureNotSupportedException(....); } }

@Override public <T> T compile(....) { .... if (context.instanceClazz.equals(FieldScript.class)) { .... } else if (context.instanceClazz.equals(FieldScript.class)) { .... } else if(context.instanceClazz.equals(TermsSetQueryScript.class)) { .... } else if (context.instanceClazz.equals(NumberSortScript.class)) .... }

public SearchAfterBuilder setSortValues(Object[] values) { .... for (int i = 0; i < values.length; i++) { if (values[i] == null) continue; if (values[i] instanceof String) continue; if (values[i] instanceof Text) continue; if (values[i] instanceof Long) continue; if (values[i] instanceof Integer) continue; if (values[i] instanceof Short) continue; if (values[i] instanceof Byte) continue; if (values[i] instanceof Double) continue; if (values[i] instanceof Float) continue; if (values[i] instanceof Boolean) continue; // <= if (values[i] instanceof Boolean) continue; // <= throw new IllegalArgumentException(....); } .... }

LogEntryBuilder withRestUriAndMethod(RestRequest request) { final int queryStringIndex = request.uri().indexOf('?'); int queryStringLength = request.uri().indexOf('#'); if (queryStringLength < 0) { queryStringLength = request.uri().length(); } if (queryStringIndex < 0) { logEntry.with(....); } else { logEntry.with(....); } if (queryStringIndex > -1) { logEntry.with(...., request.uri().substring(queryStringIndex + 1,// <= queryStringLength)); // <= } .... }

Conclusion

The PVS-Studio team has been keeping the blog about the checks of open-source projects by the same-name static code analyzer for many years. To date, more than 300 projects have been checked, the base of errors contains more than 12000 cases. Initially the analyzer was implemented for checking C and C++ code, support of C# was added later. Therefore, from all checked projects the majority (> 80%) accounts for C and C++. Quite recently Java was added to the list of supported languages, which means that there is now a whole new open world for PVS-Studio, so it's time to complement the base with errors from Java projects.The Java world is vast and varied, so one doesn't even know where to look first when choosing a project to test the new analyzer. Ultimately, the choice fell on the full-text search and analytical engine Elasticsearch. It is quite a successful project, and it's even especially pleasant to find errors in significant projects. So, what defects did PVS-Studio for Java manage to detect? Further talk will be right about the results of the check. Elasticsearch is a distributed, RESTful search and analytics engine with open source code, capable of solving a growing number of use cases. It enables you to store large amounts of data, carry out a quick search and analytics (almost in real time mode). Typically, it is used as the underlying mechanism/technology, which provides applications with complex functions and search requirements.Among the major sites using Elasticsearch there are Wikimedia, StumbleUpon, Quora, Foursquare, SoundCloud, GitHub, Netflix, Amazon, IBM, Qbox.Fine, enough of introduction.There were no problems with the check itself. The sequence of actions is rather simple and didn't take much time:Now let's move on to the main point. V6008 Null dereference of 'line'. GoogleCloudStorageFixture.java(451)The error in this code fragment is that if the string from the buffer wasn't read, the call of themethod in the condition of thestatement will result in throwing theexception. Most likely, this is a typo and when writing a condition developers meant theoperator instead of V6008 Potential null dereference of 'followIndexMetadata'. TransportResumeFollowAction.java(171), TransportResumeFollowAction.java(170), TransportResumeFollowAction.java(194)Another warning from the V6008 diagnostic. The objectkindled my interest. Themethod accepts several arguments as input, our suspect is right among them. After that, based on checking our object fora new object is created, which is involved in further method logic. Check forshows us thatcan still come from the outside as a null object. Well, let's look further.Then multiple arguments are pushed to themethod (again, there is our considered object among them). If we look at the implementation of the validation method, it all falls into place. Our potential null object is passed to themethod as a third argument, where it unconditionally gets dereferenced. Potentialas a result.We don't know for sure with what arguments themethod is called. It is quite possible that all arguments are checked somewhere before calling the method and no null object dereference will happen. Anyway, we should admit that such code implementation still looks unreliable and deserves attention. V6060 The 'node' reference was utilized before it was verified against null. RestTasksAction.java(152), RestTasksAction.java(151)Another diagnostic rule with the same problem triggered here.. The rule cries out for developers: «Guys, what are you doing? How could you do that? Oh, it is awful! Why do you first use the object and check if forin the next line? Here is how null object dereference happens. Alas, even a developer's comment didn't help. V6060 The 'cause' reference was utilized before it was verified against null. StartupException.java(76), StartupException.java(73)In this case we should take into account that themethod of theclass might return. The above problem repeats further, so its explanation is needless. V6007 Expression 's.charAt(i) != '\t'' is always true. Cron.java(1223)The considered function returns the index of the first space character, starting from theindex. What's wrong? We have the analyzer warning thatis always true, which means the expressionwill always be true as well. Is this true? I think, you can easily make sure of this, by substituting any character.As a result, this method will always return the index, equal to, which is wrong. I would venture to suggest that the above method is to blame here:Developers implemented the method, then copied and got our erroneous methodhaving made some edits. They kept fixing and fixing the method but haven't fixed it up. To get this right, one should use theoperator instead of V6007 Expression 'remaining == 0' is always false. PemUtils.java(439)From the condition of theloop we can note, thatwill always be less than. Hence, it is pointless to compare the remaining variable with 0, and it will always be false, at which point the loop won't exit by a condition. Remove the code or reconsider the behavior logic? I think, only developers will be able to put all the dots over the i. V6007 Expression 'healthCheckDn.indexOf('=') > 0' is always false. ActiveDirectorySessionFactory.java(73)Meaningless expression again. According to the condition, thestring has to be both empty and contain the character '=' not in the first position, so that lambda expression would return the string variable. Phew, that's all then! As you probably understood, it is impossible. We're not going to go deep in the code, let's leave it to developers' discretion.I cited only some of the erroneous examples, but beyond that there was plenty of the V6007 diagnostic triggerings, which should be regarded one by one, making relevant conclusions.So here we have a teeny-tiny method of several lines. But bugs are on the watch! Analysis of this method gave the following result:Issue N1. The expressionis always false (Yes, V6007 again). Thevariable cannot be negative, as it is of thetype. Thetype is an unsigned integer. It cannot be called a real error, but, nonetheless, the check is redundant and can be removed.Issue N2. Possible array index out of bounds, resulting in theexceptionThen it begs the question, which lies on the surface: „Wait, how about the index check?“So, we have a fixed-size array of 128 elements:When themethod receives thevariable, the index validity gets checked. Where is the flaw? Why is array index out of bounds still possible?The check (int)x > index_64.length is not quite correct. If the char64 method receives x with the value 128, the check won't protect against ArrayIndexOutOfBoundsException. Maybe this never happens in reality. However, the check is written incorrectly, and one has to change»greater than" operator (>) with «greater than or equal to (> =). V6013 Numbers 'displaySize' and 'that.displaySize' are compared by reference. Possibly an equality comparison was intended. ColumnInfo.java(122)What is incorrect here is thatobjects of thetype are compared using theoperator, that is, by reference. It's quite possible thatobjects, whosefields have different references but the same content, will be compared. In this case, comparison will give us the negative result, when we expected to get true.I would venture to guess that such a comparison could be the result of a failed refactoring and initially thefield was of thetype. V6058 The 'equals' function compares objects of incompatible types: Integer, TimeValue. DatafeedUpdate.java(375)Incorrect objects comparison again. This time objects with incompatible types are compared (and). The result of this comparison is obvious, and it is always false. You can see that class fields are compared with each other, only the names of the fields have to be changed. Here is the point — a developer decided to speed up the process by using the copy-paste and got a bug into the bargain. The class implements a getter for thefield, so to fix the error one should use the methodLet's look at a couple of erroneous examples without any explanation. Problems are quite obvious. V6001 There are identical sub-expressions 'tookInMillis' to the left and to the right of the '==' operator. TermVectorsResponse.java(152) V6009 Function 'equals' receives an odd argument. An object 'shardId.getIndexName()' is used as an argument to its own method. SnapshotShardFailure.java(208) V6006 The object was created but it is not being used. The 'throw' keyword could be missing. JdbcConnection.java(88)The bug is obvious and requires no explanation. A developer created an exception, but didn't throw it anywhere else. Such an anonymous exception is created successfully, as well as, most importantly, will be removed seamlessly. The reason is the lack of theoperator. V6003 The use of 'if (A) {....} else if (A) {....}' pattern was detected. There is a probability of logical error presence. MockScriptEngine.java(94), MockScriptEngine.java(105)In multipleone of the conditions is repeated twice, so the situation requires competent code review. V6039 There are two 'if' statements with identical conditional expressions. The first 'if' statement contains method return. This means that the second 'if' statement is senseless. SearchAfterBuilder.java(94), SearchAfterBuilder.java(93)The same condition is used twice in a row. Is the second condition superfluous or should another type be used instead of V6009 Function 'substring' receives an odd arguments. The 'queryStringIndex + 1' argument should not be greater than 'queryStringLength'. LoggingAuditTrail.java(660)Let's consider here the erroneous scenario which may cause the exception. The exception will occur whenreturns a string which contains the character '#' before '?'. There are no checks in the method, so in case if it happens, the trouble is brewing. Perhaps, this will never happen due to various checks of the object outside the method, but setting hopes on this is not the best idea.For many years PVS-Studio has been helping to find defects in code of commercial and free open-source projects. Just recently Java has joined the list of supported languages for analysis. Elasticsearch became one of the first tests for our newcomer. We hope that this check will be useful for the project and interesting for readers.PVS-Studio for Java needs new challenges, new users, active feedback and clients in order to quickly adapt to the new world :). So I invite you to download and try our analyzer on your work project right away!