Infinity Is a Bad Timeout

Many libraries wrap some external communication, be it a REST-like API, a message queue, a database, a mail server, or something else. Any such communication needs timeouts: for connecting, reading, writing, or idling. Sadly, many libraries set their default timeouts to “0” or “-1”, which means “infinity.”

And that is a useless and even harmful default. There isn’t a practical use case where you’d want to hang forever waiting for a resource, and there are plenty of situations where that can happen, e.g., when the other end gets stuck. In the past three months, I ran into two libraries with a default timeout of “infinity”, and that eventually led to production problems because we had forgotten to configure them properly. Sometimes you don’t even see the problem until a thread pool gets exhausted.

So, I have a request to API/library designers (as I’ve done before, against property maps and against encodings other than UTF-8): never have “infinity” as a default timeout. If you do, your library will cause lots of production issues. Also note that sometimes it’s the underlying HTTP client (or Socket) that doesn’t have a reasonable default; it’s still your job to fix that when wrapping it.
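To make the "fix it when wrapping" point concrete, here is a minimal sketch using plain java.net.Socket, whose read timeout (SO_TIMEOUT) defaults to 0, meaning reads block indefinitely. The TimedSockets class and its method names are hypothetical, made up for this illustration; only the Socket calls are real JDK API.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.time.Duration;

// Hypothetical helper (not from any real library): a wrapper that refuses
// to hand out a socket unless both timeouts are finite and positive.
final class TimedSockets {
    private TimedSockets() {}

    /** Opens a socket with an explicit connect timeout and read timeout. */
    static Socket open(String host, int port, Duration connectTimeout, Duration readTimeout)
            throws IOException {
        int connectMillis = toPositiveMillis(connectTimeout, "connectTimeout");
        int readMillis = toPositiveMillis(readTimeout, "readTimeout");

        Socket socket = new Socket(); // unconnected; SO_TIMEOUT is still 0 here
        try {
            // A connect timeout of 0 would be interpreted as infinite.
            socket.connect(new InetSocketAddress(host, port), connectMillis);
            // After this call, a blocked read() throws SocketTimeoutException.
            socket.setSoTimeout(readMillis);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /** Rejects null, zero, and negative durations; never returns 0. */
    static int toPositiveMillis(Duration timeout, String name) {
        if (timeout == null || timeout.isZero() || timeout.isNegative()) {
            throw new IllegalArgumentException(name + " must be a positive duration");
        }
        // Clamp sub-millisecond values up to 1 ms, because 0 means "infinite".
        long millis = Math.max(1L, timeout.toMillis());
        return (int) Math.min(Integer.MAX_VALUE, millis);
    }
}
```

The point of the design is that the wrapper has no overload without timeouts, so the infinite default of the underlying socket never leaks out to the caller.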

What default should you provide? A reasonable one. Five seconds, maybe? You may (rightly) say you don’t want to impose an arbitrary timeout on your users. In that case, I have a better proposal:

Explicitly require a timeout for building your “client” (these libraries are most often clients for some external system), for example Client.create(url, credentials, timeout), and fail if no timeout is provided. That makes the users of the client actively consider what a good timeout is for their use case, without imposing anything and, most importantly, without risking stuck connections in production. Additionally, you can still offer them a “default” option while making them choose it explicitly. For example:

Client client = ClientBuilder.create(url)
// OR
Client client = ClientBuilder.create(url)

The builder above should require “timeouts” to be set and should fail if neither of the two methods was invoked. Even if you don’t provide these options, at least have a good way of specifying timeouts. Some libraries require reflection to set the timeout of their underlying client.
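A builder that enforces this rule could look roughly like the sketch below. Everything in it is an assumption made for illustration: the method names withTimeout and withDefaultTimeout, the single Duration instead of separate connect/read timeouts, the omission of credentials, and the five-second default (taken from the suggestion earlier in this post).

```java
import java.time.Duration;
import java.util.Objects;

// Illustrative client: it only stores what the builder gives it.
final class Client {
    private final String url;
    private final Duration timeout;

    Client(String url, Duration timeout) {
        this.url = url;
        this.timeout = timeout;
    }

    String url() { return url; }

    Duration timeout() { return timeout; }
}

// Illustrative builder: build() fails unless the caller made a timeout choice.
final class ClientBuilder {
    // Assumed default; the caller still has to opt in to it explicitly.
    static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(5);

    private final String url;
    private Duration timeout; // stays null until one of the two methods is called

    private ClientBuilder(String url) {
        this.url = Objects.requireNonNull(url, "url");
    }

    static ClientBuilder create(String url) {
        return new ClientBuilder(url);
    }

    /** Explicit choice: the caller decides what a good timeout is. */
    ClientBuilder withTimeout(Duration timeout) {
        if (timeout == null || timeout.isZero() || timeout.isNegative()) {
            throw new IllegalArgumentException("timeout must be a positive duration");
        }
        this.timeout = timeout;
        return this;
    }

    /** Explicit choice: the caller knowingly accepts the library default. */
    ClientBuilder withDefaultTimeout() {
        this.timeout = DEFAULT_TIMEOUT;
        return this;
    }

    Client build() {
        if (timeout == null) {
            throw new IllegalStateException(
                    "No timeout chosen: call withTimeout(...) or withDefaultTimeout()");
        }
        return new Client(url, timeout);
    }
}
```

With this shape, ClientBuilder.create(url).build() throws immediately at construction time, which is far cheaper to discover than a stuck connection in production.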

I believe this is one of those issues that looks tiny but causes a lot of problems in the real world. It can (and should) be solved by the library/client designers. But since that isn’t always the case, we must make sure that timeouts are configured every time we use a third-party library.
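As one example of what "configuring timeouts every time" looks like on the calling side, here is a small sketch for the JDK's own java.net.http.HttpClient (Java 11+), which sets no connect timeout and no request timeout unless you ask for them. The HttpTimeouts helper class and the URL in the usage note are hypothetical; the builder methods connectTimeout and timeout are the real JDK ones.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical helper: funnels all client and request creation through
// methods that cannot be called without a timeout.
final class HttpTimeouts {
    private HttpTimeouts() {}

    /** Client with a bound on how long establishing a connection may take. */
    static HttpClient newClient(Duration connectTimeout) {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout) // rejects zero or negative durations
                .build();
    }

    /** GET request with a bound on how long to wait for the response. */
    static HttpRequest get(String url, Duration requestTimeout) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(requestTimeout) // rejects zero or negative durations
                .GET()
                .build();
    }
}
```

A call site would then read something like HttpTimeouts.newClient(Duration.ofSeconds(2)).send(HttpTimeouts.get("https://api.example.com/health", Duration.ofSeconds(5)), HttpResponse.BodyHandlers.ofString()), with the specific durations picked for the use case at hand.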

