Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

HttpURLConnection vs. the Principle of Least Astonishment

DZone's Guide to

HttpURLConnection vs. the Principle of Least Astonishment

If you're a developer, chances are you are at least familiar with the Principle of Least Astonishment, or PoLA. Check out how PoLA relates to HttpURLConnection in this fascinating debugging exploration.

· Integration Zone
Free Resource

Today’s data climate is fast-paced and it’s not slowing down. Here’s why your current integration solution is not enough. Brought to you in partnership with Liaison Technologies.

If you are a developer like me you probably heard of 'PoLA' or the Principle of Least Astonishment. The principle is fairly simple. Never surprise a user. Or in this case even more important, never surprise a developer. Unfortunately, I got more than I bargained for last week when one our web service clients were generating rogue requests.

Rogue requests you say? Yes, like in; we have no freaking idea where they're coming from. It's one of those moments where managers start running around in circles, panicking and yelling "we're hacked!!" or "please someone turn off Skynet!!"

Anyway, first a little bit of background. In our project, we have automatic activity logging which is triggered by aspects when a process starts. This includes the web service client in question and the processing on the endpoint as both of them are part of our system. So at some point, we noticed that before the response was sent by the endpoint, a second call from the same web service client came in. This was unexpected as the client is single threaded and no other clients were in the picture. Review and tests pointed out that it was simplyimpossible©® for our client to simultaneous generate another request while the first one was still in progress.

After a long day of debugging and going through too many logs, it turned out that the client was in fact disconnected before processing ended on the endpoint. So their requests weren't simultaneous after all. But why did that took us a day to find out? Did we play starcraft2 again instead of working?

Well no, it all started with the HTTP read timeout on the endpoint's container being unexpectedly set lower than we thought. The logging on the endpoint indicated that a reply was generated but the client was actually disconnected before that event because of the read timeout. This was of course not logged by our aspects on the endpoint side as this is decided on a lower level (the HTTP stack) rather than our endpoint itself.

Ok, true, I hear you say, but what about the web service client log? The web service client should have thrown a "ReadTimeoutException" or something similar and that should have been written to the log, right? Well, true, but it didn't. And now it comes, as it turned out the real surprises is inside HttpURLConnection (more specifically the default Oracle internal handler sun.net.www.protocol.http.HttpURLConnection)

Did you know that this default impl of HttpURLConnection has a special "feature" which does HTTP retries in "certain situations"? Yes? No? Well, I for once didn't. So what happened was that the timeout exception was indeed triggered by the web service client but silently caught by HttpURLConnection itself by which it decided to do an internal retry on its own. This means that the read() method called the web service on HttpURLConnection remains blocked like you are still waiting for the response to the first request. But internally HttpURLConnection is retrying the request more than once and thus generating multiple connections. This explained why it took us so long to discover this as the second call was never logged by our code as it is in fact never triggered by our code but by HttpURLConnection internally.

Here some code illustrating this:

import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.util.concurrent.Executors;

import com.sun.net.httpserver.HttpServer;

/**
 * Created by koen on 30/01/16.
 */
public class TestMe {

 public static void main(String[] args) throws Exception {
  startHttpd();
  HttpURLConnection httpURLConnection = (HttpURLConnection) new URL("http://localhost:8080/").openConnection();

  if (!(httpURLConnection instanceof sun.net.www.protocol.http.HttpURLConnection)) {
   throw new IllegalStateException("Well it should really be sun.net.www.protocol.http.HttpURLConnection. "
     + "Check if no library registered it's impl using URL.setURLStreamHandlerFactory()");
  }
  httpURLConnection.setRequestMethod("POST");
  httpURLConnection.connect();
  System.out.println("Reading from stream...");
  httpURLConnection.getInputStream().read();
  System.out.println("Done");
 }

 public static void startHttpd() throws Exception {
  InetSocketAddress addr = new InetSocketAddress(8080);
  HttpServer server = HttpServer.create(addr, 0);

  server.createContext("/", httpExchange -> {
   System.out.println("------> Httpd got request. Request method was:" + httpExchange.getRequestMethod() + " Throwing timeout exception");
   if (true) {
    throw new SocketTimeoutException();
   }
  });
  server.setExecutor(Executors.newCachedThreadPool());
  server.start();
  System.out.println("Open for business.");
 }
}

If you run this, you'll get:

Open for business.
Reading from stream...
------> Httpd got request. Request method was:POST Throwing timeout exception
------> Httpd got request. Request method was:POST Throwing timeout exception
Exception in thread "main" java.net.SocketException: Unexpected end of file from server
 at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:792)
 ...

Notice that our httpd got two calls while we only did one? If we re-run this, but, this time, set the magic property-Dsun.net.http.retryPost=false we get:

------> Httpd got request. Request method was:POST Throwing timeout exception
Exception in thread "main" java.net.SocketException: Unexpected end of file from server
 at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:792)
 ...

Putting all this aside, who the hell builds a retry mechanism that isn't really documented nor configurable? Why am I after 15years of Java development (and a network fetish) not aware of this feature? But moreover, why the hell is it doing retries on a freaking HTTP freaking POST? PoLA breach detected!

As you probably guessed by now it's a bug (http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6382788). Not the retry mechanism, of course, that's just crap. The bug is that it also happens for a POST (which is by default not idempotent per HTTP RFC). But don't worry, Bill fixed that bug a long time ago. Bill fixed it by introducing a toggle. Bill learned about backward compatibility. Bill decided it's better to leave the toggle 'on' by default because that would make it bug-backward-compatible. Bill smiles. He can already see the faces of surprised developers around the globe running into this. Please don't be like Bill?

So after some exciting days of debugging the solution was kinda lame. Merely setting the property to false fixed it. Anyway, it surprised me enough to write a blog entry on it, so there you have it.

For the sake of completeness: if you run this code inside a container your results may vary. Depending on libraries used and/or your container, other implementations could have been registered which are then used rather than Oracle's internal one (see URL.setURLStreamHandlerFactory()). So now you might be thinking; why is that guy using the default HttpURLConnection then? Does he also drive to work in a wooden soapbox and cut's his grass with scissors? He could better start a fire and use smoke signals instead! Well, I can't blame you for thinking that, but we never deliberate decided on doings this. Our web service in question is a bit special and uses SAAJ SOAPConnectionFactory which on its turn uses HttpURLConnection, which reverts to the default implementation if no one registered another one.

If you use a more managed WS implementation (like Spring WS, CXF or JAX-WS impls) they will probably use something completely like Apache HTTP client. And of course, if you, yourself would make HTTP connections you would opt for the same. Yes, I'm promoting Apache commons HTTP client, that little critter which changes public API more often than an average fashionista changes shoes. But don't worry, I'll stop ranting now.

Is iPaaS solving the right problems? Not knowing the fundamental difference between iPaaS and iPaaS+ could cost you down the road. Brought to you in partnership with Liaison Technologies.

Topics:
java ,rest ,code

Published at DZone with permission of Koen Serneels, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}