Over a million developers have joined DZone.

Timeouts, TCP and Streaming Operations

· Performance Zone

Evolve your approach to Application Performance Monitoring by adopting five best practices that are outlined and explored in this e-book, brought to you in partnership with BMC.

We got a bug report in the RavenDB mailing list that was interesting to figure out.  The code in question was:

foreach(var product in GetAllProducts(session)) // GetAllProducts is implemented using streaming
{
  ++i;
  if (i > 1000)
  {
    i = 0;
    Thread.Sleep(1000);
  }
}

This code would cause a timeout error to occur after a while. The question is why? We can assume that this code is running in a console application, and it can take as long as it wants to process things.

And the server is not impacted from what the client is doing, so why do we have a timeout error here? The quick answer is that we are filling in the buffers.

GetAllProducts is using the RavenDB streaming API, which push the results of the query to the client as soon as we have anything. That lets us parallelize work on both server and client, and avoid having to hold everything in memory.

However, if the client isn’t processing things fast enough, we run into an interesting problem. The server is sending the data to the client over TCP. The client machine will get the results, buffer them and send them to the client. The client will read them from the TCP buffers, then do some work (in this case, just sleeping). Because the rate in which the client is processing items is much smaller than the rate in which we are sending them, the TCP buffers become full.

At this point, the client machine is going to start dropping TCP packets. It doesn’t have any more room to put the data in, and the server will send it again, anyway. And that is what the server is doing, assuming that we have a packet loss over the network. However, that will only hold up for a while, because if the client isn’t going to recover quickly, the server will decide that it is down, and close the TCP connection.

At this point, there isn’t any more data from the server, so the client will catch up with the buffered data, and then wait for the server to send more data. That isn’t going to happen, because the server already consider the connection lost. And eventually the client will time out with an error.

A streaming operation require us to process the results quickly enough to not jam the network.

RavenDB also have the notion of subscriptions. With those, we require explicit client confirmation from the client to send the next batch, so a a slow client isn’t going to cause issues.


Learn tips and best practices for optimizing your capacity management strategy with the Market Guide for Capacity Management, brought to you in partnership with BMC.

Topics:

Published at DZone with permission of Ayende Rahien, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

The best of DZone straight to your inbox.

SEE AN EXAMPLE
Please provide a valid email address.

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.
Subscribe

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}