Threads Stuck in java.net.SocketInputStream.socketRead0 API
Need help resolving stuck threads in the java.net.SocketInputStream.socketRead0 API?
What does the java.net.SocketInputStream.socketRead0() API do? Why does it show up so frequently in thread dumps? Why is it reported by thread dump analysis tools like fastThread.io? Is it something that I need to be concerned about? What are the potential solutions to this problem? Let's find answers to these questions.
What Does the SocketInputStream.socketRead0() API Do?
It's always easier to remember new concepts through real-life analogies. Suppose you are calling your wife or girlfriend on the phone. Once the call is connected, if she is in a good mood, you will get the response right away: "Hello, Honey, how are you?" If the call is connected while she is in the middle of something (say she is in her office, picking up the kids, or at the gym), there might be a delay in her response: "Hello, Honey..." If the call is connected while she is upset, you might get a response only after several seconds or minutes. The time you spend waiting, from the moment the call is connected until the moment you hang up, is basically the socketRead0() API. (Thanks to Douglas Spath from IBM for giving this beautiful example to explain the socketRead0() API.)
Your application might be interfacing with multiple remote applications through various protocols, such as SOAP, REST, HTTP, HTTPS, JDBC, and RMI. All of these connections go through the JDK java.net layer to perform the lower-level TCP/IP socket operations. In this layer, the SocketInputStream.socketRead0() API is used to read and receive data from the remote application. Some remote applications may respond immediately, some might take time to respond, and some may not respond at all. Until your application has read the response data completely, the application thread will be stuck in this API.
Sample Thread Dump Stacktrace
Below are some stacktrace examples of threads that are stuck in the SocketInputStream.socketRead0 API. Notice that, irrespective of the protocol, the threads get stuck in the same SocketInputStream.socketRead0 API.
"RMI TCP Connection(2)-192.xxx.xx.xx" daemon prio=6 tid=0x000000000a3e8800 nid=0x158e50 runnable [0x000000000adbe000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.net.SocketInputStream.read(Unknown Source)
    at java.io.BufferedInputStream.fill(Unknown Source)
    at java.io.BufferedInputStream.read(Unknown Source)
    - locked (0x00000007ad784010) (a java.io.BufferedInputStream)
    at java.io.FilterInputStream.read(Unknown Source)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(Unknown Source)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(Unknown Source)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Fig: RMI thread stuck in SocketInputStream.socketRead0
"Thread-18" id=48 idx=0x9c tid=11696 prio=5 alive, in native, daemon
    at jrockit/net/SocketNativeIO.readBytesPinned(Ljava/io/FileDescriptor;[BIII)I(Native Method)
    at jrockit/net/SocketNativeIO.socketRead(SocketNativeIO.java:32)
    at java/net/SocketInputStream.socketRead0(Ljava/io/FileDescriptor;[BIII)I(SocketInputStream.java)
    at java/net/SocketInputStream.read(SocketInputStream.java:129)
    at java/net/ManagedSocketInputStreamHighPerformanceNew.read(ManagedSocketInputStreamHighPerformanceNew.java:100)
    at java/net/SocketInputStream.read(SocketInputStream.java:182)
    at java/net/ManagedSocketInputStreamHighPerformanceNew.read(ManagedSocketInputStreamHighPerformanceNew.java:55)
    at oracle/ons/InputBuffer.getNextString(InputBuffer.java:137)
    at oracle/ons/ReceiverThread.run(ReceiverThread.java:295)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
Fig: Oracle Database connection stuck in SocketInputStream.socketRead0
"AMQP Connection 192.xx.xxx.xxx:5672" prio=5 RUNNABLE
    java.net.SocketInputStream.socketRead0(Native Method)
    java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    java.net.SocketInputStream.read(SocketInputStream.java:170)
    java.net.SocketInputStream.read(SocketInputStream.java:141)
    java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    java.io.DataInputStream.readUnsignedByte(DataInputStream.java:288)
    com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95)
    com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139)
    com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:536)
    java.lang.Thread.run(Thread.java:745)
Fig: RabbitMQ stuck in SocketInputStream.socketRead0
"Thread-2012" id=218 idx=0x09c tid=196 prio=10 alive, in native, daemon
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:140)
    at com.ibm.db2.jcc.t4.z.b(z.java:199)
    at com.ibm.db2.jcc.t4.z.c(z.java:289)
    at com.ibm.db2.jcc.t4.z.c(z.java:402)
    at com.ibm.db2.jcc.t4.z.v(z.java:1170)
    at com.ibm.db2.jcc.t4.cb.b(cb.java:40)
    at com.ibm.db2.jcc.t4.q.a(q.java:32)
    at com.ibm.db2.jcc.t4.sb.i(sb.java:135)
    at com.ibm.db2.jcc.am.yn.gb(yn.java:2066)
    at com.ibm.db2.jcc.am.zn.pc(zn.java:3446)
    at com.ibm.db2.jcc.am.zn.b(zn.java:4236)
    at com.ibm.db2.jcc.am.zn.fc(zn.java:2670)
    at com.ibm.db2.jcc.am.zn.execute(zn.java:2654)
    at com.ibm.ws.rsadapter.jdbc.WSJdbcPreparedStatement.execute(WSJdbcPreparedStatement.java:618)
    at com.mycompany.myapp.MyClass.executeDatabaseQuery(MyClass.java:123)
Fig: IBM DB2 statement execution stuck in SocketInputStream.socketRead0
If a thread gets stuck in the SocketInputStream.socketRead0 API and doesn't recover for a long period, the customer who originated the transaction will not see any response on their screen, which can puzzle and confuse them. If multiple threads get stuck in the SocketInputStream.socketRead0 API and don't recover for a long period, they tie up the application's threads and can pose a serious availability risk to your application.
Here, we outline a few potential solutions to address this problem:
1. Instrument timeout settings
1.1. JVM Network settings
1.4. Oracle JDBC
2. Validate Network connectivity
3. Work with remote application
4. Non-blocking HTTP client
1. Instrument Timeout Settings
Most applications don't set appropriate timeouts to recover from SocketInputStream.socketRead0, so they end up stuck in this API for a prolonged period. Setting an appropriate timeout is a great self-defense mechanism that every application should have. Here are a few timeout settings you can apply to your application as you see fit:
1.1. JVM Network Settings
You can pass these two powerful networking timeout properties, which apply globally to all protocol handlers that use java.net.URLConnection:
sun.net.client.defaultConnectTimeout specifies the timeout (in milliseconds) to establish the connection to the host. For example, for HTTP connections, it is the timeout when establishing the connection to the HTTP server. For FTP connections, it is the timeout when establishing the connection to the FTP server.
sun.net.client.defaultReadTimeout specifies the timeout (in milliseconds) when reading from the input stream once a connection to a resource is established.
More details about JVM network settings can be found here.
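The two properties above act as JVM-wide defaults (for example, passed as -Dsun.net.client.defaultReadTimeout=... at startup). The same limits can also be set per connection through URLConnection.setConnectTimeout() and setReadTimeout(). The following is a minimal sketch of the per-connection form; the class name, the helper names, and the "silent" local server (a listening socket that never answers) are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

class UrlConnectionTimeoutSketch {

    /** Returns true if the read timed out instead of blocking indefinitely. */
    static boolean readTimesOut(URL url, int connectMillis, int readMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(connectMillis); // per-connection counterpart of defaultConnectTimeout
        conn.setReadTimeout(readMillis);       // per-connection counterpart of defaultReadTimeout
        try {
            conn.getResponseCode();            // blocks on the socket read until data arrives or the timeout fires
            return false;
        } catch (SocketTimeoutException e) {
            return true;                       // the thread is released instead of staying stuck
        } finally {
            conn.disconnect();
        }
    }

    /** Hypothetical demo: the OS completes the TCP handshake, but nothing ever replies. */
    static boolean demoAgainstSilentServer() {
        try (ServerSocket silent = new ServerSocket(0)) {
            URL url = new URL("http://127.0.0.1:" + silent.getLocalPort() + "/");
            return readTimesOut(url, 1000, 300);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + demoAgainstSilentServer());
    }
}
```

Without the setReadTimeout() call (and without the global property), the same request would wait on the silent server indefinitely.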
If you are directly programming with Sockets, you may consider setting the timeout on the socket by invoking the setSoTimeout() API.
You pass the timeout value to this API in milliseconds. If the remote application doesn't respond within the specified timeout period, a java.net.SocketTimeoutException is thrown. This exception frees up the thread, allowing it to work on other calls. Note: if the timeout value is passed as 0, it is interpreted as an infinite timeout, meaning the thread will never time out.
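As an illustration, here is a minimal sketch of setSoTimeout() on a plain socket. The class and method names, the -2 sentinel value, and the silent local server used for the demo are assumptions for this example, not part of any library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class SoTimeoutSketch {

    /** Reads one byte; returns -2 if the read timed out (-1 still means end of stream). */
    static int readWithTimeout(String host, int port, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(timeoutMillis); // milliseconds; 0 would mean "wait forever"
            InputStream in = socket.getInputStream();
            try {
                return in.read();               // blocks until data arrives or the timeout expires
            } catch (SocketTimeoutException e) {
                return -2;                      // the thread is free to do other work now
            }
        }
    }

    /** Hypothetical demo: a listening socket that accepts the handshake but never sends data. */
    static int demoAgainstSilentServer() {
        try (ServerSocket silent = new ServerSocket(0)) {
            return readWithTimeout("127.0.0.1", silent.getLocalPort(), 300);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read result: " + demoAgainstSilentServer());
    }
}
```

The socket itself remains valid after a SocketTimeoutException, so real code has to decide whether to retry the read or close the connection.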
If you are using JDBC (Java Database Connectivity) to connect, you may consider setting the timeout value using the setQueryTimeout() API.
This API sets the number of seconds the JDBC driver will wait for a statement to execute. If the limit is exceeded, an SQLTimeoutException is thrown. The JDBC driver applies this limit to the execute(), executeQuery(), and executeUpdate() methods. By default, there is no limit on the amount of time allowed for a running statement to complete.
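A minimal sketch of how setQueryTimeout() might be applied is shown below. The class name, the helper method, and the -1 sentinel are hypothetical; the Connection is assumed to come from whatever driver or data source your application already uses.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.SQLTimeoutException;

class QueryTimeoutSketch {

    /** Runs the query and returns the row count, or -1 if the driver reported a timeout. */
    static int countRows(Connection conn, String sql, int timeoutSeconds) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setQueryTimeout(timeoutSeconds); // seconds, not milliseconds; 0 means no limit
            try (ResultSet rs = stmt.executeQuery()) {
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                return rows;
            }
        } catch (SQLTimeoutException e) {
            // The limit was exceeded: the thread is released instead of waiting on the socket.
            return -1;
        }
    }
}
```

How promptly the timeout takes effect depends on the driver, so it is worth testing against the actual database before relying on it.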
1.4. Oracle JDBC
If you are connecting to an Oracle database and seeing a lot of threads stuck in the SocketInputStream.socketRead0() API, you may consider passing the -Doracle.jdbc.ReadTimeout system property.
You need to pass this argument at application startup. The value must be specified in milliseconds.
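The same timeout can also be supplied per connection rather than JVM-wide. The sketch below assumes the Oracle thin driver, which documents oracle.jdbc.ReadTimeout as a connection property as well; the class and method names are hypothetical, and the snippet only builds the properties object so that it stays runnable without a database.

```java
import java.util.Properties;

class OracleReadTimeoutSketch {

    /** Builds connection properties that carry a socket read timeout in milliseconds. */
    static Properties connectionProperties(String user, String password, int readTimeoutMillis) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Assumption: the Oracle thin driver reads this key from the connection properties.
        props.setProperty("oracle.jdbc.ReadTimeout", String.valueOf(readTimeoutMillis));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(connectionProperties("app_user", "secret", 5000));
    }
}
```

With the Oracle driver on the classpath, the resulting object would be passed to DriverManager.getConnection(url, props) together with your own JDBC URL.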
If your application happens to be running on IBM WebSphere, you can consider setting the following properties:
a. An administrator can set the webSphereDefaultQueryTimeout data source custom property.
b. A second property, syncQueryTimeoutWithTransactionTimeout, can also be set as a data source custom property. With this set, WebSphere will calculate the time remaining before the transaction times out (if running within a global transaction) and set the query timeout to this value automatically.
You can also set the “readTimeout” property in the HTTP Transport Policy Set for the Web Service client or set “timeout” on the org.apache.axis2.context.MessageContext in the application code.
2. Validate Network Connectivity
Threads that don't recover from the SocketInputStream.socketRead0 API can also be caused by issues in network connectivity or load balancers. In the past, we have seen cases where a remote application did not issue the appropriate ACK or FIN packets. You might have to engage your network engineers or your cloud hosting provider's support team to troubleshoot the issue.
On your end, you may use TCP/IP tracing tools, such as Wireshark, to see the packets sent over the network between you and the remote application. This can help you narrow down whether the problem is on your side of the network or on the other side.
3. Work With Remote Application
Sometimes, transactions slow down because of performance problems in the remote application itself. In those circumstances, you need to make the remote application's team aware of the slowdown and work with them to fix the problem.
4. Non-Blocking HTTP Client
You can also consider using non-blocking HTTP client libraries like Grizzly or Netty, which do not tie up a thread in a blocking read while waiting for the response. However, this is more of a strategic solution, as it involves code changes and thorough testing.
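To illustrate the idea without pulling in a third-party library, the sketch below swaps in the JDK's built-in java.net.http.HttpClient (Java 11 and later) rather than Grizzly or Netty. It is only an approximation of the approach under that assumption; the class name, helper names, "TIMEOUT" marker, and silent local server are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

class AsyncHttpSketch {

    /** Sends the request without blocking the caller; yields "TIMEOUT" if no response arrives in time. */
    static CompletableFuture<String> fetch(HttpClient client, URI uri, Duration timeout) {
        HttpRequest request = HttpRequest.newBuilder(uri).timeout(timeout).GET().build();
        return client.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body)
                .exceptionally(t -> {
                    // Failures from earlier stages usually arrive wrapped in a CompletionException.
                    Throwable cause = (t.getCause() != null) ? t.getCause() : t;
                    if (cause instanceof HttpTimeoutException) {
                        return "TIMEOUT";
                    }
                    throw new CompletionException(cause);
                });
    }

    /** Hypothetical demo against a listening socket that never sends a response. */
    static String demoAgainstSilentServer() {
        try (ServerSocket silent = new ServerSocket(0)) {
            URI uri = URI.create("http://127.0.0.1:" + silent.getLocalPort() + "/");
            HttpClient client = HttpClient.newHttpClient();
            CompletableFuture<String> pending = fetch(client, uri, Duration.ofMillis(300));
            // The calling thread is free to do other work here; join() is used only to end the demo.
            return pending.join();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demoAgainstSilentServer());
    }
}
```

The request timeout still matters in the asynchronous case: without it, the future would simply never complete when the remote side stays silent.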
Note that this is a comprehensive list, but it may not be a complete list of potential solutions. If you have additional solutions or timeout settings that you would like to add to this blog, please drop us a note in the feedback section below. We will be glad to update this blog with your recommendation(s).