Middleware can have a significant impact on the operation and performance of the database system. Pgpool-II is the middleware product that sits between the PostgreSQL server and database clients. It is developed and maintained by a committed open source Postgres community that includes EnterpriseDB employees.
A major version of pgpool-II is released every year. The latest is pgpool-II 3.6. Released in November 2016, pgpool-II 3.6 enhances failover capabilities, increases performance, and adds new abilities for setting configuration parameters. There were also a series of bug fixes, which are outlined on the pgpool-II Wiki here.
As a major contributor to the pgpool-II project, I encourage users to play around with the new functionality and provide feedback. In the meantime, I will explore here the most important new features in pgpool-II 3.6. For a complete list of the features and capabilities in pgpool-II 3.6, find the documentation here.
The feature that excites me the most in this release is the improvement to failover behavior. High availability is a core requirement for OLTP workloads. In fact, the need to maintain connectivity without disruption is perhaps the key requirement, as the ability to continually serve customers is crucial for most businesses. Pgpool-II 3.6 improves the pgpool-II user failover experience by making the failover behavior more robust and flexible.
show_nodes command will show the primary and the slave node with which the client is connected via pgpool-II. When a new client connects, pgpool-II establishes the connection with the primary node and one of the slave nodes using the "weighted random sampling method" approach. Depending on the load-balancing configuration, all write queries for the client are sent to the master node and read queries are load balanced to the slave node.
When a failover happens and one of the slave nodes dies due to a network or other issue, the failover process is triggered, causing all client sessions to disconnect even if the client wasn't connected to the slave nodes that passed out. Additionally, the health check timeout will try to reconnect to the failover node and new connections to pgpool-II won't be entertained until the health check is complete.
The enhancements to failover behavior in pgpool-II 3.6 address the drawbacks in the failover mechanism. Prior to the release of pgpool-II 3.6, users had to wrestle with all clients being disconnected during failover but new clients not connecting while the health check was in progress. Now, if the client is not connected to the slave node that died, the client connection will continue to work without disruption.
Again, as mentioned above, the pgpool-II
shows_nodes command can be used to see the primary and secondary nodes used by the current client session. The issue with the failed node can be addressed in parallel, and it can be added back to the pgpool-II cluster by using the PCP interface. Additionally, new clients will be permitted access rather than denying access until the health check operation is complete. Prior to these enhancements, the new client connection would be denied access until the health check retries are done.
Please note that this functionality is only available when the cluster is configured with streaming replication. The pgpool-II replication is not a highly used feature as most users are using Postgres streaming replication for replication across nodes. The failover enhancements should improve the experience for users that have deployed pgpool-II as a high availability solution. There will likely be additional improvements in this area in the next set of pgpool-II releases.
Performance is of paramount importance for middleware products as they support the complex components in the software stack and ensure they function properly. Because middleware means adding another layer between the client and the database server, the overhead must be minimal and performance must improve to keep up with other components in the stack. In the release prior to pgpool-II 3.6, major performance improvements were added for queries using the extended query protocol. This was addressed in the blog about the new features in the pgpool-II 3.5 release, A Little Respect for Pgpool.
Pgpool-II 3.6 also enhances performance significantly, this time addressing large data loads. Prior to 3.6 when pgpool-II flushes data to the network, it send
write() every time it sends a row data ("Data Row" message) to the frontend. For example, if 10,000 rows must be transferred, 10,000 write()s are issued. This is pretty expensive for the system. Since after repeating to send row data, "Command Complete" message is sent, it's enough to issue a
write() with the command complete message. With pgpool-II 3.6,
write() is only called once with command complete message. This eliminates the overhead of calling
write() for transferring every row and hence improves performance. Also, there is unnecessary flushing in handling the command complete message.
Testing showed performance increases of 47% to 62% were achieved in some cases. However, the change will not affect performance for workloads requiring the transfer of few rows because such rows must flush to the network anyway.
The ability to set pgpool-II configuration parameters on the fly was an important missing functionality. Postgres provides the
reset command that lets the user change the configuration parameters settings in the session; the
set command only lets you change the configuration parameters that don't require a server restart.
Similarly, pgpool-II 3.6 has added the
pgpool II_set, and
pgpool II_reset commands that let the user change the session level pgpool-II configuration parameters. The
pgpool II_show command will show the latest values for these parameters.
The following pgpool-II parameters can be changed using the
Pgpool-II 3.6 also takes care of a very important limitation of pgpool-II. Prior to v3.6, the
pg_terminate_backend(PID) function would trigger the failover operation even if this was not intended by the user. This happened because the server returned the same error message as it would when the postmaster died due to some reason. Pgpool-II would interpret that as a call for triggering the failover operation.
In order to address that, pgpool-II 3.6 now retrieves the PID of the backend intended to be terminated by
pg_terminate_backend call. The PID is then passed as a constant to
pg_terminate_backend() and it executes the
pg_terminate_backend command without triggering the failover operation.
Please note that the enhancements to
pg_terminate_backend handling only works for simple query protocol, it doesn’t work for prepared queries which uses the extended query protocol.
Like all the previous major releases of pgpool-II, the pgpool-II 3.6 release imports the parser changes of the latest PostgreSQL release. The parser changes for PostgreSQL 9.6 was imported as part of pgpool-II 3.6, including all the new syntax introduced,i.e.
OPY INSERT RETURNING, is also recognized by pgpool-II. If a particular syntax is not recognized by pgpool-II, it simply sends it to the master node instead of performing load balancing or any other operations.
What's Coming in Pgpool-II 3.7
It is too soon to identify the exact enhancements that will be part of the next major release of pgpool-II. But development will continue to focus on the stability of pgpool-II, and it will address any critical as well as non-critical bugs in the product. In terms of new functionality, we in the pgpool development community will look at making further improvements to the failover mechanism and supporting more authentication methods. For updates on pgpool-II 3.7 development as well as bug fixes, keep an eye on the community wiki set up for that purpose here.