Asynchronous MySQL Queries with Non-Blocking Readiness Checks
Join the DZone community and get the full member experience.
Join For FreeWell, despite my best intentions, here I am again writing Ruby. I decided to automate a small part of some data analysis I’ve had to do a few times, starting with the database queries themselves. Unfortunately the data is spread over several hosts and databases and the first implementation simply queried them serially. The next iteration used the mysql2 gem‘s asynchronous query functionality but still naively blocked on the results retrieval rather than polling the IOs to see when they could be read from.
It doesn’t actually add anything to my script to do this, but it seemed like a small learning opportunity and somewhat interesting so here is the guts of that code:
# Set up the three different clients to different hosts, and a starting time client1 = Mysql2::Client.new() client2 = Mysql2::Client.new() client3 = Mysql2::Client.new() start_time = Time.now() # Run the first query query1 = "SELECT SLEEP(5);" puts "Running Query:\n >> #{query1}\n\n" client1.query(query1, :async => true) # Run the second query query2 = "SELECT SLEEP(10);" puts "Running Query:\n >> #{query2}\n\n" client2.query(query2, :async => true) # Run the third query query3 = "SELECT SLEEP(15);" puts "Running Query:\n >> #{query3}\n\n" client3.query(query3, :async => true) # Wait for results. Select on query IO sockets to rapidly determine # when the results are ready puts "Waiting for results..." query_ios = [IO::open(client1.socket), IO::open(client2.socket), IO::open(client3.socket)] orig_ios = query_ios.length received = 0 while true # select with a timeout of 1 second ready = IO::select(query_ios, nil, nil, 1) next if ready.nil? # next iteration if nothing is ready ready_reads = ready[0] # IO::select returns [[rd],[wr],[ex]] received += ready_reads.length print "\rReceived #{received}/#{orig_ios} results after #{(Time.now - start_time).round} seconds" if ready.length > 0 # Remove any IOs from the list to select from, if they are ready. query_ios -= ready_reads break if query_ios.length == 0 # Nothing left to select from end puts "" # Now all results are ready res1 = client1.async_results res2 = client2.async_results res3 = client3.async_results
The code is pretty simple and the comments should reveal the intent of any confusing lines. The only part that was slightly irritating was receiving file descriptor numbers from Mysql2::Client#socket rather than the IO itself, hence having to re-open the same file descriptor.
In this case I haven’t done anything fancy after checking when the results are ready, but you can see how this could be trivially turned into a system for querying multiple backends for the same data and returning the fastest result which is a quite popular pattern at the moment.
Published at DZone with permission of Oliver Hookins, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments