An Introduction to CQRS
CQRS has been around for a long time, but if you're not familiar with it, it's new to you. Take a look at a quick introduction to what it is and how it works.
Join the DZone community and get the full member experience.Join For Free
CQRS, Command Query Responsibility Segregation, is a method for optimizing writes to databases (queries) and reads from them (commands). Nowadays, many companies work with one large database. But these databases weren’t originally built to scale. When they were planned in the 1990s, there wasn’t so much data that needed to be consumed quickly.
In the age of Big Data, many databases can’t handle the growing number of complex reads and writes, resulting in errors, bottlenecks, and slow customer service. This is something DevOps engineers need to find a solution for.
Take a ride-sharing service like Uber or Lyft (just as an example; this is not an explanation of how these companies work). Traditionally (before CQRS), the client app queries the service database for drivers, whose profiles they see on their screens. At the same time, drivers can send commands to the service database to update their profile. The service database needs to be able to crosscheck queries for drivers, user locations, and driver commands about their profile and send the cache to client apps and drivers. These kinds of queries can put a strain on the database.
CQRS to the rescue. CQRS dictates the segregation of complex queries and commands. Instead of all queries and commands going to the database by the same code, the queries and data manipulation routines are simplified by adding a process in between for the data processing. This reduces stress on the database by enabling easier reading and writing cache data from the database.
A manifestation of such a segregation would be:
- Pushing the commands into a robust and scalable queue platform like Apache Kafka and storing the raw commands in a data warehouse software like AWS Redshift or HP Vertica.
- Stream processing with tools like Apache Flink or Apache Spark.
- Creating an aggregated cache table where data is queried from and displayed to the users.
The magic happens in the streaming part. Advanced calculations based on Google’s MapReduce technology enable quick and advanced distributed calculations across many machines. As a result, large amounts of data are quickly processed, and the right data gets to the right users, quickly and on time.
CQRS can be used by any service that is based on fast-growing data, whether user data (i.e. user profiles) or machine data (i.e. monitoring metrics). If you want to learn more about CQRS, check out this article.
Published at DZone with permission of Guy Arye, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.