In 2012, the research firm Gartner defined the Logical Data Warehouse (LDW) as “a new data management architecture for analytics which combines the strengths of traditional repository warehouses with alternative data management and access strategy.”
The idea behind this new creation was that a data warehouse no longer had to be housed in one physical database. What drove this was the proliferation of new sources of data that had to be managed by traditional relational as well as emerging new non-relational data managers. The traditional data warehouse would require data be groomed, translated, and loaded. The LDW would access the alternative data managers with functions that fetch and transform data, in real time (or near to it), to read those alternative data structures, as needed.
Today’s new digital enterprises, however, are performing more and more analytics on live data rather than historical data. So the requirement for accessing disparate data sources is becoming important to operational data stores as well, not just data warehouses, or the new LDWs.
The advantage of moving your analytics to your operational data store is that transactional data is real time as required by time-sensitive business processes. Achieving this has been a challenge in the past, because software, hardware, and networks simply lacked the speed, scale, and reliability to query live data much less when it’s combined with secondary sources.
Today, many technical advances have made the “Operational LDW” practical, so we have seen more and more organizations embrace it. Indeed in 2014 Gartner coined a new term: Hybrid Transactional/Analytical Processing (HTAP). By performing analytics directly on Operational data, HTAP avoids the issue of data latency by eliminating the need to offload that data to data warehouses. HTAP can provide immediate business decision-making capabilities based on live data. Users can be informed of issues or trends in real time. Use cases include areas such as fraud detection, public safety, and inventory management.
PostgreSQL 9.6 Rocks
One key technology that can enable the Operational LDW and HTAP-like queries is PostgreSQL’s Foreign Data Wrappers (FDWs). Introduced in 2011, FDWs implement the SQL/MED (Management of External Data) interface, part of the SQL 2003 standard. Since then, we have seen FDWs developed for data sources ranging from relational databases, to files, to NoSQL. EnterpriseDB (EDB) has played a leading role in the collaborative development of “commercial grade” FDWs. EnterpriseDB recently led the design and implementation of FDWs for MongoDB, Hive/HDFS and MySQL, and fully supports the implementation of the postgres_fdw which allows access to foreign tables in other Postgres databases.
PostgreSQL 9.6 introduces a new level of sophistication to FDWs. The ability to push more predicates (Join, Sort, Update) to the remote server has been added to the FDW machinery, and the postgres_fdw has implemented them. This is the first step in being able to fully distribute query logic, leveraging the external servers’ capabilities, and minimizing network traffic.
PostgreSQL 9.6 also introduces Parallel Query capabilities. Parallel sequential scans, joins, and aggregates can provide performance improvements to large complex queries by distributing the query load to multiple cores. Again, engineers from EnterpriseDB led the design and implementation.
These and many more improvements in PostgreSQL 9.6 are documented here.
So as you see, new PostgreSQL use cases and technology advances have indeed made it “cool” to be an OLTP database again.