At its European Universe conference last week, data warehousing giant Teradata unveiled a number of enhancements to its core data management offerings. Alongside announcements about the Teradata 15 database and a new version of its 6750 data warehousing system, one announcement stood out for me: the launch of QueryGrid, a tool designed to orchestrate the execution of analytic processing across parallel databases.
To be a little more exact, QueryGrid provides a single execution layer that lets one SQL query coordinate analytic processing across several systems without having to move the data, and it can support multiple file systems and engines in the same workload. While that may sound a little ‘fluffy’ to some, what it actually does under the covers makes a lot of sense from an analytical querying perspective.
More to the point, QueryGrid is likely to come into its own in a Big Data world where architectures comprise multiple different platforms and workloads, whether they are relational or Hadoop, structured or unstructured, real time or batch. The company already advocates bringing its different platforms together architecturally under its Unified Data Architecture (UDA), which melds its enterprise data warehousing (EDW) platform, Teradata, and its discovery platform, Aster, with Hadoop, the poster child of Big Data. What it didn’t necessarily have was an offering specifically focused on meshing these components together from an analytical processing perspective. This is where QueryGrid comes in.
But more than this, Teradata wants QueryGrid to act in much the same way as an orchestral conductor. Just as a conductor unifies performers and controls when different instruments play, it sees QueryGrid as a way to ensure Teradata, Aster and Hadoop all play and sit nicely together from a querying perspective. In essence, QueryGrid makes it easier for developers to maintain a more cohesive querying strategy spanning different platforms, by reaching out to and integrating with Aster functions such as SQL-MapReduce and graph, with Teradata and Oracle databases, and with programming languages such as Perl, Python, R and Ruby.
QueryGrid is most likely targeted at Teradata’s existing EDW customers: those that already have a big investment in SQL-based data warehousing but at the same time want to bring other Big Data technologies, specifically Hadoop, into the fold. The promise of QueryGrid is that it lets large organisations put their existing SQL database skills and investments to work, surfacing data without having to move it around (something that usually incurs extra cost and time).
In the longer term, though, QueryGrid represents an effort by Teradata to shore up its Hadoop positioning. As we all know, Hadoop is garnering a lot of Big Data attention at the moment by virtue of its ability to store and process large volumes and varied types of data at a lower price point than many traditional data management platforms. For example, Hadoop-based analytic SQL engines such as Cloudera Impala offer a scalable MPP query engine architecture like that of Teradata, except that they run natively on open source Apache Hadoop and use cheaper commodity hardware.
While Hadoop clearly has a complementary role to play alongside data stored in a curated and consolidated data warehouse such as Teradata, it has the potential over the longer term to do much more. And this is why Teradata is keen to carve out its position in this evolving landscape. QueryGrid represents one way in which it can bridge the gap between the worlds of Teradata and Hadoop, and so capitalise on the potential of Big Data while limiting the impact on its core legacy business: data warehousing.