To gather insights on the state of databases today, and their future, we spoke to 27 executives at 24 companies who are involved in the creation and maintenance of databases.
- Emma McGrattan, S.V.P. of Engineering, Actian
- Zack Kendra, Principal Software Engineer, Blue Medora
- Subra Ramesh, VP of Products and Engineering, Dataguise
- Robert Reeves, Co-founder and CTO and Ben Gellar, VP of Marketing, Datical
- Peter Smails, VP of Marketing and Business Development and Shalabh Goyal, Director of Product, Datos IO
- Anders Wallgren, CTO and Avantika Mathur, Project Manager, Electric Cloud
- Lucas Vogel, Founder, Endpoint Systems
- Yu Xu, CEO, TigerGraph
- Avinash Lakshman, CEO, Hedvig
- Matthias Funke, Director, Offering Manager, Hybrid Data Management, IBM
- Vicky Harp, Senior Product Manager, IDERA
- Ben Bromhead, CTO, Instaclustr
- Julie Lockner, Global Product Marketing, Data Platforms, InterSystems
- Amit Vij, CEO and Co-founder, Kinetica
- Anoop Dawar, V.P. Product Marketing and Management, MapR
- Shane Johnson, Senior Director of Product Marketing, MariaDB
- Derek Smith, CEO and Sean Cavanaugh, Director of Sales, Naveego
- Philip Rathle, V.P. Products, Neo4j
- Ariff Kassam, V.P. Products, NuoDB
- William Hardie, V.P. Oracle Database Product Management, Oracle
- Kate Duggan, Marketing Manager, Redgate Software Ltd.
- Syed Rasheed, Director Solutions Marketing Middleware Technologies, Red Hat
- John Hugg, Founding Engineer, VoltDB
- Milt Reder, V.P. of Engineering, Yet Analytics
Keys to a successful database strategy are:
- Having the right tool to solve the business problem at hand
- Performance and speed
- Data security
- The right data in the right format
Specific projects need specific resources. You need to know the business problem you are attempting to solve to ensure you are using the right database solution. Doing so will have a tremendous impact on the performance and speed of the database, especially with large-scale data sets.
Security is now on everyone’s mind and is also a requirement for government, financial services, healthcare, and global retail customers given the regulatory compliance environment.
Automation and DevOps are required to accelerate the development cycle. DevOps processes require databases to be de-siloed so that everyone involved in data management and analytics has insight into the process. Lastly, data must be accessible but secure. That means knowing where data is and eliminating duplicate data that leads to significant storage expense. Set up data management so that it can take advantage of the latest technology.
Companies in every vertical, especially those making a digital transformation, rely on databases to analyze and monetize data by reducing costs and identifying new revenue streams. They are critical to delivering immediate, personalized, data-driven applications and real-time analytics.
The cloud and containers, the transition from NoSQL back to SQL, and consolidation are the three biggest changes in the database ecosystem over the past year or so. Companies are moving from traditional on-premises databases to the cloud. The exploding volume of data is driving the adoption of cloud-native databases that will scale with the data. Cloud providers are offering sophisticated managed databases (for example, Spanner and DynamoDB). Containers are disruptive, bringing a new set of requirements and challenges like persistence, provisioning, and storage.
NoSQL is not as pervasive as it was. There’s been a shift away from NoSQL, with Spanner, CockroachDB, and NuoDB providing more consistent systems with SQL. SQL is coming back, built on top of NoSQL.
The industry and technology are consolidating. People want the best of both worlds: distributed computing and SQL, with elastic SQL like Google Cloud Spanner that enables scalable apps to be built using SQL.
More than 60 different solutions were mentioned when we asked about the technical solutions they and their clients used for database projects. This affirms the breadth of platforms, architectures, languages, and tools that make up the ecosystem. The four most frequently mentioned were Spark, Hadoop, Cassandra, and Oracle.
Databases are widely adopted in every industry and are used for accounting, enterprise resource planning, supply chain management, human capital management, and basic data management. There are industry-specific solutions for multiple verticals that can process, scale, be available, and be secure. The three most frequently mentioned industries were financial services, healthcare, and retail. The first two are driven by the need to meet regulatory requirements while retailers are trying to serve customers their best product option, at the best price, in real-time.
Lack of knowledge, the complexity of the ecosystem, and scalability are the major hurdles for companies implementing successful database strategies. There are 50 to 100 different types of databases, and the customer needs to understand the benefits of each one. There’s a need to always put requirements first and not compromise critical business performance with speed, scale, accuracy, and cost. Those unfamiliar with databases can quickly find themselves in over their heads and unable to see the potential roadblocks of scalability, consistency, and availability. The cost of switching technology is high and evaluating fit can be challenging.
Data is exploding in volume, along with the number of solutions to deal with it. The amount of data a company stores is inversely proportional to its ability to analyze that data. Companies have no clue how many copies of data they have, no idea of data lineage, and no idea of where all their data resides.
The biggest opportunities in the evolution of databases are with the integration and convergence of technologies in the cloud using microservices. As data grows, there will be greater adoption of modern platforms, with more functionality, and a more comprehensive data environment. More integrated platforms will be managed with APIs and offer both multi-dimensional and text analytics using artificial intelligence (AI), machine learning (ML), and predictive analytics. We will have polyglot or multi-model persistence to create a unified platform that performs hybrid transactional/analytical processing.
Distributed relational databases in the cloud are the future. The cloud provides a disposable infrastructure that lets the business owner focus on their core business. Microservices are becoming the norm and will change how systems are architected.
The biggest concerns around databases today are the persistence of legacy thinking and the lack of skills necessary to evaluate and manage new database solutions. Companies are still working with legacy idioms and data models, and cannot take advantage of parallelism. The fact that the database process has stayed the way it has is either negligence or criminal intent.
There’s a skills and knowledge gap that’s hard for companies to close because they do not have the specialists. Companies that do not carefully consider the new value and strengths of more established solutions when setting out to build something should be concerned. Users are sent on a wild goose chase because they don’t understand the right technology for the job, either due to a lack of knowledge or because vendors are misleading them with regard to non-native graph databases.
Databases are now part of the DevOps fabric. It takes a different skill set to get business value from them these days. You have to be part programmer, part automator, part infrastructure administrator, and part database administrator. That’s a unicorn these days.
According to our respondents, while there are a lot of things for developers to know to be proficient with databases, only knowledge of SQL and of databases in general was mentioned by more than a couple of respondents. You need to know the difference between the types of joins because SQL is not going away. MongoDB, HBase, Hadoop, and Cassandra all have SQL interfaces, and SQL is still the most well-known language. Understand how SQL impacts the performance of the data store, even though frameworks automate it.
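As an illustration of why the difference between joins matters, here is a minimal sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their rows are hypothetical and not drawn from any respondent's system. An inner join returns only the customers that have matching orders, while a left join keeps every customer and fills the missing order columns with NULL.

```python
import sqlite3

# In-memory database with two small, made-up tables (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN: only customers with at least one order appear.
inner_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

# LEFT JOIN: every customer appears; customers without orders get NULL (None).
left_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

print(inner_rows)  # Linus is absent: no matching order
print(left_rows)   # Linus is present with a total of None
```

Choosing the wrong join type silently drops or adds rows, which is why this is worth understanding even when an ORM or framework generates the SQL.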
Understand the different data models and the end goal so you can choose the right database for your use case. Look at the platform and tools and understand when to choose a relational (transactions) versus a NoSQL (reports and analytics) database.
Reinvent the art of evaluating technology. Everything changes so fast that the fundamentals you learned five or ten years ago may no longer apply. Something counter to your intuition could work because the underlying facts have changed. Modern networks have reduced latency from tens or hundreds of milliseconds to less than one millisecond. Bandwidth is 100-times greater than it was. Try to find the next set of standards.