To gather insights on the state of databases today, and their future, we spoke to 27 executives at 23 companies who are involved in the creation and maintenance of databases.
We asked these executives, "What are the keys to a successful database strategy?" Here's what they told us:
- Specific projects need specific resources. The ability to bring different data feeds together. Entity resolution and differentiation. A database that can scale. Security is a core tenet for all government projects — row- and cell-level security. We’re targeting mission-critical databases.
- Know what business problem you are attempting to solve with the database, since there are many different databases with specific applications. Provide data protection, backup, and recovery for non-relational databases.
- Figure out the requirements of the applications and services on top of the database. Fit for purpose: choose the right database tool for the right job. Many companies have several different database technologies for particular use cases, with a focus on speed, cost-effectiveness, and scalability. Security decisions for the open-source database world are pushed back into the network to allow for network-level protection. Availability is key. The database must operate 24/7, serving customers around the world.
- Know what the database is going to be used for. There is a plethora of tools and technologies. Use MS SQL Server for transactional relational workloads. Use NoSQL for analytics, business intelligence, and big data. It depends on the use case and the problem to solve.
- Speed, scale, and security are table stakes. Now the emphasis is on building customers powerful applications that provide a rich CX with real-time, intelligent augmentation driven by AI/ML. Be event-driven. Decompose monolithic apps into microservices coupled through API interaction. Determine the right database solution needed to solve the problem. The "people" aspect is fundamentally different: no more classic DBA, storage admin, dev, and ops roles. Now there’s a more cohesive data ops team for communication and collaboration among data science, DBAs, data engineers, developers, and operations.
- Ensure we’re a good fit for the product and the company. Do you have the right partnership? Once we’re sure of the fit, we have a customer success checklist that reviews what the software can and cannot do, best uses, and potential hurdles. Head off surprises up front. Simulate the system under production strain as early as possible; fault-tolerant distributed databases simulate types of failure in production. The more you can trial the workload you expect to run in preproduction, the better off you are. Work with the customer to determine how the product differs from others they have used — the patterns they are used to may not translate.
- What are the different classes of technologies, and how do they fit in your enterprise? Operating systems with apps as a touchpoint, used by employees, customers, partners, and machines. Decisioning products used by humans on the IT side making judgments about how the data will be used. Batch versus real-time, since data scientists have a longer Q&A period. Connectedness as a dimension — document databases on the left, relational databases in the middle, and graph databases on the right, with or without real-time connections.
- We believe in interoperability, portability, performance, and security. Interoperability because the number of customers for the data will become more diverse over time. This has given rise to microservices and APIs. Portability is important because there are a ton of legacy systems from which databases must be extricated. Performance and security are self-explanatory.
- Like many database concerns, the need for speed comes down to the application at hand. For instance, speed is paramount if you are responding directly to user queries, but might not be your most pressing need for long-running asynchronous analytics jobs. Do your best to understand the scaling properties of a solution before you commit to using it! Remember that most databases (and distributed configurations thereof) have differing read and write scaling and that you can use this to great advantage if you choose the right tool for the job. Also, remember the third (often hidden) scaling dimension of databases — developer and operations overhead. SQL systems offer rich and battle-tested security models, provided you use them correctly. On the other hand, many newer database paradigms don’t consider security and permissions to be a database concern, beyond the capability to encrypt data in transit and at rest. Be aware of this difference: The right tool for your job may require you to give more thought to security at the application and operational level.
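The read/write distinction above can be made concrete with a toy router. This is a sketch, not any particular product's API: a hypothetical `SplitRouter` sends reads round-robin to read replicas and everything else to the primary.

```python
# A minimal sketch of read/write splitting — the lever behind
# "differing read and write scaling." Class and node names are
# illustrative, not tied to any specific database product.
class SplitRouter:
    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self._next = 0  # round-robin cursor over replicas

    def route(self, sql):
        """Send reads round-robin to replicas; everything else to the primary."""
        if sql.lstrip().upper().startswith("SELECT"):
            node = self.replicas[self._next % len(self.replicas)]
            self._next += 1
            return node
        return self.primary

router = SplitRouter("primary", ["replica-a", "replica-b"])
print(router.route("SELECT * FROM users"))      # -> replica-a (a read)
print(router.route("UPDATE users SET x = 1"))   # -> primary (a write)
```

In a real deployment the nodes would be connections, and replication lag would make replica reads eventually consistent — part of the "right tool for the job" trade-off the quote describes.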
- Availability is number one for 24/7 business operations. The database and the data must always be available. Plan for maintenance, as there is little tolerance for downtime. Second is scalability to meet the business needs with the new data sources, types of data, and volume of data. The third is data security and fourth is performance.
- Our customers care deeply about speed (initial data loading speed, real-time data update speed, and also query speed) pertaining to a large-scale dataset. Security of the database system itself is of less concern to most of our customers, mainly because they typically only expose applications/REST endpoints running on top of our graph database for production systems. In some use cases, customers use our graph engine to mine the connections in the data to make their business/systems more secure, but this is not related to database security itself.
- Performance is user experience, so there’s no more important measure of your most critical databases. At the same time, performance is cost. And the cost can get away from you quickly — particularly in the cloud. It’s important that you can monitor both performance and underlying infrastructure side by side so that you can keep the ideal balance of both. Scalability is certainly behind a lot of the adoption of NoSQL and NewSQL platforms, but even with those architectures, it’s important to have a real-time understanding of capacity and how much more your database can take. Security: human error and database drift can open security gaps with regard to configuration. Database monitoring systems that can notify users of incorrect security settings, missed scheduled backups, and more help to ensure the success of your database and the protection of your data.
- Standardize your data and databases as much as possible. Integrate databases into the DevOps process and do not allow databases to remain in a silo. Automate as much as possible. Product architecture plays a big part in how easy or hard it is to manage the database. Don’t use integer keys; they don’t work well across clusters and services. Use GUIDs. Make your schemas additive; databases do not like to delete data. Don’t consider MySQL to be an enterprise-ready production database.
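Two of the points above — GUID keys and additive schema changes — can be sketched in a few lines. The `sqlite3` module here is purely an illustrative stand-in for any SQL database; the table and column names are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL)")

# GUIDs are generated by the client, so any node in a cluster can
# create rows without coordinating on the "next" integer key.
order_id = str(uuid.uuid4())
conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, 9.99))

# Additive migration: add a nullable column instead of dropping or
# rewriting existing data — older readers keep working unchanged.
conn.execute("ALTER TABLE orders ADD COLUMN note TEXT")
row = conn.execute("SELECT id, total, note FROM orders").fetchone()
print(row)  # -> (<generated uuid>, 9.99, None)
```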
- When you update an app, the schema or logic must change, and that’s a manual process. You need to align the cadence and enforce safety as well. Don’t break stuff. Recovery from a bad database push is expensive, with backup and restore. We remove the need for recovery by catching problems in development and testing. Be able to scale for the team. We're currently using a manual process for updating the database, but it can’t keep up with accelerated processes and rates of change — an exponential nightmare.
- While speed, scale, and security are important, increasing speed to delivery is most important. Database automation accelerates the development cycle.
- Identify the location of all of the sensitive assets across all repositories. Protect those assets. Open access to the data with sensitive assets protected. Enable employees, partners, and customers to make data-driven decisions.
- Speed, scale, and security are all table stakes. Understand what’s needed from a data management perspective. How does the application architecture fit with the database architecture stack, and with more technology systems running, how do you manage availability?
- Understand use cases and requirements, model and configure the data to meet those requirements. Have a strategy for modernizing legacy infrastructure to the cloud, virtual machines to containers, and proprietary to open source — all while meeting requirements. Keep up with changes so you know what database architecture to use.
- What are the company’s data needs? What are the applications for acting on data — gaining insights from operational data to enhance business transactions? Have a tiered data management strategy. Identify the long-term data that’s needed for compliance, reporting, and transactions. Have an active store in memory; RAM should be a first-class citizen. Insights have a limited lifespan depending on the business and how they are using the data, and there are more and different types of insights available with IoT, AI, and ML. You need to be able to keep as much data in memory as possible. Deploy in embedded mode (as a library), where the data grid is in the same JVM as the application; the application logic has direct in-process access to the data, resulting in virtually zero latency. The client-server mode will be cloud-based; this is great for keeping gaming and shopping cart data in memory with a scalable data architecture. Our customers use the data grid on the cloud as a data-state management store. Be able to recycle application nodes for resilience.
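The embedded-versus-client-server contrast above can be sketched with toy classes. The names are hypothetical, and a real data grid adds partitioning, replication, and eviction; the sketch only shows why in-process access has virtually zero latency while client-server access pays a round trip.

```python
import time

class EmbeddedGrid:
    """Embedded mode: data lives in the application's own process,
    so a get() is just an in-memory dictionary lookup."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

class ClientServerGrid(EmbeddedGrid):
    """Client-server mode: same API, but every call pays a
    (simulated) network round trip to a remote grid node."""
    def get(self, key):
        time.sleep(0.001)  # stand-in for network latency
        return super().get(key)

cart = EmbeddedGrid()
cart.put("user:42:cart", ["sku-1", "sku-2"])
print(cart.get("user:42:cart"))  # -> ['sku-1', 'sku-2']
```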
- Mission-critical applications already have performance, scale, and security. Key capabilities are interoperability, embedded analytics, and the ability to add concurrent users without impacting current workloads. Look at the database and the data platform as a single stack.
- Regulatory compliance environments — PCI, HIPAA, and SOX — can’t be formulated at the last minute; this is a complex process that requires planning. Speed, scale, and security are table stakes. Performance, scale, high availability, and disaster recovery are all needed; understand mean-time expectations and test against them. Have a strategy around disaster recovery; Netflix uses Chaos Monkey. Know what failover looks like. Align the database strategy with the development strategy or the release cadence of third-party apps. The code can be blown away and replaced, but data must persist. Ensure expertise and planning around the approval cadence. Ensure better availability of the development and testing systems.
- Continually evolve the underlying technology to address technology and business challenges with a minimum of disruption. The move to the cloud now enables data management to take advantage of the latest technology.
- The days of the standalone database strategy are long gone. Over the last few years, the speed of change and transformation, as well as the pervasive application and use of IT and data in all organizations, has been game-changing. That transformation’s impact on strategy has been profound. For example, it is increasingly challenging to plan long-term investments. Instead, businesses demand agility and constant cost optimization, both of which fuel cloud adoption, the convergence and simplification of data architectures, and more. With all this in mind, these are some guiding principles to consider with database management and strategy:
- Simplicity: Think about this when adopting, consuming, and deploying a database from both technical and non-technical standpoints.
- Convergence and simplification of data architectures: This is the ability to handle heterogeneous workloads with one database — on one copy of the data. (Example: HTAP, with transactional and analytical workloads converged; or the ability to handle multiple data types, like JSON, XML, and graph, next to structured types in one repository.)
- Integration: This is the ability to integrate across multiple instances of the same database, or a heterogeneous set of repositories, to abstract the user and application from the underlying data architecture and virtualize access. (Example: Fluid Query and query federation built into our offerings.) Write once, run anywhere — this is about building flexibility into deployment (on-premises, on IaaS, or as a managed service) without changing your SQL code, and having a consistent experience and working behavior when interacting with the database.
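Query federation of the kind described above can be illustrated with sqlite3's `ATTACH DATABASE`, standing in here for a federation engine: one SQL statement joins tables from two separate databases, and the application never sees where each table physically lives. The table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second, separate database to act as the "remote" repository.
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

conn.execute("CREATE TABLE main.customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE warehouse.orders (customer_id INTEGER, total REAL)")
conn.execute("INSERT INTO main.customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO warehouse.orders VALUES (1, 25.0)")

# One query federates across both repositories transparently.
row = conn.execute(
    "SELECT c.name, o.total FROM main.customers c "
    "JOIN warehouse.orders o ON o.customer_id = c.id"
).fetchone()
print(row)  # -> ('Ada', 25.0)
```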
What are the key elements to a successful database strategy from your perspective?
Here’s who we talked to:
- Emma McGrattan, S.V.P. of Engineering, Actian
- Zack Kendra, Principal Software Engineer, Blue Medora
- Subra Ramesh, VP of Products and Engineering, Dataguise
- Robert Reeves, Co-founder and CTO and Ben Gellar, VP of Marketing, Datical
- Peter Smails, VP of Marketing and Business Development and Shalabh Goyal, Director of Product, Datos IO
- Anders Wallgren, CTO and Avantika Mathur, Project Manager, Electric Cloud
- Lucas Vogel, Founder, Endpoint Systems
- Yu Xu, CEO, GraphSQL
- Avinash Lakshman, CEO, Hedvig
- Matthias Funke, Director, Offering Manager, Hybrid Data Management, IBM
- Vicky Harp, Senior Product Manager, IDERA
- Ben Bromhead, CTO, Instaclustr
- Julie Lockner, Global Product Marketing, Data Platforms, InterSystems
- Amit Vij, CEO and Co-founder, Kinetica
- Anoop Dawar, V.P. Product Marketing and Management, MapR
- Shane Johnson, Senior Director of Product Marketing, MariaDB
- Derek Smith, CEO and Sean Cavanaugh, Director of Sales, Naveego
- Philip Rathle, V.P. Products, Neo4j
- Ariff Kassam, V.P. Products, NuoDB
- William Hardie, V.P. Oracle Database Product Management, Oracle
- Kate Duggan, Marketing Manager, Redgate Software Ltd.
- Syed Rasheed, Director Solutions Marketing Middleware Technologies, Red Hat
- John Hugg, Founding Engineer, VoltDB
- Milt Reder, V.P. of Engineering, Yet Analytics