To gather insights for DZone's Data Persistence Research Guide, scheduled for release in March 2016, we spoke to 16 executives from 13 companies who develop databases and manage persistent data in their own companies or help clients do so.
Here's who we talked to:
Satyen Sangani, CEO, Alation | Sam Rehman, CTO, Arxan | Andy Warfield, Co-Founder/CTO, Coho Data | Rami Chahine, V.P. Product Management, and Dan Potter, CMO, Datawatch | Eric Frenkiel, Co-Founder/CEO, MemSQL | Will Shulman, CEO, MongoLab | Philip Rathle, V.P. of Product, Neo Technology | Paul Nashawaty, Product Marketing and Strategy, Progress | Joan Wrabetz, CTO, Qualisystems | Yiftach Shoolman, Co-Founder and CTO, and Leena Joshi, V.P. Product Marketing, Redis Labs | Partha Seetala, CTO, Robin Systems | Dale Lutz, Co-Founder, and Paul Nalos, Database Team Lead, Safe Software | Jon Bock, V.P. of Product and Marketing, Snowflake Computing
As the number of databases grows, it’s important to understand the strengths and weaknesses of the different tools, and to choose the right database for what you’re trying to accomplish. More big data jobs are requiring a broader set of skills.
Here's what we heard when we asked, "What are the skills that make someone good at working with/administering database management systems?":
Proper design structure. Understand how the different pieces of data relate to each other. Know the desired result and know how to extract the information that provides it.
It used to be just knowing the vendor tool. Today it’s the ability to understand the strengths and weaknesses of the different tools, figure out the right tools to use, and implement them in the right way to solve the problem. Be on top of the technologies that are out there. Be able to understand the pipeline and the flow of data. Know how to get data from the source, stage and clean it, and get it downstream in the form the users need. Be able to deliver data in multiple forms without overwhelming complexity.
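The flow described above (get data from the source, stage and clean it, then deliver it in more than one form) can be sketched in a few lines of Python. This is only an illustrative sketch, not anything the interviewees described: the sample rows, the `staged_orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real staging area.

```python
import csv
import io
import json
import sqlite3

# Hypothetical raw records as they might arrive from a source system.
RAW_ROWS = [
    {"customer": " Alice ", "amount": "10.50"},
    {"customer": "bob", "amount": "not-a-number"},  # dirty row, dropped by clean()
    {"customer": "Carol", "amount": "7"},
]


def clean(rows):
    """Trim and normalize names; keep only rows whose amount parses as a number."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that cannot be repaired
        cleaned.append((row["customer"].strip().title(), amount))
    return cleaned


def stage(conn, rows):
    """Load cleaned rows into a staging table (hypothetical name)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staged_orders (customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO staged_orders VALUES (?, ?)", rows)
    conn.commit()


QUERY = "SELECT customer, amount FROM staged_orders ORDER BY customer"


def deliver_json(conn):
    """Deliver the staged data as a JSON document."""
    return json.dumps([{"customer": c, "amount": a} for c, a in conn.execute(QUERY)])


def deliver_csv(conn):
    """Deliver the same staged data as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["customer", "amount"])
    writer.writerows(conn.execute(QUERY))
    return buf.getvalue()


conn = sqlite3.connect(":memory:")
stage(conn, clean(RAW_ROWS))
as_json = deliver_json(conn)
as_csv = deliver_csv(conn)
```

Both delivery functions read from the same staged table, which is one simple way to offer multiple output forms without duplicating the cleaning logic.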
If you’re working with databases, know what’s inside what you’re working with: sets of tables, like Excel spreadsheets. Know how the information is produced. Every database is different with regard to business content and structure. An administrator needs technical knowledge so they know how to work with the design of the system. Specialized knowledge is required in order to use it well.
If you are dealing with secondary data, you need to understand data science. On the front end you need to understand the source of the data, how it operates, and the unique requirements of the use case or users. There’s a big difference between SQL/Oracle, Hadoop, and SAP HANA; they require three different skill sets. The architectures are all different. It’s very complex from one end of the spectrum to the other.
It's all about the data; the database is just a tool. Understand the architecture and the way you want to use the data. This goes beyond knowing the language of the database to understanding how to use the database for storage and access. Ownership of data has moved to the business side with the advent of the data scientist.
For us, SQL is our interface so that’s all you need to know. SQL provides easy access via a language everyone knows.
It’s an interesting time. Five years ago you would do well training for SQL. Today it’s shifted to public cloud providers and building in AWS with the new databases that are available there. There are more big data jobs that require a broader set of skills.
Logical database administrators understand tables and syntax. System administrators understand how to map to physical infrastructure and governance. A third level is understanding what the data is and what it’s used for. Good professionals have some level of expertise in two of these areas; understanding all three levels makes you even more valuable. Know what data is regulated so that it can be secured and backed up. Have an appreciation of the data and how it will be used so that you can weigh the need for real-time access against less immediate access, cost, and security considerations.
The same set of skills programmers have: logical, mathematically inclined, good at abstractions, able to manage complexity, STEM inclined. All application developers think about the database and the data model, since a big part of app development is moving data from one part of the app to another.
It depends on the layer of the database. Architects or database administrators understand the best way to store and retrieve data. They’re logical, have a physical view of the data, and understand which part of the data is needed where. DevOps know how to scale outbound and inbound. Engineers know how to access data design to understand the application and relationship layer. Understand that data needs to be stored in a secure layer. Big data is an entirely different ballgame with Hadoop: knowing how to scrub, anonymize, and privatize the data is key. Do this mostly in the application layer.
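As a rough illustration of scrubbing data in the application layer before it is stored, the sketch below replaces an email address with a keyed hash and drops a name field. Everything here is an assumption made for the example: the record layout, the `scrub_record` and `pseudonymize` helpers, and the placeholder key are hypothetical. Note also that keyed hashing is pseudonymization rather than full anonymization, so it may not be enough for regulated data on its own.

```python
import hashlib
import hmac

# Hypothetical placeholder; a real key would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed hash (HMAC-SHA256) of the value as hex."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_record(record: dict) -> dict:
    """Return a copy that is safer to store: email hashed, full name removed."""
    scrubbed = dict(record)
    # Normalize first so the same address always maps to the same token.
    scrubbed["email"] = pseudonymize(record["email"].strip().lower())
    scrubbed.pop("full_name", None)
    return scrubbed


record = {"email": "Ana@Example.com ", "full_name": "Ana Diaz", "plan": "pro"}
safe = scrub_record(record)
```

Because the hash is deterministic for a given key, records for the same person can still be joined downstream without exposing the original address.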
Different roles require different skills. On our database (software development) team, creativity, problem solving, attention to detail, empathy for customers, and ability to work well together matter the most.
People familiar with data, like data scientists and business analysts, who want to spin up something quickly. Someone who knows how to manage the data: they know the data, the sources, and how it can be added to the container.
What else do you think one can do to improve their work with databases?