You're doing DataOps wrong!
We have discussed anti-patterns before on DZone, where we outlined eight major anti-patterns we find in Agile.
As a recap, an anti-pattern is a commonly occurring approach to a problem or task that generates decidedly negative consequences. Examples of anti-patterns in software development include spaghetti code (code that becomes unmaintainable and difficult to extend) and duplicate code (code that has been cut and pasted and now must be changed in two places).
However, there is no way around it.
When it comes to managing your IT and test environments there are three ugly sisters: applications, infrastructure, and data.
And of these, data is truly the ugliest.
Ugliest, because it is invariably the most complex and least consistent operationally.
Ever notice the absence of data vendor logos on the DevOps diagrams? Perhaps it's time we took this ugly sister to the ball.
Let's talk about this “dark side” of DevOps and environment management.
Let’s talk DataOps.
What is DataOps?
DataOps is the combination of people, process, and products that enable consistent, automated, and secure management of data. Its goal is to improve enterprise IT delivery outcomes by bringing together those who consume the data with those who provision it.
The benefits of DataOps spread across the enterprise. For example:
DataOps supports the software development lifecycle. DataOps will increase DevTest speed through rapid and consistent provisioning of environments for the development and test teams.
DataOps will enhance quality assurance. Provisioning “production-like” data (or “well-formed” fabricated data) allows testing to exercise test case scenarios effectively, so errors are caught before customers find them.
DataOps will help you move safely to the cloud. DataOps can simplify and speed the process of migrating data to your clouds or other destinations.
DataOps supports data science and machine learning. Your data science and artificial intelligence efforts are only as good as the information available. DataOps helps ensure a reliable stream of good data for digestion and learning.
DataOps helps with compliance. DataOps helps establish standardized data security policies and controls to enable data to flow effectively and without risk to your customers.
Some Key DataOps Patterns and Anti-Patterns
Success Pattern: Treat data like you would code. Don’t exclude data because it's complex. Automate data tasks like data fabrication or ETL (Extract, Transform and Load) and attach them to your delivery chain.
Success Pattern: Use Masked “Production-Like” Data. Ensure developers and testers have a rich set of data to play with. The best source is always production itself. However, ensure the methods you use to Extract, Transform (Mask) and Load comply with industry privacy regulations.
Success Pattern: Mask or Encrypt. Use Masking or Encryption methods on vulnerable data, like data classified as Personally Identifiable Information (PII). Consider this as an opportunity to understand/recognize risks and consider re-architecting production data for the future.
Success Pattern: Refresh Data Continuously. Implement a regular, automated refresh capability. Data is typically backed up daily (to SANs and/or a failover site). Consider using these copies as a way of obtaining good data without disrupting the production process.
Success Pattern: PMV Test Data (Profile, Mask and Validate). Introduce data compliance across your non-production data. Consider risk profiling methods so you understand your risks, remediation methods like masking, and validation methods to ensure and prove you're compliant end to end.
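As a rough illustration of the Profile, Mask and Validate cycle, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two regular expressions, the `profile`, `mask` and `validate` function names, and the hard-coded `SECRET_KEY` are hypothetical. A real implementation would lean on a dedicated data compliance tool and far broader PII detection.

```python
import hashlib
import hmac
import re

# Hypothetical PII patterns for illustration only; real profiling covers many more types.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

# Assumption: in practice the key would come from a secrets manager, not source code.
SECRET_KEY = b"replace-me"


def profile(rows):
    """Profile: return {column: set of PII types} for columns that look like PII."""
    findings = {}
    for row in rows:
        for column, value in row.items():
            if not isinstance(value, str):
                continue
            for kind, pattern in PII_PATTERNS.items():
                if pattern.search(value):
                    findings.setdefault(column, set()).add(kind)
    return findings


def mask_value(value):
    """Replace a value with a keyed-hash token; the same input always yields the same token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).digest()
    # Map bytes to letters only, so the token cannot itself resemble an email or phone number.
    return "masked_" + "".join(chr(ord("a") + b % 26) for b in digest[:12])


def mask(rows, columns):
    """Mask: return new rows with string values in the given columns replaced by tokens."""
    return [
        {c: mask_value(v) if c in columns and isinstance(v, str) else v
         for c, v in row.items()}
        for row in rows
    ]


def validate(rows):
    """Validate: True when re-profiling the data finds no remaining PII."""
    return not profile(rows)


if __name__ == "__main__":
    rows = [{"id": 1, "name": "Ada", "email": "ada@example.com", "phone": "+61 400 123 456"}]
    risky_columns = set(profile(rows))
    masked_rows = mask(rows, risky_columns)
    print(masked_rows, validate(masked_rows))
```

Deterministic tokens are used here so that the same customer masks to the same value in every table, which keeps relationships between tables intact. That is one design choice among several; encryption or format-preserving substitution may suit other cases better.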
Anti-Pattern: Excluding data from the current DevOps (or CI/CD) methods. This means data is handled in a bespoke, manual fashion that is inconsistent, untraceable, error-prone, and slow.
Anti-Pattern: Provisioning of non “production-like” data. Fabricated (synthetic) data is good, and an excellent supplement for one’s data needs, particularly during early test phases like unit and system testing. However, trying to exercise systems thoroughly without “realistic” data or shape (relationships) is a sure way to restrict testing and miss important bugs.
Anti-Pattern: Migrating data without encrypting (or masking) it. This leaves data vulnerable during transit and at the other end (at rest).
Anti-Pattern: Allowing data to become stale. Old or dirty data will prevent your data scientists from seeing the signals, prevent “AI learning”, and restrict your project's ability to test effectively.
Anti-Pattern: Relying on information security being done only at the production perimeter. This invariably means data across your non-production environments, which often contain production copies, is left unprotected and vulnerable to theft.
* Remember: 95% of an IT project's time and staff is inside the non-production space. Leaving the data unprotected is a sure way for customer PII (Personally Identifiable Information) to get into the wrong hands.
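The stale-data anti-pattern above is one that a small automated check can flag early. The following Python sketch assumes you record a last-refresh timestamp per environment somewhere. The `stale_environments` function, the environment names, and the seven-day threshold are all hypothetical and only meant to show the shape of such a check.

```python
from datetime import datetime, timedelta, timezone


def stale_environments(last_refreshed, max_age, now=None):
    """Return the sorted names of environments whose data is older than max_age.

    last_refreshed maps an environment name to a timezone-aware datetime
    of its most recent data refresh (an assumed bookkeeping format).
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name for name, refreshed_at in last_refreshed.items()
        if now - refreshed_at > max_age
    )


if __name__ == "__main__":
    # Hypothetical environments and refresh times, for illustration only.
    refresh_log = {
        "dev": datetime(2024, 1, 1, tzinfo=timezone.utc),
        "sit": datetime(2024, 1, 9, tzinfo=timezone.utc),
        "uat": datetime(2023, 12, 1, tzinfo=timezone.utc),
    }
    check_time = datetime(2024, 1, 10, tzinfo=timezone.utc)
    print(stale_environments(refresh_log, timedelta(days=7), now=check_time))
```

Hooked into a delivery pipeline, a check like this could fail a build or trigger a refresh job whenever an environment drifts past its agreed threshold.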
If you want to be agile, compliant, or successful in managing your IT environments and delivering change, then there are no two ways about it: you need to employ DataOps effectively. The alternative is overly manual, error-prone, and non-repeatable methods that result in IT project delays, late time to market, security exposures, and poor quality. Sound familiar?
Learn More or Share Ideas
If you’d like to learn more about DataOps, or perhaps just share your own ideas about anti-patterns, feel free to contact me or the enov8 team. Enov8 provides a set of solutions (Environments, Release and Data) to help companies apply DevOps at scale.