Tech Talks With Tom Smith: DataOps Enables the Productization of Analytics


Big data is going away; it's just data.



DataOps: the future of Big Data

I had the opportunity to meet with Alex Gorelick, CEO and founder of WaterLine Data, at the DevOps Summit to get his thoughts on the current state of data and analytics.

What does DataOps mean from a technical perspective?

DataOps develops, productizes, and operationalizes data in an agile way with security and compliance.

What’s the state of the data and analytics landscape since we first spoke four years ago?

Governance has become more of an issue with GDPR, CCPA, and hundreds of other data privacy regulations. The principles are the same: 1) practice data mindfulness and know what you are doing with your data; 2) know what data you have, where it resides, and how it's used, and ensure people's data is protected; and 3) maintain good data hygiene.

Poor quality data is bad for decision making and bad for customers. Being responsible with data is good for business. Provide the opportunity for self-service analytics, access, and use of data. Establish common rules and best practices for data use so people avoid inadvertently doing the wrong thing.


How can people get more value from the data they have?

People who have a problem often don’t know how to get the data that would solve it. The democratization of data lets business owners ask, "What can I learn by putting different data together?" Today, people make decisions based on the limited data they have. With a catalog, you tag each field with a business term so that when people search, they can find the data regardless of what it’s called in the source system.

With data catalogs, you can search everywhere data resides and then bring the data you need to the place where someone needs it. The catalog crawls all the data and uses AI/ML to tag it with business terms, looking at both context and content. Help people build a catalog so they can shop for data within their own organization.
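The idea of tagging fields with business terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular catalog product works: the `Catalog` and `FieldEntry` classes, the system names, and the column names are all hypothetical, and the tags are assigned by hand here, whereas the interview describes AI/ML doing the tagging.

```python
from dataclasses import dataclass, field


@dataclass
class FieldEntry:
    """One physical column plus the business terms it is tagged with."""
    source: str   # hypothetical system name, e.g. "crm_db"
    table: str
    column: str
    tags: set = field(default_factory=set)


class Catalog:
    """Toy in-memory catalog (hypothetical, for illustration only)."""

    def __init__(self):
        self.entries = []

    def register(self, source, table, column, tags=()):
        # Store tags in lowercase so searches are case-insensitive.
        entry = FieldEntry(source, table, column, {t.lower() for t in tags})
        self.entries.append(entry)
        return entry

    def search(self, business_term):
        # Find fields by business term, regardless of the column's real name.
        term = business_term.lower()
        return [e for e in self.entries if term in e.tags]


catalog = Catalog()
catalog.register("crm_db", "cust", "cust_nm", tags=["Customer Name"])
catalog.register("billing", "accounts", "acct_holder", tags=["Customer Name"])
catalog.register("billing", "accounts", "bal_amt", tags=["Account Balance"])

hits = catalog.search("customer name")
locations = [f"{e.source}.{e.table}.{e.column}" for e in hits]
print(locations)
```

The search returns both `cust_nm` and `acct_holder` even though neither column name contains the words "customer name", which is the point of tagging with business terms.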

What are some use cases you're helping clients with today?

Self-service analytics is one: data on sales by geography, by segment, and by time period. It used to take six months to find and access this data, and no one knew when they would actually be able to get to it. We enable clients to find the data they need without knowing the details of the data. We enable people to use business terms to find and request data. The process has gone from six months to days.

We help people join datasets with related data. We help you get the data you need so you can do something with it. It can be expensive to inspect all of your data and mask it up front. Instead, you can protect it on demand, once someone requests data that contains PII.

As data volumes increase, the data becomes hard to process and people begin throwing it away. We enable clients to keep data without investing too much to do so. By using a data catalog, you know when data has been used, who requested it, and how it's been used.
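Masking on request, rather than masking everything ahead of time, might look like the following minimal sketch. The `fulfill_request` function, the `PII_COLUMNS` set, and the sample rows are hypothetical names chosen for illustration; a real deployment would decide which columns are sensitive from catalog tags and would use a vetted masking or tokenization scheme.

```python
import hashlib

# Assumption: the catalog has already flagged these columns as PII.
PII_COLUMNS = {"email", "ssn"}


def mask_value(value):
    # Illustrative only: a truncated, unsalted hash is not strong anonymization.
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"masked:{digest[:12]}"


def fulfill_request(rows, requested_columns, pii_columns=PII_COLUMNS):
    """Return only the requested columns, masking PII at delivery time."""
    result = []
    for row in rows:
        out = {}
        for col in requested_columns:
            value = row[col]
            out[col] = mask_value(str(value)) if col in pii_columns else value
        result.append(out)
    return result


raw_rows = [
    {"email": "ann@example.com", "region": "EMEA", "sales": 1200},
    {"email": "bob@example.com", "region": "APAC", "sales": 950},
]

delivered = fulfill_request(raw_rows, ["email", "region", "sales"])
print(delivered[0]["region"], delivered[0]["email"])
```

The stored rows are left untouched, so the masking cost is paid only for data that someone actually asks for. Because the same input always yields the same masked value, masked columns can still be used to join datasets.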

How have data and analytics changed?

Big data is going away; it's just data. DataOps is replacing big data. There are different uses with AI/ML, along with the democratization of data and analytics. Enterprises are getting their arms around their data and finding more value in it.

Successful companies are putting processes in place so the use of data is agile, autonomous, self-monitoring, and self-healing. This is necessary to survive in a data-driven economy. It results in more use and more applications of data, which solve more business problems, such as reducing churn.

What do developers need to think about with regards to data and analytics today?

When you think of analytics, there’s a lot more experimentation with ML models than in developing applications. Failure is okay. You need to be able to experiment and learn. Focus on creating a minimal valuable product to show value. See if data solves the business problem before building it out. Be agile in working with data.

When you productize data, you have to monitor, check for drift, and adjust. ML models are trained much like people are: they learn from what they have seen, so when things change, the models are affected. Check for drift, and have the infrastructure do it rather than relying on manual checks. When you productize and operationalize data, developers need to build a framework that tracks model drift as things change and determines how well the models are working. If analytics are the product, you will need DataOps to deliver.
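As a rough illustration of an automated drift check, the sketch below compares the mean of a live feature against its training baseline, measured in baseline standard deviations. This is one simple heuristic picked for brevity, not a method named in the interview; the function names, the sample values, and the threshold of 2.0 are assumptions, and production systems often use richer tests over full distributions.

```python
from statistics import mean, stdev


def drift_score(baseline, live):
    """Shift of the live mean, in units of baseline standard deviation."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variance; cannot scale the shift")
    return abs(mean(live) - mean(baseline)) / spread


def has_drifted(baseline, live, threshold=2.0):
    # Assumption: 2.0 is an arbitrary illustrative threshold, tune per feature.
    return drift_score(baseline, live) > threshold


# Hypothetical feature values seen at training time and in production.
baseline = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0]
stable = [10.2, 9.8, 10.1, 9.9]
shifted = [14.0, 15.0, 13.5, 14.5]

print("stable drifted:", has_drifted(baseline, stable))
print("shifted drifted:", has_drifted(baseline, shifted))
```

A scheduler or pipeline step could run a check like this on every batch and raise an alert or trigger retraining when it fires, which is what it means to have the infrastructure do the checking.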


Opinions expressed by DZone contributors are their own.
