An Overview of Data Engineering for Product Experimentation

This article focuses on data engineering for supporting product experimentation, which is rapidly becoming a necessary core competency.

by Karthik Sriram Chandrashekar · Dec. 23, 22 · Review

Data engineering is a broad field, and the term is often used as a catch-all for many different kinds of work. Anything that involves the ingestion, storage, processing, or serving of data can constitute data engineering, and the nature of the work also varies meaningfully with the domain of the data. In this article, we focus specifically on data engineering for supporting product experimentation, which is rapidly becoming a necessary core competency for all organizations that aim to be data-driven.

Simply put, experimentation data engineering is the process of designing, building, and maintaining systems and infrastructure for collecting, storing, and analyzing data from experiments.

Broad Components of an Experimentation Platform.

The image above details the high-level components that are part of any mature Experimentation Platform. Each of these components generates data that needs to be ingested and managed effectively by the experimentation data engineering function. 

What's Unique About Experimentation Data Engineering?

Experimentation Spans Multiple Domains 

Data engineering teams can do their best work when they understand the domain of their stakeholders and anticipate their needs effectively. 

In companies with a strong experimentation culture, experimentation is leveraged for all aspects of the business:

  1. Non-member / not-yet-customer conversion or acquisition experiments.
  2. Customer engagement and retention experiments.
  3. Algorithm experiments.
  4. Outbound marketing experiments.
  5. New partner or payment integration experiments.
  6. New business model experiments.

Each of these types of experiments has its own challenges, since they focus on very different domains with very different sets of stakeholders. Further, the complexity and velocity of experimentation can vary significantly between them, requiring different operational support models. The publication "Online Controlled Experiments and A/B Tests" gives an excellent overview of online experimentation for readers who are interested in diving deeper.

Experimentation Data Has a Variety of Functional Stakeholders

Further, experimentation data needs to support many different types of analyses aimed at different functions in the organization:

  • Reporting/Business Intelligence Analyses: The ultimate goal of an experiment is to understand the impact of a product or infrastructure change on a business KPI. This analysis is eventually consumed by business stakeholders such as product managers and executives.
  • Operational/Diagnostic Analyses: Experiments, by definition, introduce new features through new code changes, tested against a "stable" production experience. Experimental data is therefore more likely to be affected by bugs or other issues, which increases the need for operational and diagnostic analyses to ensure the fidelity of the experiment. The lifecycle of each experiment also needs to be tracked with appropriate metadata. These analyses are typically performed by data scientists and engineers.
  • Scientific Analyses: Experiments are a method for performing causal inference on the effect of a change on a metric of interest. Causal inference is a scientific field of study that is becoming a high priority for organizations, much as machine learning has. Simple statistical hypothesis testing may be sufficient for most basic experiments, but more complex techniques such as CUPED (Controlled-experiment Using Pre-Experiment Data) and other model-driven causal effect estimation methods are increasingly being applied to experimental data. This requires significantly stronger data quality guarantees and new data system architectures to compute these statistics. Because this is an area of active research, experimentation data also needs to be flexible enough to support a lot of ad hoc analysis. The key stakeholders here are scientists and statisticians.
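To make the CUPED idea concrete, here is a minimal sketch in Python using only the standard library. It assumes each user has a pre-experiment value of the same metric available as a covariate, and it runs on simulated data; the function names (`cuped_theta`, `cuped_adjust`, `simulate`, `diff_se`) and all numbers are illustrative assumptions, not part of any particular platform.

```python
import math
import random
from statistics import mean, variance


def cuped_theta(pre, post):
    """Estimate theta = cov(pre, post) / var(pre) on pooled data from all arms."""
    pre_bar, post_bar = mean(pre), mean(post)
    cov = sum((x - pre_bar) * (y - post_bar) for x, y in zip(pre, post)) / (len(pre) - 1)
    return cov / variance(pre)


def cuped_adjust(pre, post, theta, pre_mean):
    """Return the adjusted metric y - theta * (x - mean(x)) for each user."""
    return [y - theta * (x - pre_mean) for x, y in zip(pre, post)]


def diff_se(a, b):
    """Standard error of the difference in means between two independent samples."""
    return math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def simulate(n, lift, rng):
    """Hypothetical data: the in-experiment metric is correlated with the pre-period metric."""
    pre = [rng.gauss(10.0, 2.0) for _ in range(n)]
    post = [x + rng.gauss(0.0, 1.0) + lift for x in pre]
    return pre, post


rng = random.Random(7)
control_pre, control_post = simulate(2000, 0.0, rng)
treat_pre, treat_post = simulate(2000, 0.2, rng)

# theta and the covariate mean are computed once, on the pooled data.
pooled_pre = control_pre + treat_pre
pooled_post = control_post + treat_post
theta = cuped_theta(pooled_pre, pooled_post)
pre_mean = mean(pooled_pre)

control_adj = cuped_adjust(control_pre, control_post, theta, pre_mean)
treat_adj = cuped_adjust(treat_pre, treat_post, theta, pre_mean)

raw_effect = mean(treat_post) - mean(control_post)
cuped_effect = mean(treat_adj) - mean(control_adj)

print(f"raw effect   {raw_effect:.3f} (SE {diff_se(treat_post, control_post):.3f})")
print(f"CUPED effect {cuped_effect:.3f} (SE {diff_se(treat_adj, control_adj):.3f})")
```

The adjustment only helps when the pre-experiment covariate is correlated with the in-experiment metric. That is one reason such methods raise the bar for data quality: the pre-period and in-experiment values have to be joined reliably for every user.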

Experimentation Data Requires a Platform-Thinking Mindset 

Given the variety of stakeholders and use cases that experimentation data needs to support, experimentation data engineering teams need to think of themselves as creating a platform product if they are to truly scale and enable their organizations to become data-driven. That means focusing on the building blocks and capabilities that are core to any experimentation setting, and enabling the customers of the platform to mix, match, and extend them as necessary.
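One way to picture the building-block approach is a small registry of reusable metric definitions that any analysis can combine. The sketch below is purely illustrative: the `Metric` class, the `METRICS` registry, the `summarize` function, and the event fields (`variant`, `converted`, `revenue`) are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

Event = Dict[str, object]


@dataclass(frozen=True)
class Metric:
    """A reusable metric definition: a name plus a function over a list of events."""
    name: str
    compute: Callable[[List[Event]], float]


def conversion_rate(events: List[Event]) -> float:
    return mean(1.0 if e["converted"] else 0.0 for e in events)


def revenue_per_user(events: List[Event]) -> float:
    return mean(float(e["revenue"]) for e in events)


# The registry is the extension point: platform customers add their own metrics here.
METRICS = {
    m.name: m
    for m in (
        Metric("conversion_rate", conversion_rate),
        Metric("revenue_per_user", revenue_per_user),
    )
}


def summarize(events: List[Event], metric_names: List[str]) -> Dict[str, Dict[str, float]]:
    """Group events by variant and compute each requested metric per variant."""
    by_variant: Dict[str, List[Event]] = {}
    for e in events:
        by_variant.setdefault(str(e["variant"]), []).append(e)
    return {
        variant: {name: METRICS[name].compute(rows) for name in metric_names}
        for variant, rows in by_variant.items()
    }


# Hypothetical per-user records, one row per user.
events = [
    {"user_id": "u1", "variant": "control", "converted": False, "revenue": 0.0},
    {"user_id": "u2", "variant": "control", "converted": True, "revenue": 10.0},
    {"user_id": "u3", "variant": "treatment", "converted": True, "revenue": 12.0},
    {"user_id": "u4", "variant": "treatment", "converted": True, "revenue": 8.0},
]

report = summarize(events, ["conversion_rate", "revenue_per_user"])
print(report)
```

Because metrics are defined once and looked up by name, a reporting dashboard, a diagnostic job, and an ad hoc scientific analysis can all share the same definitions instead of re-implementing them.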

Recommendations for Creating a Strong Experimentation Data Engineering Team

  • Focus on self-service and enablement: Without this approach, experimentation data engineering teams will likely start drowning in support requests.
  • Invest in foundational data quality tooling and processes: Errors or inconsistencies in the data can significantly affect the validity and reliability of experiment results, and problems compound if they are not fixed early.
  • Build strong relationships on all sides: This includes the software engineering teams that produce the data, the data science teams that consume it, and ultimately the product and business teams that make decisions based on the resulting recommendations. Treat every one of these partners as an equal stakeholder and build proactive relationships. Data engineering teams often treat only data scientists as their stakeholders, which may not always be sufficient.
  • Always think in terms of building blocks, reusability, and APIs.
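As a small illustration of foundational data quality tooling, the sketch below runs two basic checks over an experiment assignment log: rows with missing required fields, and users who appear in more than one variant of the same experiment. The log format, field names, and function names are assumptions made for this example, not a prescribed schema.

```python
from collections import defaultdict

REQUIRED_FIELDS = ("experiment_id", "user_id", "variant")


def find_incomplete_rows(assignments, required=REQUIRED_FIELDS):
    """Return the indexes of rows where any required field is missing or empty."""
    return [
        i
        for i, row in enumerate(assignments)
        if any(row.get(field) in (None, "") for field in required)
    ]


def find_multi_variant_users(assignments, required=REQUIRED_FIELDS):
    """Return (experiment_id, user_id) pairs assigned to more than one variant."""
    seen = defaultdict(set)
    for row in assignments:
        # Incomplete rows are reported by find_incomplete_rows, so skip them here.
        if any(row.get(field) in (None, "") for field in required):
            continue
        seen[(row["experiment_id"], row["user_id"])].add(row["variant"])
    return sorted(key for key, variants in seen.items() if len(variants) > 1)


# Hypothetical assignment log with two deliberate problems.
assignments = [
    {"experiment_id": "exp_1", "user_id": "u1", "variant": "control"},
    {"experiment_id": "exp_1", "user_id": "u2", "variant": "treatment"},
    {"experiment_id": "exp_1", "user_id": "u2", "variant": "control"},
    {"experiment_id": "exp_1", "user_id": "u3", "variant": None},
]

incomplete = find_incomplete_rows(assignments)
multi = find_multi_variant_users(assignments)
print("incomplete rows:", incomplete)
print("users in multiple variants:", multi)
```

Checks like these are cheap to run on every load. Exposing them as shared functions rather than one-off queries also fits the self-service and building-block recommendations above.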

Conclusion

The field of data engineering and the practice of experimentation as a technical capability are both rapidly evolving. It is clear that experimentation is a crucial aspect of data management for all organizations, alongside business intelligence, reporting, and machine learning. Accordingly, we are seeing a rapid increase in the number of companies built around providing easier experimentation capabilities as a service, and the experimentation platform is emerging as a core infrastructural component for technology companies.
