Beyond DORA: Building a Holistic Framework for Engineering Metrics

Engineering teams face pressure to move faster. Learn why traditional metrics like velocity and story points can distort progress and hinder real results.

Dmitry Marinov

Sep. 10, 25 · Opinion

My journey in the technical field has taken me from hands-on software engineering to the CTO’s role. In my monthly and quarterly routines at my current position, I regularly evaluate the efficiency of contributors: engineers, designers, QA, DevOps, and cross-functional teams overall. Over time, I’ve come to a clear conclusion: traditional engineering metrics like velocity, story points, or lines of code often fail to capture the bigger picture. They are not inherently bad, but they can push teams toward the wrong results when misapplied; their value depends entirely on how we use them.

These metrics only make sense when framed against real outcomes, such as customer value delivered, time-to-market improvements, system stability, or cost efficiency. They can be incredibly insightful when used tactically to diagnose patterns, identify bottlenecks, or track improvements within a team. However, as strategic indicators, they can often mislead and derail the process.

Take velocity, for instance. It can help forecast sprint scope or detect delivery issues. But the moment we use it as the only metric to judge productivity, we open the door to inflated estimates, reduced collaboration, and cargo cult Agile. I’ve seen teams very busy optimizing their charts while the product stagnates.

The same goes for lines of code. While the count is useful for code churn analysis or for detecting abnormal spikes, it says nothing about clarity, maintainability, or business impact. A brilliant solution in 10 lines is usually more valuable than a workaround in 1,000.

We still need these metrics, but as instruments, not as compasses. The real compass is always the outcome and the questions we need to ask ourselves: Are we delivering value? Are we better, faster, safer, or more cost-effective than we were a quarter ago? Metrics should help us ask better questions at the tactical level, not distract us from strategic goals.

OST Impact Model

The question of metrics that bring more value is complex. I believe no single framework can fit the task perfectly, but a thoughtful combination of them, applied in a real context, can bring more clarity and help achieve long-term results.

I do rely on DORA (DevOps Research and Assessment) metrics, but again, with some adjustments. DORA is a powerful and research-backed framework, especially for evaluating DevOps performance through metrics like Deployment Frequency and Lead Time for Changes. However, it is narrow by design: highly effective for measuring operational excellence, but not sufficient for capturing engineering impact at a broader product or business level.
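
To make the two DORA metrics named above concrete, here is a minimal Python sketch of how they might be computed from deployment records. The record layout (one commit timestamp and one deploy timestamp per change), the function names, and the sample data are assumptions for illustration; real pipelines usually pull this from CI/CD and version control tooling.

```python
from datetime import datetime
from statistics import median

def deployment_frequency(changes, window_days):
    """Average production deployments per day over the observation window.

    Assumes one deployment per recorded change (a simplification).
    """
    return len(changes) / window_days

def median_lead_time_hours(changes):
    """Median hours between a commit and its production deployment."""
    return median(
        (deployed - committed).total_seconds() / 3600
        for committed, deployed in changes
    )

# Hypothetical sample: (commit time, production deploy time) for each change.
changes = [
    (datetime(2025, 9, 1, 9, 0), datetime(2025, 9, 1, 15, 0)),   # 6 hours
    (datetime(2025, 9, 2, 10, 0), datetime(2025, 9, 3, 10, 0)),  # 24 hours
    (datetime(2025, 9, 4, 8, 0), datetime(2025, 9, 4, 20, 0)),   # 12 hours
]

print(round(deployment_frequency(changes, window_days=7), 2))  # 0.43
print(median_lead_time_hours(changes))                         # 12.0
```

The median is used here rather than the mean so that a single slow change does not dominate the picture; that is a design choice of this sketch, not a requirement.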

For a more holistic view, I often supplement tactical metrics, such as DORA, with principles from the SPACE framework, which looks at developer and team satisfaction and well-being, collaboration, and flow efficiency, among other dimensions. These dimensions help balance technical metrics with human and systemic ones, surfacing early signs of burnout, silos, or friction.

Ultimately, I recommend looking at engineering performance on three levels. I call this the OST Impact Model, a framework that supports what I refer to as Outcome-Driven Development — engineering led by impact. These levels include:

1. Outcome level — customer impact, ROI, and time-to-market. Question to ask ourselves, “Are we delivering the things that drive business value?”

Within this level, we track metrics that reflect whether we are delivering the right things — those that drive business value, ROI, and predictability. To connect outcomes with execution, tactical financial metrics should be tracked:

Total Story Points completed — as a measure of delivered output,
Story Point Cost Efficiency — comparing nominal and effective cost per SP,
ETA Accuracy — tracking how close our delivery is to the original estimates,
Cost per Story Point — to evaluate if we’re shipping value in a financially sustainable way,
Total Task Cost — the real cost of engineering work tied to initiatives,
Salary Paid vs. Output — monitoring whether the team's spending matches the delivery impact,
Dev/QA/Analyst Cost Distribution (%) — ensuring cost is allocated effectively across functions.
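
As a rough illustration of two items from this list, the sketch below computes Cost per Story Point and the Dev/QA/Analyst Cost Distribution. The formulas (total cost divided by completed points, and each role's share of total cost), the function names, and the figures are assumptions made for the example; the exact definitions will depend on how your organization accounts for cost.

```python
def cost_per_story_point(total_cost, story_points):
    """Total engineering cost divided by completed story points."""
    if story_points <= 0:
        raise ValueError("story_points must be positive")
    return total_cost / story_points

def cost_distribution(cost_by_role):
    """Percentage of total cost attributed to each role."""
    total = sum(cost_by_role.values())
    if total <= 0:
        raise ValueError("total cost must be positive")
    return {role: round(100 * cost / total, 1) for role, cost in cost_by_role.items()}

# Hypothetical quarter: cost per role and story points completed.
cost_by_role = {"dev": 60_000, "qa": 25_000, "analyst": 15_000}
completed_points = 80

print(cost_per_story_point(sum(cost_by_role.values()), completed_points))  # 1250.0
print(cost_distribution(cost_by_role))  # {'dev': 60.0, 'qa': 25.0, 'analyst': 15.0}
```

Numbers like these are most useful as a trend across quarters, not as an absolute target for any single sprint.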

2. System level — operational excellence, CI/CD performance, and release health. Question to ask ourselves, “Are we operating efficiently?”

Here, DORA fits perfectly alongside the architectural metrics and incident response data.

Velocity Trends — to detect delivery slowdowns, accelerations, or instability,
Tasks Completed — measuring throughput and pace of execution,
Calendar Time per Task — identifying delays, bottlenecks, and inefficiencies,
Hours per Story Point — to uncover hidden overhead or planning bias,
SP Estimate Accuracy — ensuring estimates are grounded in historical reality,
Average Task Cost — as a baseline for budgeting and project ROI analysis,
SP Dev Cost and Dev Cost per SP — indicators of engineering cost efficiency,
QA/Analyst/Review Total Costs — understanding how much we spend on quality and verification,
System Status Time Distribution — analysis of how long tasks spend in each state (e.g., blocked, in progress, ready for release).
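
The status time distribution in the last item can be sketched from a task's transition history. The example below assumes each task has a time-ordered list of (timestamp, state) transitions plus a closing timestamp; that data shape, the state names, and the function name are hypothetical and would need to be mapped to whatever your tracker exports.

```python
from collections import defaultdict
from datetime import datetime

def time_in_states(transitions, closed_at):
    """Hours spent in each state.

    transitions: list of (timestamp, state), sorted by timestamp.
    closed_at:   timestamp at which the last state ended.
    """
    hours = defaultdict(float)
    # Pair each transition with the start of the next one (or the close time).
    ends = [ts for ts, _ in transitions[1:]] + [closed_at]
    for (start, state), end in zip(transitions, ends):
        hours[state] += (end - start).total_seconds() / 3600
    return dict(hours)

# Hypothetical task history.
transitions = [
    (datetime(2025, 9, 1, 9, 0), "in progress"),
    (datetime(2025, 9, 1, 17, 0), "blocked"),
    (datetime(2025, 9, 2, 17, 0), "in progress"),
    (datetime(2025, 9, 3, 9, 0), "ready for release"),
]
closed_at = datetime(2025, 9, 3, 13, 0)

print(time_in_states(transitions, closed_at))
# {'in progress': 24.0, 'blocked': 24.0, 'ready for release': 4.0}
```

In this made-up history the task spent as long blocked as it did in progress, which is the kind of pattern this metric is meant to surface. Note that the sketch counts calendar hours, including nights and weekends.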

3. Team level — team’s sentiment, their satisfaction with the job, and overall condition. Question to ask ourselves, “Are our engineers set up for success?”

At this stage, SPACE, team health surveys, and engagement metrics come in.

Story Points per Assignee — evaluating individual contribution patterns,
Tasks per Assignee — measuring workload balance across the team,
Working Hours per Task — highlighting cognitive load or potential burnout,
Average Calendar Time (TTM) — helping surface dependencies or systemic delays,
Task Type Breakdown (e.g., bugs, tech debt, planning) — understanding where effort and budget are going,
SP Cost per Role (Dev, QA, Analyst) — assessing cost-efficiency at the role level,
Top Contributor Analysis — identifying both overdependence and underutilization,
Team Time in Workflow States — spotting where collaboration breaks down,
SP Estimate Accuracy — ensuring estimates are grounded in historical reality.
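
Story Points per Assignee and Top Contributor Analysis can be approximated with a simple aggregation. The sketch below is a minimal example under the assumption that completed tasks are available as (assignee, story points) pairs; the names and numbers are invented, and the share it reports is a signal of concentration, not a ranking of people.

```python
from collections import Counter

def story_points_per_assignee(tasks):
    """Sum completed story points for each assignee."""
    totals = Counter()
    for assignee, points in tasks:
        totals[assignee] += points
    return totals

def top_contributor_share(totals):
    """Return the top assignee and their fraction of all completed points."""
    total = sum(totals.values())
    if total == 0:
        raise ValueError("no story points recorded")
    name, points = totals.most_common(1)[0]
    return name, points / total

# Hypothetical sprint data: (assignee, story points) per completed task.
tasks = [("ana", 8), ("ben", 3), ("ana", 5), ("chen", 2), ("ana", 2)]

totals = story_points_per_assignee(tasks)
print(dict(totals))                   # {'ana': 15, 'ben': 3, 'chen': 2}
print(top_contributor_share(totals))  # ('ana', 0.75)
```

A share this high suggests overdependence on one person, which is worth a conversation about knowledge sharing and workload rather than a conclusion about individual performance.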

It’s important to remember that no metric matters in isolation. The real value comes when you link tactical metrics like DORA to strategic outcomes. This creates a feedback loop between engineering work and business impact. I believe that’s where true engineering leadership lives.

The Most Common Issues: Fixing ETA Failures

Speaking from experience, one of the most persistent issues I've faced is the problem of missed ETAs (Estimated Time of Arrival). Teams consistently underestimated delivery timelines by 30-60%, breaking trust and crushing morale.

To address this, I usually implement a diagnostic approach that many in the Agile community would call an “anti-pattern”. In one case, I conducted a short-term analysis to calculate a baseline hours-per-story-point ratio for my engineers. This was not about micromanagement, but rather about establishing a reality-based conversion factor. When a team provided an estimate in points, I could apply this factor to produce a more predictable timeline.
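
The conversion factor described above can be sketched in a few lines. This is a minimal illustration that assumes the baseline is computed as total actual hours divided by total story points over a sample of completed tasks; the function names and the sample history are hypothetical.

```python
def hours_per_story_point(history):
    """Baseline ratio: total actual hours / total story points.

    history: list of (story_points, actual_hours) for completed tasks.
    """
    total_points = sum(points for points, _ in history)
    total_hours = sum(hours for _, hours in history)
    if total_points <= 0:
        raise ValueError("history must contain positive story points")
    return total_hours / total_points

def forecast_hours(estimate_points, factor):
    """Translate a story point estimate into expected hours."""
    return estimate_points * factor

# Hypothetical completed tasks: (story points, actual hours).
history = [(3, 14), (5, 26), (2, 10), (8, 40)]

factor = hours_per_story_point(history)
print(factor)                      # 5.0
print(forecast_hours(13, factor))  # 65.0
```

Dividing the totals, instead of averaging per-task ratios, weights larger tasks more heavily. That is one possible choice; either way, the factor is only as good as the sample it comes from and should be recalculated as the team and codebase change.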

The resulting stability allowed us to diagnose the root cause: crippling cognitive load. The estimation problem was merely a symptom. Using the newfound predictability, we were able to refactor the organization, aligning teams with smaller, more focused "team-sized" software boundaries. The results were immediate: we moved into a delivery window of 80-120% ETA accuracy.
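
One way to check a delivery window like this is sketched below. It assumes that ETA accuracy means actual delivery time as a percentage of the original estimate, so that 100% is exactly on time and 120% is 20% late; that interpretation, the function names, and the sample deliveries are assumptions for illustration.

```python
def eta_accuracy(estimated_days, actual_days):
    """Actual delivery time as a percentage of the original estimate."""
    if estimated_days <= 0:
        raise ValueError("estimated_days must be positive")
    return 100 * actual_days / estimated_days

def share_within_window(deliveries, low=80, high=120):
    """Fraction of deliveries whose ETA accuracy falls inside [low, high]."""
    if not deliveries:
        raise ValueError("deliveries must not be empty")
    hits = sum(
        1 for estimated, actual in deliveries
        if low <= eta_accuracy(estimated, actual) <= high
    )
    return hits / len(deliveries)

# Hypothetical deliveries: (estimated days, actual days).
deliveries = [(10, 11), (5, 8), (20, 18), (8, 8)]

print([eta_accuracy(e, a) for e, a in deliveries])  # [110.0, 160.0, 90.0, 100.0]
print(share_within_window(deliveries))              # 0.75
```

Tracking the share of deliveries inside the window over time shows whether predictability is actually improving, not just whether the average happens to look good.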

In Conclusion: Metrics Are Tools, Not Truths

Engineering performance can’t be captured by a single number. Metrics are not answers — they are questions in disguise. The goal is not to find the perfect metric, but to ask better questions.

Instead of chasing clean dashboards, focus on clarity. Use frameworks like OST to connect what your team does (systems), how they work (team health), and why it matters (outcomes). Learn from your mistakes and iterate. Let engineering performance become a dialogue, not a verdict. Listen to your team. Create space for pushback and insights from the ground up. Argue, but with data. Disagree, but with context.

You won’t always be right, and your team won’t always agree with you. That’s fine. At the C-level, consensus isn’t the goal, but clarity is. Use frameworks like OST to align outcomes, systems, and teams, and you’ll not only modernize your performance assessment — you’ll evolve how your organization thinks about engineering itself.

Don’t chase perfect metrics — chase better questions.

Opinions expressed by DZone contributors are their own.
