
What Are DORA Metrics? A Practical Guide for Engineering Teams

March 9, 2026 · 12 min read · Gitmore Team

Your team ships fast. But is it shipping well? Deployment frequency alone doesn't answer that. Neither does change failure rate on its own. You need both speed and stability measured together to get the real picture.

That's what DORA metrics do. They're four (now five) metrics backed by a decade of research across 39,000+ professionals, and they're the closest thing the industry has to a standard for measuring software delivery performance.

This guide covers what each metric means, how to calculate them from your git data, the official benchmarks, common mistakes, and when DORA isn't enough on its own.


Where DORA Came From

DORA stands for DevOps Research and Assessment. It was co-founded in 2015 by Dr. Nicole Forsgren, Jez Humble, and Gene Kim. Their research started in 2014 with the annual State of DevOps Report (originally published with Puppet), and by 2018 they'd published Accelerate: The Science of Lean Software and DevOps, summarizing four years of findings.

Google Cloud acquired DORA in December 2018. The research has continued every year since, and the 2024 report marks the 10th anniversary with data from over 39,000 professionals globally. The book won the Shingo Research and Professional Publication Award in 2019.

One of DORA's core findings: speed and stability are not tradeoffs. Top-performing teams score well on all metrics. Low performers score poorly across the board. You don't have to choose between moving fast and keeping things stable.


The Four DORA Metrics (Plus the Fifth)

DORA groups its metrics into two dimensions: throughput (how fast you deliver) and stability (how reliable those deliveries are).

Throughput Metrics

1. Deployment Frequency (DF)

How often your team deploys code to production. This isn't commits or merges. It's production deployments specifically.

Example: A team deploying 5 times per day vs. once per month. Higher frequency usually means smaller changes, which are easier to debug when something breaks.

2. Change Lead Time (CLT)

The time from a developer's first commit on a branch to that code running in production. This covers coding, review, testing, and deployment.

Example: A developer's first commit lands at 9am and the change reaches production by 11am = a 2-hour lead time. If it takes 3 weeks, something in the pipeline needs attention.

Stability Metrics

3. Change Failure Rate (CFR)

The percentage of deployments that require an immediate rollback, hotfix, or patch. Not every bug counts. Only failures that need urgent intervention.

Example: 3 out of 30 deployments this month caused incidents = 10% CFR.

4. Failed Deployment Recovery Time (FDRT)

How long it takes to recover from a failed deployment. Previously called "Mean Time to Restore" (MTTR), it was renamed in 2023 to focus specifically on software-change failures rather than external outages.

Example: A deployment causes an outage at 2pm and service is restored by 3pm = 1-hour recovery time.

5. Deployment Rework Rate (added 2024)

The share of deployments that are unplanned fixes for problems discovered in production. This was split out from Change Failure Rate in the 2024 report to capture rework as a separate signal.


Official Performance Benchmarks

DORA uses cluster analysis across thousands of survey respondents each year to define performance tiers. These numbers shift annually (the 2022 report only detected 3 clusters, no Elite tier), but here are the widely referenced benchmarks:

Metric | Elite | High | Medium | Low
Deployment Frequency | Multiple per day | Daily to weekly | Weekly to monthly | Monthly to once every 6 months
Change Lead Time | < 1 day | 1 day to 1 week | 1 week to 1 month | 1 month to 6 months
Change Failure Rate | 5% | 10% | 15% | 64%
Recovery Time | < 1 hour | < 1 day | 1 day to 1 week | 1 month to 6 months

The gap between Medium and Low is dramatic. Low-performing teams have a 64% change failure rate vs. 5% for Elite. Their recovery time stretches from hours to months. These aren't small differences.


How to Calculate DORA Metrics from Git Data

You can derive DORA metrics from your existing git and CI/CD data. Here's how each one maps to data you already have.

Deployment Frequency

Count the number of successful deployments to production in a given period. If you use CI/CD pipelines, count pipeline runs that deploy to production. A common git-based proxy: count merged PRs to your main branch that trigger a deployment pipeline.

deployments_this_month = count(successful_deploys_to_production)
frequency = deployments_this_month / days_in_month
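
If you're on GitHub and don't yet record deployments anywhere, here's a minimal sketch of the merged-PRs-to-main proxy using the search API. It assumes merges to main map one-to-one to production deploys, that a GITHUB_TOKEN environment variable holds a token with read access, and the repo slug is a placeholder.

import os
from datetime import date, timedelta
import requests  # third-party: pip install requests

REPO = "your-org/your-repo"  # hypothetical repo slug
since = (date.today() - timedelta(days=30)).isoformat()

# Count PRs merged into main in the last 30 days as a deployment proxy
resp = requests.get(
    "https://api.github.com/search/issues",
    params={"q": f"repo:{REPO} is:pr is:merged base:main merged:>={since}"},
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
)
resp.raise_for_status()
deploys = resp.json()["total_count"]
print(f"{deploys} deploys in 30 days ≈ {deploys / 30:.2f} per day")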

Change Lead Time

Measure the time from the first commit on a branch (or PR creation) to when that code is deployed to production. You can pull this from the GitHub API using PR creation timestamps and deployment timestamps.

lead_time = deployment_timestamp - first_commit_timestamp
average_lead_time = sum(all_lead_times) / count(deployments)
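
As a rough sketch of the averaging step, assuming you've already exported a first-commit timestamp and a deploy timestamp for each change (the pairs below are made up):

from datetime import datetime

# Hypothetical export: (first_commit, deployed_to_production) per change
deployments = [
    (datetime(2026, 3, 2, 9, 0), datetime(2026, 3, 2, 11, 0)),   # 2 hours
    (datetime(2026, 3, 3, 14, 0), datetime(2026, 3, 5, 10, 0)),  # ~2 days
]

lead_times = [deployed - first_commit for first_commit, deployed in deployments]
avg_hours = sum(lt.total_seconds() for lt in lead_times) / len(lead_times) / 3600
print(f"Average lead time: {avg_hours:.1f} hours")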

Change Failure Rate

Divide the number of deployments that caused incidents by the total number of deployments. This requires tagging failed deployments in your CI/CD system or linking deployments to your incident tracking tool.

cfr = (failed_deployments / total_deployments) * 100
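
A minimal sketch, assuming each deployment record already carries a caused_incident flag set by your CI/CD or incident tooling:

# Hypothetical deployment records tagged by CI/CD or incident tooling
deployments = [
    {"id": 101, "caused_incident": False},
    {"id": 102, "caused_incident": True},
    {"id": 103, "caused_incident": False},
]

failed = sum(1 for d in deployments if d["caused_incident"])
cfr = failed / len(deployments) * 100
print(f"Change failure rate: {cfr:.0f}%")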

Recovery Time

Measure the elapsed time between a failed deployment and the next successful deployment on that service. You can also use incident creation to incident resolution timestamps from your incident management tool.

recovery_time = next_successful_deploy - failed_deploy_timestamp
average_recovery = sum(all_recovery_times) / count(failures)
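
Here's the pairing logic, sketched under the assumption that you have a time-ordered list of deploys per service with a status field:

from datetime import datetime, timedelta

# Hypothetical, time-ordered deploys for a single service
deploys = [
    {"at": datetime(2026, 3, 2, 14, 0), "status": "failed"},
    {"at": datetime(2026, 3, 2, 15, 0), "status": "success"},
    {"at": datetime(2026, 3, 4, 9, 0), "status": "failed"},
    {"at": datetime(2026, 3, 4, 13, 30), "status": "success"},
]

recoveries = []
pending_failure = None
for d in deploys:
    if d["status"] == "failed" and pending_failure is None:
        pending_failure = d["at"]                     # clock starts at the first failure
    elif d["status"] == "success" and pending_failure is not None:
        recoveries.append(d["at"] - pending_failure)  # recovered by the next good deploy
        pending_failure = None

average_recovery = sum(recoveries, timedelta()) / len(recoveries)
print(f"Average recovery time: {average_recovery}")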

Google's DORA team also maintains an open-source project called Four Keys that sets up a data ingestion pipeline from GitHub or GitLab through Google Cloud into a dashboard. It's a solid starting point if you want to build your own measurement system.


Why DORA Metrics Matter for Business Outcomes

DORA isn't just about engineering efficiency. The research found a direct link between software delivery performance and organizational performance.

  • Top performers are 2x more likely to meet or exceed their organizational goals, including profitability, market share, productivity, and customer count (source: Accelerate, based on 2014-2017 data)
  • Teams with superior documentation substantially outperformed peers in reliability and across all DORA metrics (2024 report)
  • Psychological safety and team autonomy emerged as strong predictors of software delivery performance (2024 report)
  • Organizations using balanced measurement strategies (DORA plus complementary metrics) reported 3-12% efficiency gains, 14% increases in R&D focus, and 15% improvements in developer engagement

10 Common Mistakes When Implementing DORA

1. Turning metrics into targets

Goodhart's Law: "When a measure becomes a target, it ceases to be a good measure." Saying "every team must deploy multiple times per day by Q4" incentivizes gaming, not improvement.

2. Gaming with trivial deployments

Pushing many small, inconsequential updates to inflate Deployment Frequency without delivering real value. The metric goes up but nothing improves.

3. Using one metric in isolation

Optimizing for deployment frequency while ignoring change failure rate means you're shipping bugs faster. The metrics are designed to work as a set.

4. Making unfair cross-team comparisons

Creating league tables that rank teams against each other ignores context. A web frontend team will naturally deploy more frequently than a mobile app team or a team working on embedded systems. Comparing them is misleading.

5. Rushing changes to improve Lead Time

Skipping code reviews or automated tests to get a lower lead time number produces buggy software. Speed at the expense of quality defeats the purpose.

6. Not measuring long enough before setting baselines

Teams need at least 2-3 months of data before deciding what "good" looks like for their specific context. A single sprint doesn't give you a reliable baseline.

7. Ignoring why metrics change

DORA shows what changed but not why. A spike in lead time might mean your pipeline is broken, or it might mean half the team was on holiday. External factors like absent code owners or company events skew results.

8. Applying DORA to teams with long release cycles

Teams that release monthly or quarterly will inherently show "low" deployment frequency. That doesn't mean they're performing poorly. The benchmarks assume continuous delivery pipelines.

9. Optimizing past diminishing returns

Going from 1 deployment per month to 1 per week is a big win. Going from 5 per day to 10 per day probably isn't. The framework doesn't tell you when to stop optimizing.

10. Not supplementing with other metrics

DORA measures pipeline performance but misses PR review time, first-response time on code reviews, work categorization (features vs. bugs vs. tech debt), and developer experience. You need additional signals to get the full picture.


What DORA Doesn't Measure (and Its Limitations)

DORA is useful but it has real blind spots. Knowing these prevents you from over-relying on four numbers to run your engineering org.

  • Value delivered: DORA tracks how fast code moves through a pipeline but not whether that code solves real customer problems
  • Invisible work: Mentoring, architecture decisions, technical debt reduction, and cross-team collaboration are not captured
  • Platform and infrastructure: Security, scalability, and maintainability aren't directly measured
  • Developer experience: How developers feel about their tools, processes, and work isn't part of DORA (that's where SPACE comes in)
  • Survey-based benchmarks: DORA's data comes from self-reported surveys, which introduces response bias. Teams may over- or under-report their actual performance

SPACE Framework: Filling the Gaps DORA Leaves Open

The SPACE framework was created in 2021 by Nicole Forsgren (yes, the same person who co-founded DORA), Margaret-Anne Storey, and colleagues at Microsoft Research. It was published in ACM Queue and expands the measurement picture across five dimensions:

Dimension | What It Measures | Example Metrics
Satisfaction & Well-being | Developer fulfillment | Retention rates, tool satisfaction surveys
Performance | System and code outcomes | Defect rates, reliability, feature usage
Activity | Observable work output | Commits, PRs, deployments, incidents handled
Communication & Collaboration | How well teams work together | Onboarding speed, documentation quality, review participation
Efficiency & Flow | Workflow smoothness | Interruptions, handoffs, focus time, cycle time

DORA's metrics primarily cover the Activity and Performance dimensions. SPACE adds satisfaction, collaboration, and flow. The recommended approach is to start with DORA for delivery baselines, then layer in SPACE for a fuller picture of engineering health.

One practical guideline from the framework: pick at least 3 of the 5 dimensions, balance quantitative and survey data, and only report aggregated team-level results. Never use SPACE (or DORA) to evaluate individual developers.


Tools That Measure DORA Metrics

Several tools can automate DORA measurement. They differ primarily in how they define a "deployment" and how they link incidents to deployments.

Tool | Approach | Best For
Google Four Keys | Open-source, GitHub/GitLab to BigQuery | Teams that want full control
GitLab (built-in) | Native CI/CD pipeline analytics | GitLab-only teams
Sleuth | Deployment-centric event tracking | Accuracy-focused teams
LinearB | Git-centric with workflow automation | Teams that also want PR automation
Swarmia | DORA + SPACE + developer experience | Teams that want DORA plus DX surveys
Middleware | Open-source DORA platform | Budget-conscious teams on GitHub

For a broader comparison of tools that track git activity and team metrics, see our guide to git reporting tools.


How to Get Started with DORA

You don't need to buy an expensive platform to start measuring DORA. Here's a practical path:

  • Week 1-2: Pick one metric to start with. Deployment Frequency is the easiest because you can count merged PRs or CI/CD runs to production.
  • Week 3-4: Add Change Lead Time. Use PR creation and merge timestamps from your git platform. Most git providers expose this through their API (see the sketch after this list).
  • Month 2: Add Change Failure Rate and Recovery Time. These require linking your deployments to your incident tracking (PagerDuty, Opsgenie, or even a spreadsheet).
  • Month 3+: Establish baselines. Resist comparing to the DORA benchmarks immediately. Your context matters more than industry averages.
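
If your platform is GitHub, a rough sketch of pulling those PR timestamps via the REST API looks like the following. It assumes a GITHUB_TOKEN environment variable, uses merge time as the end of the window (swap in the deployment timestamp once you track it), and the repo slug is a placeholder.

import os
from datetime import datetime
import requests  # third-party: pip install requests

REPO = "your-org/your-repo"  # hypothetical repo slug

resp = requests.get(
    f"https://api.github.com/repos/{REPO}/pulls",
    params={"state": "closed", "base": "main", "per_page": 50},
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
)
resp.raise_for_status()

for pr in resp.json():
    if not pr["merged_at"]:
        continue  # closed without merging
    created = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
    merged = datetime.fromisoformat(pr["merged_at"].replace("Z", "+00:00"))
    print(f"PR #{pr['number']}: {merged - created} from open to merge")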

The goal isn't to reach "Elite" on every metric. It's to understand where your bottlenecks are and improve over time. A team that moves from Low to Medium performance has made a more meaningful improvement than an Elite team optimizing by 5%.


Where Gitmore Fits In

Gitmore doesn't calculate DORA metrics directly. What it does is give you automated visibility into the daily git activity that feeds into those metrics: what got deployed, what PRs merged, what work categories the team focused on, and where things are stuck.

Think of it this way: DORA tells you your lead time is 5 days. Gitmore's daily reports tell you why it's 5 days, because you can see that PRs sit in review for 3 of those days. The metrics and the narrative work together.

Try Gitmore free to get AI-generated team reports from your GitHub, GitLab, or Bitbucket repos. Two-minute setup, no credit card.
