Glossary

What Is Change Failure Rate?

A DORA metric that measures the percentage of production deployments that cause a failure requiring remediation (rollback, hotfix, or patch). Elite teams maintain a rate below 5%.


What it means

Change failure rate (CFR) is calculated by dividing the number of deployments that result in a production incident, rollback, or hotfix by the total number of deployments in a given period. A 'failure' includes any deployment that degrades service for users — outages, performance regressions, broken features, or data corruption that requires immediate action. It does not include deployments that have minor bugs caught later in normal triage. CFR is the primary stability metric in the DORA framework and is inversely correlated with deployment frequency: teams that deploy smaller changes more often tend to have lower failure rates. The DORA benchmarks are: Elite (0-5%), High (5-10%), Medium (10-15%), Low (above 15%). CFR is typically tracked via incident management systems linked to deployment records.
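The definition and tiers above can be sketched in a few lines of Python. This is a minimal illustration, not an official DORA tool: the function names `change_failure_rate` and `dora_tier` are hypothetical, and because the tier ranges share their boundary values (5%, 10%, 15%), the sketch assumes each tier includes its upper bound.

```python
def change_failure_rate(failed_deployments: int, total_deployments: int) -> float:
    """Return CFR as a percentage of all deployments in the period."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and total_deployments")
    return 100 * failed_deployments / total_deployments


def dora_tier(cfr_percent: float) -> str:
    """Map a CFR percentage to the tier names listed above."""
    if not 0 <= cfr_percent <= 100:
        raise ValueError("cfr_percent must be between 0 and 100")
    # Assumption: each tier includes its upper bound, so exactly 5% counts as Elite.
    if cfr_percent <= 5:
        return "Elite"
    if cfr_percent <= 10:
        return "High"
    if cfr_percent <= 15:
        return "Medium"
    return "Low"
```

For example, 7 failed deployments out of 80 gives `change_failure_rate(7, 80)`, which `dora_tier` places in the High tier. If your team treats the boundaries differently, adjust the comparisons and document the choice so the number stays comparable over time.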


Why Change Failure Rate Matters

Change failure rate is the metric that proves you can move fast without breaking things. CTOs use it to justify investment in CI/CD and testing infrastructure: 'We deploy 20 times per week with a 3% failure rate' is a powerful statement about engineering maturity. For engineering managers, a rising CFR signals systemic issues — insufficient test coverage, rushed code reviews, inadequate staging environments, or too much complexity being shipped at once. Tracking CFR over time also helps measure the impact of quality initiatives: if you invest in integration tests and CFR drops from 12% to 4%, that's a concrete return on investment.


How to measure

CFR = (deployments causing failures / total deployments) × 100. Define 'failure' consistently: typically any deployment that triggers a rollback, hotfix, or Sev-1/Sev-2 incident within 24-48 hours. Pull deployment data from your CI/CD system and incident data from your alerting, incident management, or issue-tracking tool (PagerDuty, Opsgenie, Linear, Jira). Link incidents to the deployment that caused them using timestamps or deployment IDs. Calculate CFR monthly or quarterly for reliable trends; weekly CFR is too noisy for most teams.
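One way to do the timestamp-based linking is sketched below in Python. It is an illustration under stated assumptions, not the export format of any particular tool: the `Deployment` and `Incident` records and both function names are hypothetical, each incident is attributed to the most recent deployment that shipped before it, and the incident only counts if it opened within a 48-hour window of that deployment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deployment:
    deploy_id: str
    deployed_at: datetime


@dataclass(frozen=True)
class Incident:
    incident_id: str
    opened_at: datetime


def failed_deployment_ids(deployments, incidents, window=timedelta(hours=48)):
    """Return the IDs of deployments blamed for at least one incident.

    Assumption: an incident belongs to the latest deployment that shipped
    at or before it opened, provided the gap is no longer than `window`.
    """
    ordered = sorted(deployments, key=lambda d: d.deployed_at)
    failed = set()
    for incident in incidents:
        # Deployments that were already live when the incident opened.
        earlier = [d for d in ordered if d.deployed_at <= incident.opened_at]
        if not earlier:
            continue  # incident predates every deployment in the period
        latest = earlier[-1]
        if incident.opened_at - latest.deployed_at <= window:
            failed.add(latest.deploy_id)
    return failed


def cfr_percent(deployments, incidents, window=timedelta(hours=48)):
    """CFR = failed deployments / total deployments x 100."""
    if not deployments:
        raise ValueError("no deployments in the period")
    failed = failed_deployment_ids(deployments, incidents, window)
    return 100 * len(failed) / len(deployments)
```

Using a set means a deployment that triggers several incidents is still counted once. Timestamp matching is only a heuristic: when your incident tool records the deployment ID directly, prefer that over the time window.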


Real-world example

A B2B SaaS company tracks 80 production deployments in March. Of those, 7 caused incidents: 3 required hotfixes (broken API endpoints), 2 were rolled back (database migration issues), and 2 caused performance degradation requiring config changes. Their CFR is 7/80 = 8.75%, putting them in the 'High' DORA tier. Analysis shows all 5 of the serious failures (hotfixes + rollbacks) came from PRs with no integration tests. They mandate integration tests for all database and API changes, and April's CFR drops to 3.75%.

Related

Related terms

DORA metrics, mean time to recovery, deployment frequency, rollback, incident management, test coverage, production stability
FAQ

Common questions

What counts as a 'failure' in change failure rate?

A failure is any deployment that requires immediate remediation: rollbacks, hotfixes, emergency patches, or configuration changes to restore service. Minor bugs that are fixed in the normal development cycle don't count. The key criterion is: did this deployment cause degraded service that required unplanned work to fix?

What is a good change failure rate?

DORA benchmarks: Elite teams have a CFR below 5%, High performers between 5% and 10%, Medium between 10% and 15%, and Low above 15%. Most teams should target below 10% as a first milestone. Getting below 5% typically requires strong automated testing, canary deployments, and feature flags.

How do you reduce change failure rate?

The most effective approaches: (1) Increase automated test coverage, especially integration tests for critical paths, (2) Use canary or blue-green deployments to catch issues before full rollout, (3) Ship smaller changes more frequently so failures are easier to isolate, (4) Implement feature flags to decouple deployment from release, (5) Add pre-deployment validation checks in your CI pipeline.

Can you have high deployment frequency and low change failure rate?

Yes. DORA research consistently shows that the two go together: elite teams have both the highest deployment frequency and the lowest change failure rate. This seems paradoxical but makes sense: smaller, more frequent deployments are easier to test, review, and roll back than large, infrequent releases.
