
Mean time to recovery

"In modern software products and services, which are rapidly changing complex systems, failure is inevitable, so the key question becomes: How quickly can service be restored?" (Accelerate: The Science of Lean Software and DevOps, IT Revolution Press)

Mean time to recovery, or MTTR, is calculated by adding up the total time spent resolving incidents during a given period and dividing it by the number of incidents in that period. The metric shows how quickly your organization resolves incidents, which gives you a sense of how resilient your system is.
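The calculation above can be sketched in a few lines. This is only an illustration, not how Vitals computes the value internally: the function name `mean_time_to_recovery` and the representation of an incident as a `(started_at, resolved_at)` pair are assumptions made for the example.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Return the average resolution time for a list of incidents.

    `incidents` is a list of (started_at, resolved_at) datetime pairs.
    This shape is an assumption for the example, not a Vitals data format.
    Returns None when there are no incidents, since the mean is undefined.
    """
    if not incidents:
        return None
    # Add up the time spent resolving every incident in the period...
    total = sum(
        (resolved_at - started_at for started_at, resolved_at in incidents),
        timedelta(),
    )
    # ...then divide by the number of incidents.
    return total / len(incidents)


# Hypothetical incidents: one resolved in 30 minutes, one in 90 minutes.
incidents = [
    (datetime(2022, 6, 1, 10, 0), datetime(2022, 6, 1, 10, 30)),
    (datetime(2022, 6, 8, 14, 0), datetime(2022, 6, 8, 15, 30)),
]
mttr = mean_time_to_recovery(incidents)
```

With these two sample incidents the total is 120 minutes across 2 incidents, so `mttr` is a `timedelta` of one hour.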

Mean time to recovery view under vitals

The MTTR section in Vitals also shows the number of incidents in the selected period and a breakdown by service for each service that had an incident.

The Evolution graph compares three main buckets:

  • Previous 3 months: the MTTR computed over the three months preceding the start of the selected period. For example, if the selected period starts in the middle of June, the months considered are March, April, and May.
  • Previous month: the MTTR computed over the month preceding the start of the selected period. For example, if the selected period starts in the middle of June, the month considered is May.
  • Last week: the MTTR computed over the last week before the end of the selected period.
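The three buckets can be expressed as date ranges. The sketch below is one possible reading of the descriptions above, not the Vitals implementation: it assumes the monthly buckets align to whole calendar months (as the June example suggests), that "last week" means the 7 days ending at the period end, and that each range is half-open (start included, end excluded). The names `comparison_windows` and `_shift_months` are hypothetical.

```python
from datetime import date, timedelta


def _shift_months(first_of_month, months_back):
    """Return the first day of the month `months_back` months earlier."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) - months_back
    return date(index // 12, index % 12 + 1, 1)


def comparison_windows(period_start, period_end):
    """Return (start, end) date pairs for the three Evolution buckets.

    Assumptions for this sketch: monthly buckets cover whole calendar
    months before the month in which the selected period starts, and
    "last week" is the 7 days ending at `period_end`.
    """
    month_start = period_start.replace(day=1)
    return {
        "previous_3_months": (_shift_months(month_start, 3), month_start),
        "previous_month": (_shift_months(month_start, 1), month_start),
        "last_week": (period_end - timedelta(days=7), period_end),
    }


# Selected period starting in the middle of June, as in the example above.
windows = comparison_windows(date(2022, 6, 15), date(2022, 6, 30))
```

For this period, `previous_3_months` runs from 1 March up to (but not including) 1 June, and `previous_month` from 1 May up to 1 June, matching the March, April, May and May examples in the list.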


Updated 06 Jul 2022