DORA Metrics via Events API

Calculate best-practice engineering efficiency KPIs directly from your dev toolchain

The book Accelerate by Nicole Forsgren, Jez Humble & Gene Kim provides a de-facto standard on how to measure development efficiency, supported by clear scientific findings. It identifies four key metrics that support software delivery performance, namely:

  • Lead Time for Changes - how long it takes to go from code committed to code successfully running in production
  • Deployment Frequency - how often your software team deploys changes to production
  • Mean Time to Recovery - how long it takes to resolve or roll back an error in production
  • Change Failure Rate - what percentage of changes to production (software releases and configuration changes) fail

With visibility into these well-established metrics, teams can balance speed metrics (Lead Time for Changes, Deployment Frequency) against stability metrics (Change Failure Rate, Mean Time to Recovery), so that increased speed does not come at the expense of quality.

Setup

Before you start, please enable the DORA Integration for VSM by opening Administration > Integrations and activating the DORA integration.

Activate the DORA Integration

Events REST Endpoint

LeanIX VSM provides an /events endpoint for receiving events. The events follow the CloudEvents format. A detailed description of the event structure can be found in the CloudEvents specification.

📘

Supported Cloud event formats

Currently, only the binary content mode of CloudEvents is supported. Therefore, all event context attributes have to be mapped to HTTP headers with the same name as the attribute, prefixed with ce-; for example, Time maps to ce-Time.
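As a rough illustration of this mapping, the following Python sketch turns a dictionary of context attributes into binary-mode HTTP headers. The helper name to_binary_headers is hypothetical and not part of any LeanIX client library; the attribute values are copied from the sample requests on this page.

```python
def to_binary_headers(attributes: dict) -> dict:
    """Prefix every context attribute name with "ce-" (e.g. time -> ce-time)."""
    # HTTP header names are case-insensitive, so lower case is used throughout.
    return {f"ce-{name.lower()}": str(value) for name, value in attributes.items()}


headers = to_binary_headers({
    "specversion": "1.0",
    "type": "net.leanix.valuestreams.change",
    "id": "6fc68dd0053c5e04bd29c322973a060a9d09d90a",
    "source": "order-history",
    "time": "2021-11-18T15:13:39.4589254Z",
})
for name, value in headers.items():
    print(f"{name}: {value}")
```

The event payload itself (the Data property) is not part of these headers; in binary content mode it travels as the HTTP request body.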

Software Artifacts are referenced via their VSM Event ID

The VSM Event ID is used to uniquely identify the Software Artifact referenced by an event. Hence, please store the identifier that is unique in the source system in this field (e.g., the GitHub repository name).
It can be found in the IDs section of the Software Artifact Fact Sheet, as shown below, and can be populated either programmatically via our APIs or manually.

VSM Event ID for Software Artifact Fact Sheets

Supported event types and format

Three types of events are currently supported and needed for generating the DORA metrics: change, release, and incident events.
After the events are received, they are augmented so that the four DORA metrics can be calculated. Please see the DORA metrics calculation section for details.

📘

Events OpenAPI specification

An OpenAPI Specification of the Events API is also provided.

Change event

A change event describes a change committed to a branch that will be released later. For some teams this is the main branch, for others it is the release branch. The most important properties of the change event are the source, the time, and the author.

🚧

We do not want to receive a change event for every commit across all branches of a code repository. Ideally, change events are sent only for commits that are later included in a release event. Changes that are not associated with a release are ignored for both the Lead Time for Changes and Deployment Frequency metrics.

Properties

| Property | Sample value | Description |
| --- | --- | --- |
| SpecVersion | 1.0 | The version of the CloudEvents specification that the event uses |
| Type | net.leanix.valuestreams.change | Type of event |
| Id | 6fc68dd0053c5e04bd29c322973a060a9d09d90a | The commit id of the change. The combination of id, source, and type shall be unique for each event |
| Source | order-history | Identifies the software artifact updated by this change event, referenced through the Fact Sheet's external ID VSM Event ID |
| Datacontenttype | application/json | Content type of the data value. For now, only application/json is accepted |
| Time | 2021-11-18T15:13:39.4589254Z | Timestamp of when the code was committed to the repository |
| Data | {"author":"[email protected]"} | A JSON object with an author property containing the email address of the committer. The commit timestamp is carried by the Time property |

Sample request

curl -v http://localhost:8080/events \
  -H "Ce-Specversion: 1.0" \
  -H "Ce-Type: net.leanix.valuestreams.change" \
  -H "Ce-Id: 6fc68dd0053c5e04bd29c322973a060a9d09d90a" \
  -H "Ce-Source: order-history" \
  -H "Ce-Time: 2021-11-18T15:13:39.4589254Z" \
  -H "Ce-Datacontenttype: application/json" \
  -H "Content-Type: application/json" \
  -d '{"author":"[email protected]"}'

Release event

A release event is sent whenever a release of a software artifact is successfully deployed to production. The release contains a list of all the committed changes.

Properties

| Property | Sample value | Description |
| --- | --- | --- |
| SpecVersion | 1.0 | The version of the CloudEvents specification that the event uses |
| Type | net.leanix.valuestreams.release | Type of event |
| Id | 6fc68dd0053c5e04bd29c322973a060a9d09d90a | The id of the release event. The combination of id, source, and type shall be unique for each event |
| Source | order-history | Identifies the software artifact deployed via this release event, referenced through the Fact Sheet's external ID VSM Event ID |
| Time | 2021-11-18T15:13:39.4589254Z | Timestamp of when the release was deployed |
| Data | {"changeIds": ["6fc68dd0053c5e04bd29c322973a060a9d09d90a"]} | An array of commit ids that were received earlier as change events |

Sample request

curl -v http://localhost:8080/events \
  -H "Ce-Specversion: 1.0" \
  -H "Ce-Type: net.leanix.valuestreams.release" \
  -H "Ce-Id: 26f148fb55df0fd244623d98fdd07cc425b1b8d6" \
  -H "Ce-Source: order-history" \
  -H "Ce-Time: 2021-11-18T15:13:39.4589254Z" \
  -H "Ce-Datacontenttype: application/json" \
  -H "Content-Type: application/json" \
  -d '{"changeIds":["6fc68dd0053c5e04bd29c322973a060a9d09d90a"], "releaseTime": "2021-11-18T15:13:39.4589254Z"}'

Incident event

An incident event contains the affected software artifact and the creation and resolution dates of the incident.
An incident event is sent after an incident is resolved.

Properties

| Property | Sample value | Description |
| --- | --- | --- |
| SpecVersion | 1.0 | The version of the CloudEvents specification that the event uses |
| Type | net.leanix.valuestreams.incident | Type of event |
| Id | 6fc68dd0053c5e04bd29c322973a060a9d09d90a | The incident id. The combination of id, source, and type shall be unique for each event |
| Source | order-history | Identifies the software artifact related to the incident, referenced through the Fact Sheet's external ID VSM Event ID |
| Time | 2021-11-18T15:13:39.4589254Z | Timestamp of when the incident was resolved |
| Data | {"createdDate": "2021-11-18T15:13:39.000Z"} | Timestamp of when the incident was created |

Sample incident event

curl -v http://localhost:8080/events \
  -H "Ce-Specversion: 1.0" \
  -H "Ce-Type: net.leanix.valuestreams.incident" \
  -H "Ce-Id: 26f148fb55df0fd244623d98fdd07cc425b1b8d6" \
  -H "Ce-Source: order-history" \
  -H "Ce-Time: 2021-11-18T17:13:39.4589254Z" \
  -H "Ce-Datacontenttype: application/json" \
  -H "Content-Type: application/json" \
  -d '{"createdDate":"2021-11-18T15:13:39.000Z"}'

Authorization

Every event request shall include an Authorization header that carries an access_token as a bearer token.
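The following Python sketch shows one way such a request could be assembled with the standard library. It only builds the request object and does not send it. The function name build_event_request, the token placeholder, and the localhost URL (taken from the sample requests above) are assumptions for illustration; how the access token is obtained is not covered here.

```python
import json
import urllib.request


def build_event_request(base_url, access_token, ce_headers, data):
    """Build (but do not send) a POST request to the /events endpoint."""
    headers = dict(ce_headers)
    # The access token is passed as a bearer token in the Authorization header.
    headers["Authorization"] = f"Bearer {access_token}"
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=f"{base_url}/events",
        data=json.dumps(data).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_event_request(
    "http://localhost:8080",          # assumption: local endpoint as in the samples
    "YOUR_ACCESS_TOKEN",              # placeholder, not a real token
    {
        "Ce-Specversion": "1.0",
        "Ce-Type": "net.leanix.valuestreams.incident",
        "Ce-Id": "26f148fb55df0fd244623d98fdd07cc425b1b8d6",
        "Ce-Source": "order-history",
        "Ce-Time": "2021-11-18T17:13:39.4589254Z",
        "Ce-Datacontenttype": "application/json",
    },
    {"createdDate": "2021-11-18T15:13:39.000Z"},
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send the event; that step is left out so the sketch can be run without a reachable endpoint.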

Storing cloud events

All received cloud events are stored as "raw" events in the valuestreams database. At this point a first validation is performed to verify that the received events are valid against the defined format.
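The exact server-side validation rules are not spelled out on this page. As a client-side pre-check, a sketch along the following lines could catch obviously malformed events before they are sent. The required attributes and data fields are assumptions derived from the property tables above, and validate_event is a hypothetical helper.

```python
# Assumption: each supported event type needs the data field shown in its sample.
REQUIRED_DATA_FIELD = {
    "net.leanix.valuestreams.change": "author",
    "net.leanix.valuestreams.release": "changeIds",
    "net.leanix.valuestreams.incident": "createdDate",
}
REQUIRED_ATTRIBUTES = ("specversion", "type", "id", "source", "time")


def validate_event(attributes: dict, data: dict) -> list:
    """Return a list of problems; an empty list means the pre-check passed."""
    problems = [
        f"missing attribute: {name}"
        for name in REQUIRED_ATTRIBUTES
        if not attributes.get(name)
    ]
    event_type = attributes.get("type")
    if event_type and event_type not in REQUIRED_DATA_FIELD:
        problems.append(f"unsupported type: {event_type}")
    elif event_type and REQUIRED_DATA_FIELD[event_type] not in data:
        problems.append(f"missing data field: {REQUIRED_DATA_FIELD[event_type]}")
    return problems


event = {
    "specversion": "1.0",
    "type": "net.leanix.valuestreams.release",
    "id": "26f148fb55df0fd244623d98fdd07cc425b1b8d6",
    "source": "order-history",
    "time": "2021-11-18T15:13:39.4589254Z",
}
print(validate_event(event, {}))
```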

Monorepo Support

To leverage DORA metrics for monorepos, one Software Artifact is needed per monorepo component, e.g., a folder. Each of these Software Artifacts needs a unique vsmEventId. As for other Software Artifacts, the Change, Release, and Incident events shall then reference these vsmEventId values.

📘

Handling of Monorepos in other integrations

The GitHub Repository Integration for VSM provides support for monorepos based on manifest files. In CI/CD pipelines, support for monorepos can be added as well.

Let's assume the following example monorepo:

# Example-Monorepo
- API-Service
- Backend-Service
- Frontend-Application
- EventScheduler

For the DORA metrics, a dedicated Software Artifact is required for each folder in this monorepo. It is recommended to use the combination of the monorepo name and the folder name as the vsmEventId for these Software Artifacts: Example-Monorepo/API-Service, Example-Monorepo/Backend-Service, ....
For each change (that is, each commit merged into the main branch) in the corresponding folder of the monorepo, we expect to receive a Change event via the Events API.
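In a CI/CD pipeline, the mapping from a commit to the affected Software Artifacts could be derived from the paths changed by that commit. The following Python sketch assumes the folder layout of the example monorepo above; the function name and the sample file paths are hypothetical.

```python
from pathlib import PurePosixPath

MONOREPO = "Example-Monorepo"
COMPONENT_FOLDERS = {
    "API-Service",
    "Backend-Service",
    "Frontend-Application",
    "EventScheduler",
}


def vsm_event_ids_for_commit(changed_paths):
    """Return the vsmEventIds of all components touched by a commit."""
    ids = set()
    for path in changed_paths:
        parts = PurePosixPath(path).parts
        # Only the top-level folder decides which component a file belongs to.
        if parts and parts[0] in COMPONENT_FOLDERS:
            ids.add(f"{MONOREPO}/{parts[0]}")
    return sorted(ids)


print(vsm_event_ids_for_commit([
    "API-Service/src/handler.py",
    "API-Service/README.md",
    "EventScheduler/jobs/nightly.py",
    "README.md",
]))
```

One Change event would then be sent per returned vsmEventId, each using that value as its Source.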

📘

If there is only one release for the whole Git repository instead of one per folder, the Change and Release events need to be sent for all four Software Artifacts with every release, so that each Software Artifact is credited with the correct number of deployments for the respective DORA metric.
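A minimal Python sketch of this fan-out is shown below. It assumes that every commit of the repository-wide release is reported for every component, and it represents events as plain dictionaries rather than HTTP requests. The helper name, the release id, and the author address are made up for illustration.

```python
COMPONENT_IDS = [
    "Example-Monorepo/API-Service",
    "Example-Monorepo/Backend-Service",
    "Example-Monorepo/Frontend-Application",
    "Example-Monorepo/EventScheduler",
]


def fan_out_release(release_id, release_time, changes):
    """Create Change and Release events for every component of the monorepo.

    changes is a list of (commit_id, commit_time, author_email) tuples.
    """
    events = []
    for source in COMPONENT_IDS:
        for commit_id, commit_time, author in changes:
            events.append({
                "type": "net.leanix.valuestreams.change",
                "id": commit_id,
                "source": source,
                "time": commit_time,
                "data": {"author": author},
            })
        events.append({
            "type": "net.leanix.valuestreams.release",
            "id": release_id,
            "source": source,
            "time": release_time,
            "data": {"changeIds": [change[0] for change in changes]},
        })
    return events


events = fan_out_release(
    "release-1",
    "2021-12-20T09:00:00Z",
    [("6fc68dd0053c5e04bd29c322973a060a9d09d90a", "2021-12-20T08:00:00Z", "dev@example.com")],
)
print(len(events))
```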

DORA Metrics calculation and validation rules

Deployment Frequency

The deployment frequency metric tracks the frequency of deployments. New deployment frequency data points are populated whenever a net.leanix.valuestreams.release event is received.

In the default case, the released artifact is managed by just one team, and only one deployment frequency data point is populated, for the managing team.

If the released artifact is managed by more than one team, then one deployment frequency data point is populated for every team whose members committed a change in the release.
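The attribution rule can be sketched in Python as follows. This is an illustration of the two cases described above, not the actual implementation; the function name and the example.com addresses are hypothetical.

```python
def teams_credited_for_release(managing_teams, change_authors):
    """Return the teams that receive a deployment frequency data point.

    managing_teams maps a team name to the set of its members' email addresses.
    change_authors lists the author emails of the changes in the release.
    """
    if len(managing_teams) == 1:
        # Default case: a single managing team always gets the data point.
        return sorted(managing_teams)
    authors = set(change_authors)
    # Several managing teams: credit each team with a contributing member.
    return sorted(
        team for team, members in managing_teams.items() if members & authors
    )


teams = {
    "Blue": {"benjamin@example.com", "mathilde@example.com"},
    "Red": {"ralph@example.com", "mathilde@example.com"},
}
print(teams_credited_for_release(teams, ["benjamin@example.com"]))
print(teams_credited_for_release(teams, ["mathilde@example.com"]))
```

A release containing only commits by a member of one team credits that team alone, while a commit by a cross-team member credits every team the member belongs to.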

❗️

About commit email addresses

Team members are associated with teams through the commit email address of the changes included in the release. The accuracy of the Deployment Frequency metric improves if the commit email address can be matched to the email address of a team member on the team's Fact Sheet, via a subscription for that email address. Otherwise, partial deployments will be counted for every team that manages the deployed software artifact.

Lead Time for Changes (LTFC)

The LTFC is essentially how long it takes a team to go from 'code committed' to 'code successfully running in production'.

Every net.leanix.valuestreams.change event contains the commit id of a code change. During processing, it is associated with the corresponding software artifact via its Source property and afterwards stored as an augmented change event in the valuestreams database.

Later on, a net.leanix.valuestreams.release event is received. The release event contains a list of changeIds of previously received change events. Among the changeIds included in the release event, the oldest change is determined for each owning team. The LTFC metrics are then populated: the LTFC value equals the time period, in hours, between the time of that oldest change event and the time of the release event.

If a changeId can be associated with a team via the change author's email address, a single LTFC metric is populated; otherwise the event is ignored.
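The calculation described above can be sketched in Python like this. It is a simplified illustration under the stated rules, not the production algorithm; the function name, timestamps, and email addresses are made up.

```python
from datetime import datetime


def lead_time_hours(release_time, changes, team_members):
    """Return the LTFC in hours per team for one release.

    changes is a list of (commit_time, author_email) tuples.
    team_members maps a team name to the set of its members' email addresses.
    """
    oldest = {}
    for commit_time, author in changes:
        for team, members in team_members.items():
            # Changes whose author belongs to no team are ignored.
            if author in members and (team not in oldest or commit_time < oldest[team]):
                oldest[team] = commit_time
    return {
        team: (release_time - commit_time).total_seconds() / 3600
        for team, commit_time in oldest.items()
    }


teams = {"Blue": {"benjamin@example.com"}, "Red": {"ralph@example.com"}}
result = lead_time_hours(
    datetime(2021, 12, 20, 9, 0),
    [
        (datetime(2021, 12, 20, 8, 0), "benjamin@example.com"),
        (datetime(2021, 12, 20, 8, 30), "benjamin@example.com"),
    ],
    teams,
)
print(result)
```

Only the oldest change of each team counts, so the second commit at 08:30 does not influence the result.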

Validations

  • If multiple change events with the same commit (id), artifact (source), and time are received, they are ignored and reported in the logs.
  • If a release event contains changeIds that were included in a previous release event, the new release event is ignored as invalid and reported in the logs.
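The two rules above could be sketched in Python as follows. The sketch assumes that the first change event is kept and only later duplicates are dropped, and that changeIds are tracked per software artifact; neither detail is confirmed here. The class and method names are hypothetical.

```python
import logging


class EventValidator:
    """Tracks already seen events to apply the two validation rules."""

    def __init__(self):
        self.seen_changes = set()          # (commit id, source, time)
        self.released_change_ids = set()   # (source, commit id)

    def accept_change(self, commit_id, source, time):
        key = (commit_id, source, time)
        if key in self.seen_changes:
            logging.warning("duplicate change event ignored: %s", key)
            return False
        self.seen_changes.add(key)
        return True

    def accept_release(self, source, change_ids):
        keys = {(source, change_id) for change_id in change_ids}
        already_released = keys & self.released_change_ids
        if already_released:
            logging.warning("release ignored, changes already released: %s",
                            sorted(already_released))
            return False
        self.released_change_ids |= keys
        return True


validator = EventValidator()
print(validator.accept_change("c1", "order-history", "2021-11-18T15:13:39Z"))
print(validator.accept_change("c1", "order-history", "2021-11-18T15:13:39Z"))
print(validator.accept_release("order-history", ["c1"]))
print(validator.accept_release("order-history", ["c1", "c2"]))
```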

Change Failure Rate (CFR)

The change failure rate (CFR) is the percentage of deployments causing a failure in production.
This means that for a specific time period we need to know how many deployments are associated with incidents.

The CFR is calculated whenever an incident event is received, using a simplistic, time-based approach.
An incident is associated with the last release that was done before the incident was reported.
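A minimal Python sketch of this time-based association is shown below. It assumes that "reported" refers to the incident's creation time and that the rate is computed over a given list of releases; the function name and timestamps are illustrative only.

```python
from bisect import bisect_left
from datetime import datetime


def change_failure_rate(release_times, incident_created_times):
    """Return the percentage of releases with at least one associated incident."""
    ordered = sorted(release_times)
    if not ordered:
        return 0.0
    failed = set()
    for created in incident_created_times:
        # Number of releases strictly before the incident was created.
        earlier = bisect_left(ordered, created)
        if earlier > 0:
            failed.add(earlier - 1)  # index of the last release before the incident
    return 100.0 * len(failed) / len(ordered)


releases = [datetime(2021, 12, 20, 9, 0), datetime(2021, 12, 21, 9, 0)]
incidents = [datetime(2021, 12, 20, 10, 0)]
print(change_failure_rate(releases, incidents))
```

Several incidents that map to the same release count that release only once, since the metric is about the share of failed deployments rather than the number of incidents.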

Mean Time to Recovery (MTTR)

The MTTR measures how long it takes to recover a service after an incident has occurred. Whenever a new net.leanix.valuestreams.incident event is received, an MTTR metric is populated for every team that was involved in the release of the software artifact associated with the incident. The MTTR is computed as the duration between the creation and the resolution of an incident. If multiple incidents affecting the same team and software artifact are reported for a timeframe, the average of their durations is used.
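As a small worked illustration, the averaging can be sketched in Python as follows. The function name and the timestamps are made up; the incidents passed in are assumed to belong to the same team, software artifact, and timeframe.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average the durations of (created, resolved) incident pairs.

    Returns None when there are no incidents, i.e. no MTTR value exists.
    """
    if not incidents:
        return None
    total = sum((resolved - created for created, resolved in incidents), timedelta())
    return total / len(incidents)


mttr = mean_time_to_recovery([
    (datetime(2021, 12, 20, 10, 0), datetime(2021, 12, 20, 11, 0)),   # 60 minutes
    (datetime(2021, 12, 20, 14, 0), datetime(2021, 12, 20, 16, 0)),   # 120 minutes
])
print(mttr)
```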

Sample event log and DORA metrics results

Example teams

Let's assume that we have two teams in our VSM workspace, named Red and Blue. Team Red has two members, Ralph and Mathilde. Team Blue also has two members, Benjamin and, again, Mathilde. Both teams manage the same artifact, ms1. Note that Mathilde is a cross-team member.

| Team | Members | Managing artifacts |
| --- | --- | --- |
| Blue | Benjamin, Mathilde | ms1 |
| Red | Ralph, Mathilde | ms1 |

Example event log

Let's also assume we receive the following series of events in our API:

| Time | Event Type | Event Details |
| --- | --- | --- |
| Dec 20, 2021, 8:00am | Change | 1st change in branch B1 for Microservice ms1. Contributor: Benjamin, member of Team Blue |
| Dec 20, 2021, 8:30am | Change | 2nd change in branch B1 for Microservice ms1. Contributor: Benjamin, member of Team Blue |
| Dec 20, 2021, 9:00am | Release | Branch B1 for Microservice ms1 is released |
| Dec 20, 2021, 11:00am | Incident | 1st incident for Microservice ms1 resolved. Data: id: pd1, Source: ms1, created: 2021-12-20 10:00 |
| Dec 21, 2021, 8:00am | Change | 1st change in branch B3 for Microservice ms1. Contributor: Mathilde, member of both Teams Blue and Red |
| Dec 21, 2021, 9:00am | Release | Branch B3 for Microservice ms1 is released |

Resulting metrics

DORA metrics for team Blue

1. Deployment frequency for the 20th of December is 1. This one release contains two commits by Benjamin, a member of team Blue.
2. Deployment frequency for the 21st of December is also 1. This one release contains one commit by Mathilde, also a team Blue member.
3. Lead Time for Changes for the 20th of December is 1 hour. Note that only the oldest commit in the branch by Benjamin (committed at 08:00am) is taken into account when calculating this metric.
4. Lead Time for Changes for the 21st of December is 1 hour.
5. Change Failure Rate is 100% for the 20th of December. There is one deployment for team Blue and one incident linked to it.
6. Change Failure Rate is 0% for the 21st of December. No incidents were reported for this date.
7. Mean Time To Recover for the incident reported on the 20th of December is 60 minutes.
8. Benjamin and Mathilde are shown in the team members section.

DORA Metrics for team Red

1. Deployment frequency for the 21st of December is 1. This release contains the commit by cross-team member Mathilde.
2. Lead Time for Changes for the 21st of December is 1 hour.
3. Change Failure Rate is 0% since no incident is reported for the one release we had on the 21st of December.
4. MTTR stats are empty since no incident is reported.

