GitHub Repository

Out-of-the-box Source Code Repository Integration

Introduction

The LeanIX VSM GitHub Repository integration automatically creates and updates LeanIX Software Artifact Fact Sheets out of the box. It provides repository information that can be linked to Software Artifacts, so you can see which Software Artifacts are owned by which Teams. The integration also provides a Software Artifact portfolio view that lets everyone in the organization find the right Software Artifacts and see which Teams own them. VSM focuses on high-level repository information.


Integrate with GitHub to:

  • bring in repo data as Software Artifacts
  • augment your Software Artifacts with relations to owning teams and also used technologies

Setup

The GitHub repository connector scans your GitHub organization(s) and fetches all repositories that are not archived, along with related information such as Teams and the programming languages used. The integration can be set up in full self-service via the Admin UI.

🚧

Support of GitHub versions

This connector currently supports only the GitHub cloud-hosted versions (both Free and Enterprise).

This integration doesn't require installing a plug-in, as the Integration Hub handles the entire integration.

Configuration in GitHub

To integrate GitHub with LeanIX VSM, you need two inputs: 1) a Personal Access Token and 2) the name(s) of your GitHub organization(s). Here's how to get both:

1. Creating a Personal Access Token

  1. Please refer to the GitHub instructions
  2. Ensure you enable the following scopes:
    a) repo
    b) repo:status
    c) repo_deployment
    d) public_repo
    e) repo:invite
    f) security_events
    g) read:org

GitHub scopes to be set
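
After creating the token, it can be useful to confirm that it carries the scopes listed above. The sketch below is one hedged way to do that for a classic Personal Access Token: GitHub's REST API echoes a token's scopes in the `X-OAuth-Scopes` response header. The helper names (`parse_scopes`, `missing_scopes`, `fetch_scope_header`) are illustrative, and the `IMPLIED_BY` table encodes the assumption that a parent scope such as `repo` is reported without its child scopes.

```python
import urllib.request

# Scopes requested in the list above.
REQUIRED_SCOPES = {
    "repo", "repo:status", "repo_deployment", "public_repo",
    "repo:invite", "security_events", "read:org",
}

# Assumption: a parent scope implies its children, so GitHub may report
# only "repo" even though the repo sub-scopes are effectively granted.
IMPLIED_BY = {
    "repo": {"repo:status", "repo_deployment", "public_repo",
             "repo:invite", "security_events"},
    "admin:org": {"write:org", "read:org"},
    "write:org": {"read:org"},
}


def parse_scopes(header_value):
    """Split a header value such as 'repo, read:org' into a set of scopes."""
    return {s.strip() for s in (header_value or "").split(",") if s.strip()}


def missing_scopes(header_value, required=REQUIRED_SCOPES):
    """Return the required scopes that the token does not cover, sorted."""
    granted = parse_scopes(header_value)
    effective = set(granted)
    for scope in granted:
        effective |= IMPLIED_BY.get(scope, set())
    return sorted(set(required) - effective)


def fetch_scope_header(token):
    """Call the REST API once and return the X-OAuth-Scopes header value."""
    request = urllib.request.Request(
        "https://api.github.com/user",
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return response.headers.get("X-OAuth-Scopes", "")
```

A typical check would be `missing_scopes(fetch_scope_header(token))`; an empty list suggests the token covers every scope the integration asks for.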

📘

Authorizing for Single Sign On

If your GitHub organization requires single sign-on (SSO), you also need to authorize the token for SSO (see below) or follow the detailed instructions in the GitHub documentation.


Authorizing for organizations with SSO

2. Fetching GitHub organization(s)

To see the names of your managed organizations, follow these instructions
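
As an alternative to looking the names up in the GitHub UI, the organization names can also be read from the REST API. The sketch below assumes a token with the `read:org` scope and uses the `GET /user/orgs` endpoint, which lists the organizations of the authenticated user. The function names are illustrative, and only the first page of results is read.

```python
import json
import urllib.request


def org_logins(payload):
    """Extract the organization login names from a /user/orgs JSON payload."""
    return [org["login"] for org in json.loads(payload)]


def fetch_org_logins(token):
    """Fetch the first page of organizations visible to the token's user."""
    request = urllib.request.Request(
        "https://api.github.com/user/orgs",
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return org_logins(response.read().decode("utf-8"))
```

The `login` values returned here are the organization names to enter in the integration configuration.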


Configuration in VSM

In your LeanIX VSM workspace, go to Administration > VSM Integrations and follow the instructions to set up the GitHub integration.


GitHub Repository Integration setup in the Admin UI

📘

Sync Logging

Open the "Sync Logging" tab to follow the progress of your current integration run. Sync Logging also provides information on previous integration runs.


Options

Below are several ways to extend the base functionality of the GitHub repository integration to best tailor to your needs.

Monorepos

A monorepo is a software development strategy where code for many projects is stored in the same repository. The integration can detect monorepos out of the box. If the parent repository is archived, all child repositories are considered archived. The following guide shows how to mark a repository as a monorepo.

Step 1 | GitHub

  1. Add the sub-repository marker/manifest file to the respective directories. It can be any file; its name just needs to be consistent across all monorepos.

Note: We do not parse the contents of the file as part of the GitHub repository integration. It only serves as a marker to identify services inside a repository.

.
└── onlineshop
    ├── payment-service
    │   ├── lx-manifest.yaml
    │   └── src
    └── pricing-service
        ├── lx-manifest.yaml
        └── src

Step 2 | Value Stream Management Workspace

  1. Check "Search for Monorepos" checkbox in the integration configuration
  2. Enter the marker/manifest file name to identify sub-repositories (e.g. lx-manifest.yaml). Only one manifest file name is accepted to identify a sub-repository.

With the above setup, the integration detects two sub-repositories (payment-service, pricing-service) and one main repository (onlineshop).


👍

Sub-repositories detection

Sub-repositories are only detected down to level 2. For example:

.
└── onlineshop                 (level 1)
    └── payment-service        (level 2)
        ├── lx-manifest.yaml
        └── src

For a tangible example please find a sample monorepo here
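
To make the marker-file rule concrete, here is a minimal sketch of how sub-repositories could be identified from a list of file paths. It is not the integration's implementation: the function name `find_sub_repos` is hypothetical, paths are assumed to be relative to the repository root (level 1), and a sub-repository is taken to be a directory directly below the root (level 2) that contains the marker file.

```python
from pathlib import PurePosixPath


def find_sub_repos(paths, marker="lx-manifest.yaml"):
    """Return level-2 directories that contain the marker file, sorted."""
    found = set()
    for path in paths:
        parts = PurePosixPath(path).parts
        # "<directory>/<marker>" has exactly two parts relative to the root.
        if len(parts) == 2 and parts[1] == marker:
            found.add(parts[0])
    return sorted(found)


paths = [
    "payment-service/lx-manifest.yaml",
    "payment-service/src/main.py",
    "pricing-service/lx-manifest.yaml",
    "pricing-service/src/main.py",
    "tools/deep/nested/lx-manifest.yaml",  # deeper than level 2, ignored
]
print(find_sub_repos(paths))  # prints ['payment-service', 'pricing-service']
```

Because the file contents are never parsed, only the file name and its depth matter in this sketch.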

📘

Discovery of Monorepos via CI/CD

If you prefer to discover monorepos via your CI/CD build process, you may refer to this guide.

DORA Metrics

Software delivery metrics help teams track and monitor their productivity. The GitHub repository integration can automatically fetch information for two of the most important DORA metrics:

  1. Deployment frequency: Number of deployments to production for a Team.
  2. Lead time for changes: Time from committing a change to code successfully running in production.
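
As a rough illustration of the two definitions above, the sketch below computes both metrics from timestamps. How VSM aggregates the events is not described here, so the per-day frequency and the use of the median are assumptions, and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def deployment_frequency(release_times, window_days=30):
    """Average number of releases per day over the observation window."""
    return len(release_times) / window_days


def lead_time_for_changes(changes):
    """Median time from commit to release.

    `changes` is a list of (committed_at, released_at) datetime pairs.
    Returns None when there are no changes.
    """
    seconds = [(released - committed).total_seconds()
               for committed, released in changes]
    return timedelta(seconds=median(seconds)) if seconds else None
```

For example, three releases in a 30-day window give a frequency of 0.1 releases per day.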

Find below the mechanism by which the GitHub connector registers change and release events. Note that this (i.e. merge to default branch) is currently the only supported method of tracking these events.

.
└── Release Events on Software Artifact
    └── Collect all Pull Requests merged into the default branch in the last 30 days and
        register each merged Pull Request as a Release event
        └── Collect all commits on each merged Pull Request and register each commit as a
            Change Event along with commit author metadata
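
The mechanism in the diagram can be sketched as follows. This is an illustration under stated assumptions, not the connector's code: the `Commit` and `PullRequest` classes, the `collect_events` function, and the dictionary shape of the events are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Commit:
    sha: str
    author: str
    committed_at: datetime


@dataclass
class PullRequest:
    number: int
    base_branch: str
    merged_at: Optional[datetime]  # None while the PR is unmerged
    commits: List[Commit] = field(default_factory=list)


def collect_events(pull_requests, default_branch, now, window_days=30):
    """Derive release and change events from merged Pull Requests."""
    cutoff = now - timedelta(days=window_days)
    releases, changes = [], []
    for pr in pull_requests:
        merged_recently = pr.merged_at is not None and pr.merged_at >= cutoff
        if not merged_recently or pr.base_branch != default_branch:
            continue
        # Each merged Pull Request becomes one release event.
        releases.append({"type": "release", "pr": pr.number, "at": pr.merged_at})
        # Each commit on that Pull Request becomes one change event.
        for commit in pr.commits:
            changes.append({
                "type": "change",
                "pr": pr.number,
                "sha": commit.sha,
                "author": commit.author,
                "at": commit.committed_at,
            })
    return releases, changes
```

Pull Requests merged into other branches, unmerged Pull Requests, and merges older than the window produce no events in this sketch.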

For the additional logic that applies to monorepos, see the Monorepo support section below.

Follow this guide to automatically fetch Release & Change events from your GitHub repositories.

Prerequisites

  1. You have already discovered Software Artifacts (e.g. via the GitHub repository integration, K8s, etc.) in your VSM workspace. For DORA metrics on monorepos, the sub-repositories must already be discovered as well.

  2. You already have imported teams and have added team members as subscribers.


Adding team members as subscribers

This will then also populate your DORA Teams:


Team setup for DORA

  3. You already have your Software Artifacts linked to your Teams (see below picture). This is required because your VSM UI needs this relation to show the metrics on Team level.

Linking Teams and Software Artifacts

Monorepo support

The GitHub connector also supports fetching Release & Change events for monorepo structures. The sample monorepo setup below illustrates how the connector does so.

.
└── onlineshop                 (level 1)
    ├── payment-service        (level 2)
    │   ├── lx-manifest.yaml
    │   └── src
    └── login-service          (level 2)
        ├── lx-manifest.yaml
        └── src

Scenario 1: Merged Pull Request with commits to 1 sub-repo

Let's assume you have committed changes to the payment-service and have merged the PR to the default branch of the onlineshop monorepo. The GitHub connector will assign all commits made against the payment-service as change events to that Software Artifact, and it will register one release event for the PR merged to the default branch.

Scenario 2: Merged Pull Request with commits to 1 sub-repo and the parent monorepo

Let's assume you have committed changes to the parent repo onlineshop (e.g. some common/shared functionality) and also have several commits to payment-service. The GitHub connector will assign all commits made against the payment-service as change events to that Software Artifact, and all commits made to onlineshop as change events to the onlineshop Software Artifact. Two release events will be registered: one for onlineshop and one for payment-service.

Scenario 3: Merged Pull Request with commits to 2 sub-repos

Let's assume you have committed changes to both the login-service and the payment-service (both are sub-repos in the monorepo). The commits are part of a PR merged to the default branch of the onlineshop monorepo. The GitHub connector will assign all commits made to the payment-service as individual change events to that Software Artifact. Likewise, it will register all commits made to the login-service as individual change events to the login-service Software Artifact in VSM. One release event is registered for each of the two Software Artifacts.
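
The three scenarios can be summarized in a small sketch that attributes commits to Software Artifacts by the paths they touch. This is an interpretation of the scenarios, not the connector's code: the function name `attribute_commits` is hypothetical, and a commit that touches several artifacts is assumed to count as a change event for each of them.

```python
def attribute_commits(commits, sub_repos, parent):
    """Map commits to artifacts based on the paths they changed.

    `commits` is a list of (sha, [changed paths relative to the repo root]).
    Returns (change_events, release_targets): commit SHAs per artifact, and
    the sorted artifacts that each receive one release event for the PR.
    """
    changes = {}
    for sha, paths in commits:
        for path in paths:
            top, _, rest = path.partition("/")
            # Files inside a known sub-repo belong to it; all else to the parent.
            target = top if rest and top in sub_repos else parent
            shas = changes.setdefault(target, [])
            if sha not in shas:
                shas.append(sha)
    return changes, sorted(changes)


sub_repos = {"payment-service", "login-service"}

# Scenario 2: one commit in payment-service, one in the parent repository.
changes, releases = attribute_commits(
    [("a1", ["payment-service/src/pay.py"]), ("b2", ["README.md"])],
    sub_repos,
    parent="onlineshop",
)
print(releases)  # prints ['onlineshop', 'payment-service']
```

Passing only payment-service commits reproduces Scenario 1, and commits to both services reproduce Scenario 3.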

Enabling the GitHub integration to fetch Release & Change events

Step 1 | Enable the DORA Integration

Follow the DORA Integration documentation here

Step 2 | Go to the admin UI in your Value Stream Management Workspace

  1. Check "Send DORA release & change events" checkbox in the integration configuration
  2. Enter the "LeanIX Workspace host information", i.e. the LeanIX hostname to which the connector sends DORA events, e.g. app.leanix.net

Additionally for monorepos:

  1. Enable the Search for Monorepos option and ensure that the sub-repos contain your specified marker file
  2. Ensure that those Software Artifacts also comply with the general requirements set out above

This is a sample dashboard showing how DORA metrics look for the Team "Hook" in a VSM workspace.


How do I fetch the two remaining DORA metrics: Change Failure Rate and Mean Time to Recovery?

These metrics are currently not supported by the out-of-the-box GitHub repository integration as data to reliably feed these metrics is normally not stored in GitHub. You can still bring these metrics in by calling the Events API directly. See a detailed instruction manual here.

Imported Data

The table below shows how objects fetched from GitHub are translated into LeanIX Fact Sheets and their attributes.

GitHub Object | LeanIX Value Stream Management
Repository name | Software Artifact Fact Sheet (org/name = external Id reference)
Sub-Repository name | Software Artifact Fact Sheet (org/repository-name/folder-name = external Id reference). Find more information here about monorepos and how they are supported out-of-the-box.
Repository Contributor | Top 3 (based on the frequency of commits in the last 30 days) source code contributors are added as 'Observer' subscriptions with role "Source Code Contributor" to the respective Software Artifact Fact Sheet. Note: Contributors can set their subscription type to “Responsible” for any given Fact Sheet manually whenever needed. The integration will then respect this change and will not move them back to "Observer".
Team entity | Team Fact Sheet (name = external Id reference)
Team entity added to a repository | Relation between Software Artifact and Team Fact Sheet.
Languages of a repository | Technology Fact Sheet (name of language = external Id reference). The Technology Fact Sheet is related to the Software Artifacts where the language is used, e.g. Java, C++, JS etc. The size of code (in kilobytes) written in the respective language is mapped on the relation. Note for Monorepos: GitHub only provides this information on the root level of the repository - not broken down to individual sub-services. We inherit this and only link languages at root-repository level.
Topics | All topics are mapped as LeanIX tags of the tag group GitHub Topics by default.
Archived | Indicates whether the repository is archived on GitHub or not. By default, if the repository is archived on the first run of the connector, the repository will not be pulled to the workspace. If the repository already exists in the workspace, the integration will update the label "Archived" with the value "Yes" or "No" according to the latest status of the repository.

Additionally for Monorepos

GitHub Object | LeanIX Value Stream Management
Monorepos and related sub-repositories | Sub-repository Software Artifact Fact Sheets are related as children to the Monorepo's Software Artifact Fact Sheet
Team entity added to a sub-repository | Team(s) related to the main repository are the same team(s) related to the sub-repository Software Artifact Fact Sheets
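
Two of the mappings above lend themselves to a short sketch: the external Id patterns and the "Top 3" contributor selection. The helper names are hypothetical, and the input to `top_contributors` is assumed to be one author entry per commit from the last 30 days.

```python
from collections import Counter


def repo_external_id(org, repo):
    """External Id of a repository: org/name."""
    return f"{org}/{repo}"


def sub_repo_external_id(org, repo, folder):
    """External Id of a sub-repository: org/repository-name/folder-name."""
    return f"{org}/{repo}/{folder}"


def top_contributors(commit_authors, n=3):
    """Return the n authors with the most commits, most frequent first."""
    return [author for author, _ in Counter(commit_authors).most_common(n)]


print(sub_repo_external_id("acme", "onlineshop", "payment-service"))
# prints acme/onlineshop/payment-service
```

How ties between equally active contributors are broken is not stated in the table, so the sketch leaves that to the order in which authors first appear.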

Automatic Deletion of Fact Sheets

Currently, this integration does not automatically delete any Fact Sheet (Team, Software Artifact, Technology) if the equivalent object (repository, team, etc.) is deleted in GitHub. Note that for archived repositories the integration changes the archived attribute (see above), which can already serve as a filter in analysis. Automatic deletion is a feature we want to support out of the box in the future. For now, two options are available:

  1. manually via the Excel import/export
  2. via a custom processor running as part of the integration (see this tutorial)

🚧

Teams

If the number of Teams in your GitHub organization multiplied by the number of repositories per team exceeds 400,000 records, it is suggested to turn off the "import teams" flag and run the integration again.

GitHub API Rate Limits

The integration uses GitHub's GraphQL API. If an organization has an exceptionally large number of repositories, the GitHub API throttles the requests. The integration attempts to recover from rate limiting automatically, but the run then takes more time to finish.
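
For readers building their own tooling against the same API, the sketch below shows one common way to decide how long to pause after a throttled response. It is not the integration's recovery logic: the function name `seconds_to_wait` and the backoff parameters are assumptions, while `retry-after`, `x-ratelimit-remaining`, and `x-ratelimit-reset` are response headers documented by GitHub.

```python
def seconds_to_wait(headers, now_epoch, attempt=0, base=1.0, cap=60.0):
    """Return how many seconds to pause before retrying a throttled request."""
    h = {key.lower(): value for key, value in headers.items()}
    # An explicit retry-after value takes precedence.
    if "retry-after" in h:
        return float(h["retry-after"])
    # Quota exhausted: wait until the reset time (UTC epoch seconds).
    if h.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in h:
        return max(0.0, float(h["x-ratelimit-reset"]) - now_epoch)
    # Otherwise fall back to capped exponential backoff.
    return min(cap, base * 2 ** attempt)
```

A caller would pass the headers of the throttled response together with `time.time()` and sleep for the returned duration before retrying.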

👍

Scheduling

Schedule the integration once a day if your organization has approximately more than 2,000 repositories and/or more than 200 teams.
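
The two sizing rules of thumb in this section (the 400,000-record limit for Teams and the daily-schedule thresholds) can be expressed as a quick check. The function name `sizing_advice` and the shape of its result are hypothetical; the thresholds are the ones stated above.

```python
def sizing_advice(teams, repos_per_team, total_repos):
    """Apply the documented thresholds to an organization's size."""
    records = teams * repos_per_team
    return {
        "team_repo_records": records,
        # More than 400,000 records: turn off the "import teams" flag.
        "disable_import_teams": records > 400_000,
        # More than ~2,000 repositories and/or 200 teams: run once a day.
        "schedule_once_a_day": total_repos > 2000 or teams > 200,
    }


print(sizing_advice(teams=500, repos_per_team=1000, total_repos=3000))
```

For an organization with 500 teams and about 1,000 repositories per team, the product is 500,000 records, so both recommendations apply.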