GitHub Enterprise (Self-hosted)

Out-of-the-box Source Code Repository Integration for self-hosted GitHub deployments

Introduction

🚀 Generally Available

This integration is generally available. To find more information about the release stages of our integrations, see Release Stages.

The LeanIX VSM GitHub Repository integration offers an easy way to auto-discover all your services from your self-hosted GitHub Enterprise instance. Based on this, VSM's mapping inbox lets you sift through all repositories ingested from GitHub and decide which services are really useful to your organization and hence should be part of your service catalog. This helps you maintain a high standard of data quality when you subsequently map your services to their individual teams to create clear team ownership.

Integrate with GitHub Enterprise to:
  • Automatically discover your services to build your company-wide service catalog
  • Map team ownership to have clear software governance in place
  • Automatically get Change & Release events for your DORA metrics

Setup

The integration runs as a dockerized agent that continuously fetches your GitHub data and passes it into VSM. See the technical details on the project's page.

Configuration in GitHub & VSM

To set up the integration, follow the steps below:

  1. Set up a service account in your GitHub instance, and make it a member of every GitHub organization that should be scanned with this account. We recommend using service accounts to manage permissions more easily.
  2. Create a Personal Access Token with the following scopes (see the GitHub Enterprise documentation):
    1. repo
    2. admin:org_hook

📘

Authorizing for Single Sign On

If your GitHub organization requires single sign-on (SSO), you also need to authorize the token for SSO (see below), or follow the detailed instructions in the GitHub documentation.

Authorizing organizations with SSO

  3. Go to the admin panel in VSM > Integrations and start the setup flow for GitHub Enterprise. You'll need the following information at your disposal:

    1. The PAT from step 2
    2. The GitHub organizations you want to scan (make sure the PAT has access to all of them)
    3. The URL for your GitHub Enterprise instance. In most cases, it will be something like: https://ghe.domain.com. Note: don't forget the https prefix 😜
    4. The URL where you host the VSM broker (including the port). It will most likely look something like this: http://vsm.client:8080
  4. After saving, the setup wizard will output a docker command along the lines of:

docker run --pull=always --restart=always \
           -p 8080:8080 \
           -e LEANIX_DOMAIN=<region>.leanix.net \
           -e LEANIX_API_TOKEN=<technical_user-token> \
           -e LEANIX_CONFIGURATION_SET_NAME=<config-set-name> \
           -e GITHUB_TOKEN=<secret-github-token> \
           -e GITHUB_URL=<GitHub Ent URL(https://ghe.domain.com)> \
           -e BROKER_URL=<vsm-github-broker URL(http://my.vsm.broker.client:8080)> \
           leanixacrpublic.azurecr.io/vsm-github-broker

📘

Changing local parameters (e.g. PAT token, Broker URL ...)

If you change one of the local parameters that are not directly managed by VSM (Broker URL, GitHub PAT, or GitHub URL), you need to manually redeploy the docker image with the updated local values. To do so, you need the LEANIX_CONFIGURATION_SET_NAME, which you can get from the integrations UI in VSM. It points the local agent to the same configuration in VSM, so that the integration resumes with the updated local parameters.
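As a rough illustration of such a redeployment, the sketch below assembles the docker run arguments from a set of local parameters. It only uses the flags and variable names shown in the wizard output above. The function name `docker_run_command`, the https check (based on the note about the https prefix), and the sample values are assumptions made for this example, not part of the broker.

```python
from typing import Dict, List

# Image name as shown in the setup wizard output.
IMAGE = "leanixacrpublic.azurecr.io/vsm-github-broker"

# Environment variables that appear in the wizard's docker command.
REQUIRED = [
    "LEANIX_DOMAIN",
    "LEANIX_API_TOKEN",
    "LEANIX_CONFIGURATION_SET_NAME",
    "GITHUB_TOKEN",
    "GITHUB_URL",
    "BROKER_URL",
]


def docker_run_command(env: Dict[str, str], port: int = 8080) -> List[str]:
    """Build the argument list for `docker run` (hypothetical helper)."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    # Assumption: the GitHub Enterprise URL must carry the https:// prefix.
    if not env["GITHUB_URL"].startswith("https://"):
        raise ValueError("GITHUB_URL must include the https:// prefix")

    command = ["docker", "run", "--pull=always", "--restart=always",
               "-p", f"{port}:{port}"]
    for name in REQUIRED:
        command += ["-e", f"{name}={env[name]}"]
    command.append(IMAGE)
    return command
```

The resulting list could be passed to `subprocess.run(command, check=True)` on the host that runs the agent. Keeping LEANIX_CONFIGURATION_SET_NAME unchanged is what reconnects the redeployed agent to the existing configuration in VSM.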

  5. Deploy the docker container in your preferred deployment mode (e.g. via K8s, via virtual machine ...)
  6. Shortly after initialization, the agent will connect with VSM and you should see logs appearing in the log section of the integration panel in VSM.
  7. Map your repositories to services to start getting your technology stack under full control.

Multi-org support

VSM allows you to scan multiple GitHub organizations with the VSM GitHub broker. There are two scenarios:

Scenario I: I want to scan multiple GitHub organizations with one service account (=one PAT)

This case applies if you want to bundle all or some GitHub organizations under the same PAT. Commonly, this occurs if you have one GitHub admin that oversees all GitHub organizations. You can then create one VSM config set containing all GitHub organizations, managed by a single service account and PAT. You will receive & need to run the docker container, as outlined above. The configuration set will appear in the VSM admin panel and will receive the logs for all GitHub organizations in the config set.

Scenario II: I want to scan multiple GitHub organizations with different service accounts (=multiple PATs)

This case applies if you want to manage your GitHub organizations in different cohorts (i.e. configuration sets). This commonly occurs if you have GitHub admins who each oversee some GitHub organizations (= business units) but not all. It helps to parallelize onboarding by allowing each GitHub admin to manage their VSM config separately. This setup results in multiple VSM config sets, each containing one or many GitHub organizations that will be scanned. You will receive & need to run one docker container per VSM configuration set. Each configuration set will appear in the VSM admin panel as one entity with its own logs.

GitHub API Rate Limits

GitHub Enterprise comes with API rate limits by default.

The default rate limit is 60 requests per hour for unauthenticated requests and 5000 requests per hour for authenticated requests using a personal access token (PAT).

If you face rate limit errors on the broker, consider increasing the rate limit in your GitHub Enterprise instance configuration.

Please refer to rate limit documentation for more information.
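To see how much of the hourly budget a PAT has left, you can query the GitHub REST API rate limit endpoint. The sketch below assumes your instance exposes the REST API under `/api/v3` (the usual base path for GitHub Enterprise Server) and that the response contains a `resources.core` object with `limit` and `remaining` fields. The function names are made up for this example.

```python
import json
import urllib.request
from typing import Tuple


def rate_limit_url(github_url: str) -> str:
    """Build the rate limit endpoint from the GitHub Enterprise base URL."""
    return github_url.rstrip("/") + "/api/v3/rate_limit"


def core_budget(payload: dict) -> Tuple[int, int]:
    """Return (remaining, limit) for the core REST API from the response body."""
    core = payload["resources"]["core"]
    return core["remaining"], core["limit"]


def fetch_core_budget(github_url: str, token: str) -> Tuple[int, int]:
    """Query the instance; needs network access to your GitHub Enterprise host."""
    request = urllib.request.Request(
        rate_limit_url(github_url),
        headers={"Authorization": f"token {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return core_budget(json.load(response))
```

Calling `fetch_core_budget("https://ghe.domain.com", "<secret-github-token>")` from a machine that can reach your instance returns the remaining and total requests for the current window, which can help confirm whether rate limiting explains errors seen on the broker.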

👍

Scheduling

Schedule the integration to run once a day if your organization has more than approximately 2,000 repositories and/or 200 teams.

DORA Metrics

Software Delivery Metrics are the best way for teams to track and monitor team productivity. The GitHub repository integration can automatically fetch information for two of the most important DORA metrics:

  1. Deployment frequency: Number of deployments to production for a Team.
  2. Lead time for changes: Time from committing a change to code successfully running in production.
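To make the two definitions concrete, here is a minimal sketch of how they could be computed from timestamps. How VSM aggregates the values is not described here, so the per-day average and the median are assumptions chosen for illustration, as are the function names.

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Tuple


def deployment_frequency(deployments: List[datetime], days: int) -> float:
    """Average number of production deployments per day over a window."""
    return len(deployments) / days


def lead_time_for_changes(changes: List[Tuple[datetime, datetime]]) -> timedelta:
    """Median time from commit to running in production.

    Each item is a (committed_at, deployed_at) pair.
    """
    return median(deployed - committed for committed, deployed in changes)


# Illustrative data: 4 deployments in an 8-day window, 3 changes.
deployments = [datetime(2024, 1, d) for d in (1, 3, 5, 8)]
changes = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 11)),   # 2 hours
    (datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 14)),   # 5 hours
    (datetime(2024, 1, 3, 9), datetime(2024, 1, 4, 11)),   # 26 hours
]
frequency = deployment_frequency(deployments, days=8)
lead_time = lead_time_for_changes(changes)
```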

Find below the mechanism by which the change and release events are registered.

Registering Release Event

Consider all closed or merged pull requests in our public repository ps-scripts.

Pull requests that are merged and less than 30 days old are valid. Only one of the 4 closed pull requests is valid.

This pull request is registered as a release event, along with the change IDs of its commits described below.
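The validity rule can be sketched as a small filter. This is an illustration, not the broker's actual code: it assumes that age is measured from the merge time, and it uses the `merged_at` field that the GitHub pull request API leaves empty for pull requests closed without merging. The sample data is invented to mirror the example of 4 closed pull requests with one valid.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def is_valid_release(pr: Dict[str, Any], now: datetime, max_age_days: int = 30) -> bool:
    """True if the pull request was merged less than max_age_days ago."""
    merged_at = pr.get("merged_at")
    if merged_at is None:  # closed without being merged
        return False
    return now - merged_at < timedelta(days=max_age_days)


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
closed_pull_requests = [
    {"number": 1, "merged_at": None},
    {"number": 2, "merged_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},   # 60 days old
    {"number": 3, "merged_at": None},
    {"number": 4, "merged_at": datetime(2024, 2, 20, tzinfo=timezone.utc)},  # 10 days old
]
valid = [pr["number"] for pr in closed_pull_requests if is_valid_release(pr, now)]
```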

Registering Change Events

The pull request has 3 commits.

The 3 commits are registered as change events along with author details.

Imported data

The integration retrieves the following pieces of information from your GitHub Enterprise instance:

  1. Metadata, such as repository name, description, URL, etc.
  2. Repository status, e.g. archived, active
  3. Repository code composition, e.g. TypeScript
  4. Repository topics, e.g. frontend
  5. Repository visibility, e.g. private
  6. DORA metrics: Deployment frequency and Lead-time-for-changes

FAQs

Does the integration run in real-time or schedule-based?

Both are possible. By default, the VSM GitHub broker listens to GitHub webhook events to provide a near-real-time user experience in VSM. We encourage you to stick with this default. Scheduled runs are used to recover from potential intermittent issues (such as network failures).

If your organization doesn't allow for this mode, you can toggle the webhook functionality off (see the details here). The VSM GitHub broker will then only run on a once-per-day schedule.

What are the required PAT scopes needed for?

The PAT scopes are needed to perform the operations mentioned here.

The VSM GitHub broker also registers for certain events in order to perform specific actions: see here.

Which parameters are managed locally (on the agent side) and which are managed in the VSM Integrations UI?

'Name of the configuration set' (LEANIX_CONFIGURATION_SET_NAME) and 'Name(s) of the GitHub organizations' are managed within VSM Integrations UI.

All other parameters are passed with the -e option of docker run and managed locally on the agent side.