Continuous Deployment with Github Actions: An Example


Github Actions FTW

Not too long ago we endeavored to migrate Dolt's continuous integration pipeline from Jenkins to Github Actions. I wrote a blog about that process and complimented Github Actions on making the migration process intuitive and easy.

Today, we've continued improving our automation story by using Github Actions to define a continuous deployment (CD) pipeline for our DoltHub repository.

Prior to setting up this CD pipeline, members of our development team deployed DoltHub's services manually. This worked well for us over the last two years, but as our products and development team have grown and evolved, manual deployments have become increasingly annoying.

Manual Deployments Suck

DoltHub consists of several services written in Go, a GraphQL service, and a Next.js frontend. As part of the deployment technology stack we use Docker, Bazel, and Skaffold for building container images and deploying those images to our Kubernetes clusters. We use Amazon Web Services' (AWS) Elastic Kubernetes Service to manage and host our clusters, and we use their Elastic Container Registry as our container image repository. As you can see, that's a lot of technologies for someone to learn in order to navigate deployment.

Since our deployment stack involves many tools and managed service integrations, one of the downsides we've experienced by relying solely on manual deployments has been managing the variability in deployment environments across each of our developer's workstations.

Manual deployments require each developer to set up and maintain all the required dependencies for deploying, ensure those dependencies are pinned at the right versions, and have the proper credentials configured for deploying each of our services.

Not having a single, centralized deployment environment shared by all developers meant our developers experienced the occasional deployment error caused by a local issue on their workstation, halting their productivity and usually requiring more than one developer to debug.

This annoyance was a key motivating factor for why we chose to set up DoltHub deployments through Github Actions. Deploying via Github Actions frees our developers from individually maintaining deployment environments, instead providing them with a single, centralized environment only one person needs to maintain. It also makes it easier for our non-developer team members to deploy services, like our blog, via the Github UI, which also increases our productivity.

The Workflow

Workflow

Github Actions runs "workflows", basically a series of user-defined programmatic tasks, on what it calls "runners". Each task is called a "job," and each job is made up of a collection of sequential steps or "actions." Hence the name Github Actions. Each step or action in a job is executed in order, on a runner.

A runner is a combination of cloud hardware and a containerized environment. Because runners have predefined configurations and instructions, they provide a consistent execution environment that ensures the "actions" running on them run the same way every time. Github Actions even provides runners it hosts and maintains, making it very easy to choose the hardware and preliminary execution environment for a given job in a workflow. Selecting a runner is as easy as including a single line of yaml:

runs-on: ubuntu-latest

This Github Actions-hosted runner, ubuntu-latest, was our choice for our CD pipeline, as it comes with a number of dependencies we need for deployments pre-installed.

Among the crucial dependencies are Docker, kubectl, and Bazel. To use this runner and define our continuous deployment workflow, we created a workflow configuration yaml file, .github/workflows/continuous-deployment-dev.yaml.

Github Actions runs this file when triggered, and the "trigger," or event that starts a workflow run, can be a number of things. For our continuous-deployment-dev.yaml workflow, we use a push to branch master as the event that deploys DoltHub services to our development cluster.

Let's take a look at how we constructed this workflow, which consists of two jobs and, on success, deploys updated DoltHub services.

The first of these jobs is what we've called the service-names job. This job helps us determine which services we should deploy based on the file changes that have been pushed to master. Here is a simplified illustration of the first job:

Set Service Names

Each green rectangle represents a step or action in the job, and each blue arrow represents a hand-off of data. The arrow from step 2 to step 3 represents the passage of a list of files between the two steps, and the arrow from step 4 to the red hexagon represents the hand-off of another list that will be used in the second job.

The second job is called build-deploy and it depends on the output of the first job. Here is a simple illustration representing the second job:

Build and Deploy Services

The build-deploy job uses the output of the service-names job, represented again by the red hexagon, to build the service images and deploy them to our development cluster.

Let's take a look at how each job is defined within our workflow file, and which actions make up each job.

name: Continuous Deployment Dev

on:
  push:
    branches:
      - 'master'

jobs:
  service-names:
    name: Set Service Names
    runs-on: ubuntu-latest
    outputs:
      services: ${{ steps.set_service_names.outputs.services }}
    steps:
    - uses: actions/checkout@v2
    - uses: lots0logs/gh-action-get-changed-files@2.1.4
      id: changed-files
      with:
        token: ${{ secrets.GITHUB_TOKEN }}
    - name: Detect service changes
      uses: ./.github/actions/detect-service-changes
      id: service-changes
      with:
        changed-files: ${{ steps.changed-files.outputs.all }}
    - id: set_service_names
      run: |
        echo "::set-output name=services::$SERVICES"
      env:
        SERVICES: ${{ steps.service-changes.outputs.services }}

In the above snippet, starting at the top, we can see the name of the workflow, "Continuous Deployment Dev", followed by instructions telling Github Actions to run this workflow on pushes to our master branch. This section defines what event should "trigger" the workflow run.

Following the on section we have our first job called service-names which runs-on an ubuntu-latest runner. Ignore the outputs definition for now, and take a look at the steps section of our job where we list which actions to run, and the order in which to run them.

The first of these actions, actions/checkout@v2, makes our DoltHub repository available inside the runner so we can use it in the subsequent steps of our job. This action, created and maintained here, is simply a script or program, free to use and publicly available, that Github Actions will automatically run for us. We simply tell Github Actions which public actions to run by including the action's name and version tag in the uses step.

Next, we use another public action, one we found by browsing the Github Actions Marketplace for actions that list all changed files in the pushed commits.

We chose one called lots0logs/gh-action-get-changed-files@2.1.4. This action determines which files have changed, makes a list of those files, and makes the list available via an outputs variable called steps.<id>.outputs.all (notice <id> should be replaced with the id of the step).

Having a list of changed files allows us to determine which of our services require deployment. For example, if a file in our web/packages/blog directory changes, we want to automatically deploy our blog service.
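To make this concrete, here is a minimal sketch (the file paths and output string below are invented for illustration, not taken from a real workflow run) of how the JSON-encoded list that the changed-files action emits can be parsed and checked against a service's directory prefix:

```javascript
// Hypothetical sample of the changed-files output: a JSON-encoded
// array of repository-relative file paths (paths invented here).
const raw =
  '["web/packages/blog/posts/announcement.md","go/services/dolthubapi/server.go"]';
const files = JSON.parse(raw);

// A change under web/packages/blog means the blog service needs a deploy.
const blogChanged = files.some((f) => f.startsWith("web/packages/blog"));
console.log(blogChanged); // true
```

This prefix check is the core idea behind the custom action we describe below.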

Setting the list of changed files to an outputs variable is an important step because it allows us to pass this list to subsequent steps within this job. The lots0logs/gh-action-get-changed-files@2.1.4 action also takes, as input, an argument called token which must be the github token of our DoltHub repository.

Passing arguments to actions requires the with yaml field followed by the argument name and value. Notice, too, that Github Actions fully supports encrypted secrets, even redacting them from workflow logs if they are printed.

Next, we take the list of changed files stored in the steps.changed-files.outputs.all variable and use it as input to our next action, ./.github/actions/detect-service-changes, in a step we've named "Detect service changes." Notice, though, that this action looks different from the previous actions we've used in our workflow that are publicly available on Github.

Unlike the previously used public actions, this action's name is a path to a private action, one we've defined ourselves inside our DoltHub repository! We decided to create a custom action for this workflow since we needed a way to map the list of changed files to particular services that should be deployed. Let's take a brief detour to explore how we created this custom, private action.

Actions in Action

To create it, first we created a directory inside our DoltHub repository called ./.github/actions/detect-service-changes. Then, from that directory, we initialized a node environment with npm init -y and created two files required for defining an action, action.yaml and index.js.

The action.yaml file defines the inputs and outputs of an action; ours contains the following:

name: 'Detect service changes'
description: 'determines which services are affected by current changes'
inputs:
  changed-files:
    description: 'list of changed files'
    required: true
outputs:
  services:
    description: 'array containing all changed services'
runs:
  using: 'node12'
  main: 'dist/index.js'

We can see from the above snippet that this action expects to take a list, or array, of files as input, and will produce an array of service names as output. The output array is stored in a variable called steps.<id>.outputs.services to be used in subsequent actions or steps.

The index.js file contains JavaScript that maps changed files to an array of service names and sets the output variable using the Github Actions npm package @actions/core.

const core = require('@actions/core');

// service prefixes
const docsPrefix = "web/packages/docs";
const blogPrefix = "web/packages/blog";
const doltRemoteAPIPrefix = "go/services/doltremoteapi";
const dolthubAPIPrefix = "go/services/dolthubapi";
const graphqlPrefix = "web/packages/graphql-server";
const dolthubPrefix = "web/packages/dolthub";

// service names
const docs = "docs";
const blog = "dolthubblog";
const doltRemoteAPI = "doltremoteapi";
const dolthubAPI = "dolthubapi";
const graphql = "dolthubapi-graphql";
const dolthub = "dolthub";

function getPrefixMap() {
    const servicePrefixToOutputMap = new Map();
    servicePrefixToOutputMap.set(docsPrefix, docs);
    servicePrefixToOutputMap.set(blogPrefix, blog);
    servicePrefixToOutputMap.set(doltRemoteAPIPrefix, doltRemoteAPI);
    servicePrefixToOutputMap.set(dolthubAPIPrefix, dolthubAPI);
    servicePrefixToOutputMap.set(graphqlPrefix, graphql);
    servicePrefixToOutputMap.set(dolthubPrefix, dolthub);
    return servicePrefixToOutputMap;
}

try {
    const prefixMap = getPrefixMap();
    const changedFiles = JSON.parse(core.getInput('changed-files'));
    const changedServicesMap = changedFiles.reduce((acc, file) => {
        prefixMap.forEach((serviceName, servicePrefix) => {
            if (file.startsWith(servicePrefix)) {
                console.log(`found changes for service ${serviceName}, updating output`);
                acc[serviceName] = true;
            }
        });
        return acc;
    }, {});
    core.setOutput("services", Object.keys(changedServicesMap));
} catch (error) {
    core.setFailed(error.message);
}

We compiled our custom action using ncc build index.js, which bundles all code and modules into a single dist/index.js. This action is now fully usable within our workflow, so let's finish breaking down the final steps in our service-names job.

With the steps.service-changes.outputs.services variable set by our custom action, we run the final step in the service-names job. This step has the id: set_service_names and takes the array of service names from our custom action as an environment variable called SERVICES.

We use that variable in the run command of this final step, which essentially runs shell commands on the runner. We use some Github Actions-specific syntax in this context to echo a string that sets the jobs.<id>.outputs.services variable of our service-names job.

We can now return to the outputs section of the job we glossed over earlier. In the same way a step or action can have outputs that store values usable by subsequent steps, a job can have outputs usable by subsequent jobs. The jobs.<id>.outputs.services variable is what we utilize in our second job, and we set this variable in the final step of our first job. Recall that this job output was represented in the job illustrations by the red hexagon. The first job stores data in the red hexagon, and the second job uses that data.
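For clarity, the generic shape of this step-output-to-job-output plumbing looks something like the following sketch (the job, step, and variable names here are illustrative, not from our actual workflow):

```yaml
jobs:
  first:
    runs-on: ubuntu-latest
    outputs:
      # promote a step output to a job output
      my-value: ${{ steps.producer.outputs.my-value }}
    steps:
      - id: producer
        run: echo "::set-output name=my-value::hello"
  second:
    # runs only after "first" completes successfully
    needs: first
    runs-on: ubuntu-latest
    steps:
      - run: echo "${{ needs.first.outputs.my-value }}"
```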

Once set successfully, the first job completes and the workflow starts to run the second job. The second job then iterates over the array of services it received, building each service image and deploying it.

Building and Deploying Services

  build-deploy:
    name: Build and Deploy Services to Dev
    needs: service-names
    runs-on: ubuntu-latest
    strategy:
      matrix:
        service: ${{ fromJson(needs.service-names.outputs.services) }}
    steps:
    - uses: actions/checkout@v2
    - name: Install skaffold
      run: |
        curl -Lo skaffold https://storage.googleapis.com/skaffold/releases/latest/skaffold-linux-amd64 && \
          sudo install skaffold /usr/local/bin/
        skaffold version
    - name: Install aws-iam-authenticator
      run: |
        curl -o aws-iam-authenticator https://amazon-eks.s3.us-west-2.amazonaws.com/1.18.8/2020-09-18/bin/linux/amd64/aws-iam-authenticator && \
          chmod +x ./aws-iam-authenticator && \
          sudo cp ./aws-iam-authenticator /usr/local/bin/aws-iam-authenticator
        aws-iam-authenticator version
    - name: Configure AWS Credentials
      uses: aws-actions/configure-aws-credentials@v1
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: us-east-1
        role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
    - name: Login to Amazon ECR
      id: login-ecr
      uses: aws-actions/amazon-ecr-login@v1
      with:
        registries: 1234567890

    - name: Build Images
      working-directory: ${{ format('./{0}/dev', matrix.service) }}
      run: |
        echo "Building images for service $SERVICE"
        skaffold build ... --file-output=./images.json
      env:
        SERVICE: ${{ matrix.service }}
    - uses: EndBug/add-and-commit@v5
      with:
        message: ${{ format('[ga-deploy] update dev images for {0}', matrix.service) }}
        add: ${{ format('./{0}/dev/images.json', matrix.service) }}
        cwd: "."
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
    - name: Deploy Images Dev
      working-directory: ${{ format('./{0}/dev', matrix.service) }}
      run: |
        echo "Deploying service $SERVICE"
        skaffold deploy ... --build-artifacts=./images.json
      env:
        SERVICE: ${{ matrix.service }}

This job starts in a similar way to our service-names job, but includes needs: service-names as a yaml field. This signals to Github Actions that our second job is dependent on the first, and should only run after the first job has completed successfully.

From there, we specify the runner this job should run on, then we make use of the strategy.matrix field. This matrix, which accepts an array, will automatically run our build-deploy job for each item in the array, in parallel.

For this reason, this job works more like a template for building and deploying a single service, to be used for all services in our services array! We can reference the value of each service using the matrix.service variable. This variable can be used in the workflow syntax itself and can even be passed to the runner's shell as an environment variable. You'll notice we do this with the environment variable we've called SERVICE.

As I mentioned earlier, Skaffold is our main tool for building and deploying, and because it is not pre-installed on the runner, we need to install it. We do this in a separate step following our actions/checkout@v2 step, then follow our Skaffold installation with the installation of aws-iam-authenticator, which Skaffold uses "under the hood."

With all dependencies installed, the final steps for deployment require correctly providing the AWS credentials so the Github Actions runner can deploy to our managed development cluster. Thankfully, AWS has saved us time and code by supplying its own public actions that make it much easier to configure our runner's credentials.

We use the aws-actions/configure-aws-credentials@v1 action to set the proper AWS environment variables, which grant our runner access to operate on our AWS resources. We follow this action with the aws-actions/amazon-ecr-login@v1 action, which relies on the AWS environment variables set by aws-actions/configure-aws-credentials@v1 to authenticate our runner's use of our ECR repository.

In our "Build Images" step, we use Github Action's helpful format function to dynamically change the working directory where our build step runs. We then use skaffold build with the --file-output flag to build a service image and write the image's tag to a json file.
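For reference, the file written by --file-output is a small JSON document mapping each image name to the tag that was just built. A hypothetical example (the registry, image name, and digest below are invented here) might look like:

```json
{
  "builds": [
    {
      "imageName": "1234567890.dkr.ecr.us-east-1.amazonaws.com/dolthubblog",
      "tag": "1234567890.dkr.ecr.us-east-1.amazonaws.com/dolthubblog:latest@sha256:abc123"
    }
  ]
}
```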

Next, we use the action EndBug/add-and-commit@v5 to commit the json file to our DoltHub repository, mid-workflow. This enables us to explicitly version and track every service image we deploy. This, too, is a substantial improvement over our manual deployments, which were not configured to commit service image tags to our DoltHub repository. Omitting that step made tracking and monitoring our development deployments much, much harder.

Lastly, the "Deploy Images Dev" step works exactly the same way the "Build Images" step does, the only difference being skaffold deploy is run instead of skaffold build. This command is run with the --build-artifacts flag, where we pass in the json file containing the image tag we created during the "Build Images" step. This tells Skaffold to deploy that service at that specific tag.

If no errors occur during this entire workflow process, the workflow succeeds and our services are deployed automatically! If an error does occur, the workflow run fails and Github Actions notifies our team. Error messages are available in the workflow's logs, which can be linked to with a url for easier debugging.

Conclusion

This is just one of the Github Actions workflows we've created to make deployment of DoltHub services better and easier. We also created workflows that can be manually triggered by our team using Github's UI. Those workflows, too, are a huge improvement over our previous local deployment story and we are excited to continue making our deployments better, faster, and more automated 🤖.

One feature we'd love to see from Github Actions is the ability to block a workflow run until a user decides to manually advance it. We believe this feature would streamline our current process for promoting our development deployments to production. At the time of this writing that feature is not yet supported in Github Actions.

Curious about Dolt, DoltHub and the versioned database of the future? There's no better place to get started than DoltHub.com where you can download Dolt, host your own public and private repositories, or just clone some amazing public repositories you won't find anywhere else.

Questions, comments, or looking to start publishing your data in Dolt? Get in touch with our team here!
