Release notes generation for GitHub repos

Introduction

Today we're excited to announce the open sourcing of a tool that automatically generates Markdown-formatted release notes for GitHub repositories. Dolt is using this tool to generate our release notes going forward, and we've also used it to backfill our older releases.

Generated release notes contain summaries for every pull request merged and every issue closed since the previous release, ready to copy and paste into your release notes on GitHub. The tool also supports summarizing changes in dependencies for golang projects. Try it out today!

Our release notes were really bad

A customer called them "spartan." He was being generous. I would have used "embarrassing" or possibly "grossly negligent." Maybe even "execrable." Or "derelict."

[Image: bad release notes]

As our CEO remarked: it looked like we hadn't done any work for three months.

The reason for this sorry state of affairs is simple: release notes are hard to assemble. We have 12 engineers at DoltHub now, and for every release everybody was expected to send the release manager a summary of what they had contributed since the last one. When even was that? I'm busy, Oscar.

It didn't work well. And eventually our release manager kind of gave up even trying.

If you visit our releases page today, you'll be treated to a veritable feast for the senses: a full summary of merged PRs and closed issues for every release we've ever put out.

[Image: good release notes]

This is an example where automation doesn't just make a process faster and less labor-intensive. It actually makes it better. It leads to a completely different end result.

There has to be a better way

Frustrated with the sorry state of affairs, I went looking for a solution. Everybody has this problem, right? Somebody must have solved it.

Turns out lots of people have. The first solution I found was Ted the Releaser, which I installed and found didn't work, at all. Gren is the semi-official solution, which I'm embarrassed to admit that I somehow... just didn't find in my original search until now. If I had, I probably would have used it instead of writing my own. Was I missing a keyword in Google or something?

But Gren doesn't handle changes in dependencies, which was important to me since so much of the development of the Dolt SQL query engine takes place in another package, go-mysql-server. I really wanted the release notes generated to automatically include all changes in this dependency, and I didn't want to have to keep the two packages' releases in sync to achieve this. So I'm going to pretend that I wrote my own release notes generator because I wanted this vital feature (which is true) and not because I'm apparently terrible at Google searches.

While I'm on my soapbox, it would be really great if release note generation were just a built-in feature of the web UI for GitHub and not a custom module I needed to bring to the table.

Assembling release notes

Like all the best dev ops tools, this one is cobbled together in Perl to glue together a bunch of system utilities into a unified whole. Specifically, it shells out to curl to make GitHub RPCs, interrogates the state of the local repo with git, and uses grep to determine dependency versions in a go.mod file.

For example, here's how we fetch the set of PRs for a release, after having determined the time range to consider. curl does all the HTTP stuff, and a standard JSON parser library turns the response file into a Perl data structure for us.

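# Returns a reference to a list of PRs merged between $from_time and $to_time.
# Uses $token and $curl_file, plus the curl_cmd and json_file_to_perl helpers,
# all of which are defined elsewhere in the script.
sub get_prs {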
    my $base_url = shift;
    my $from_time = shift;
    my $to_time = shift;

    print STDERR "Looking for merged PRs between $from_time and $to_time\n";

    $base_url .= '?state=closed&sort=created&direction=desc&per_page=100';
    
    my $page = 1;
    my $more = 0;
    
    my @merged_prs;
    do {
        my $pulls_url = "$base_url&page=$page";
        my $curl_pulls = curl_cmd($pulls_url, $token);
        print STDERR "$curl_pulls\n";
        # system() returns non-zero on failure; the status is in $?, not $!
        system($curl_pulls) and die "curl failed (wait status $?)";

        $more = 0;
        my $pulls_json = json_file_to_perl($curl_file);
        die "JSON file does not contain a list response" unless ref($pulls_json) eq 'ARRAY';
        
        foreach my $pull (@$pulls_json) {
            $more = 1;
            next unless $pull->{merged_at};

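            # Results are sorted by creation time, newest first, so we stop at the
            # first merged PR created before the range. Note that this skips PRs
            # opened before $from_time but merged inside the range.
            return \@merged_prs if $pull->{created_at} lt $from_time;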
            my %pr = (
                'url' => $pull->{html_url},
                'number' => $pull->{number},
                'title' => $pull->{title},
                'body' => $pull->{body},
                );

            push (@merged_prs, \%pr) if !$to_time || $pull->{merged_at} le $to_time;
        }

        $page++;
    } while $more;
    
    return \@merged_prs;
}
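The `lt` and `le` string comparisons above work because GitHub returns timestamps as fixed-width ISO 8601 strings in UTC (for example `2021-01-07T19:38:23Z`), so sorting them as plain strings also sorts them chronologically. Here is a minimal sketch of that idea on its own. The `in_range` helper and the sample timestamps are made up for illustration and are not part of the script.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original script: true when $time falls
# in [$from_time, $to_time]. An empty or undefined $to_time means "no upper bound".
sub in_range {
    my ($time, $from_time, $to_time) = @_;
    return 0 if $time lt $from_time;
    return 0 if $to_time && $time gt $to_time;
    return 1;
}

# Made-up timestamps in the format GitHub uses.
my @merged_at = (
    '2021-01-05T10:00:00Z',
    '2021-01-12T08:30:00Z',
    '2021-02-01T00:00:00Z',
);

my @kept = grep { in_range($_, '2021-01-07T00:00:00Z', '2021-01-31T23:59:59Z') } @merged_at;
print "$_\n" for @kept;
```

This only holds while every timestamp uses the same format and time zone. Mixing in local times or offsets would need a real date parser.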

And here's how we determine the version of a dependency at the start and end of a release range by using grep.

# Returns the commit SHA prefix that go.mod pins for the named dependency,
# as of the given commit in this repository.
sub get_dependency_version {
    my $dependency = shift;
    my $hash = shift;

    my $cmd = "git show $hash:go/go.mod | grep $dependency";
    print STDERR "$cmd\n";
    my $line = `$cmd`;

    # TODO: this only works for commit versions, not actual releases like most software uses
    # github.com/dolthub/go-mysql-server v0.6.1-0.20210107193823-566f0ba75abc
    if ($line =~ m/\S+\s+.*-([0-9a-f]{12})/) {
        return $1;
    }

    die "Couldn't determine dependency version in $line";
}
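The TODO above could be addressed along these lines. This is only a sketch, not what the tool does today: `parse_dependency_version` is a hypothetical name, it works on go.mod text passed in as a string rather than shelling out to git and grep, and the sample go.mod (including the `github.com/example/tagged` module) is made up apart from the go-mysql-server line quoted above. It assumes Go pseudo-versions end in a 14-digit timestamp followed by a 12-character commit prefix, and treats anything else as a tagged release.

```perl
use strict;
use warnings;

# Hypothetical variant of get_dependency_version. Returns ('commit', $sha_prefix)
# for pseudo-versions, ('tag', $version) for tagged releases, or an empty list
# if the dependency is not listed.
sub parse_dependency_version {
    my ($go_mod_text, $dependency) = @_;

    foreach my $line (split /\n/, $go_mod_text) {
        # Match "<module path> <version>", with or without a leading "require".
        next unless $line =~ m/^\s*(?:require\s+)?\Q$dependency\E\s+(\S+)/;
        my $version = $1;

        # Pseudo-versions end in a 14-digit UTC timestamp and a 12-character commit prefix.
        if ($version =~ m/[-.]\d{14}-([0-9a-f]{12})(?:\+incompatible)?$/) {
            return ('commit', $1);
        }
        return ('tag', $version);
    }
    return;
}

# Made-up go.mod contents for illustration.
my $go_mod = <<'END';
module github.com/dolthub/dolt/go

require (
	github.com/dolthub/go-mysql-server v0.6.1-0.20210107193823-566f0ba75abc
	github.com/example/tagged v1.2.3
)
END

my ($kind, $found) = parse_dependency_version($go_mod, 'github.com/dolthub/go-mysql-server');
print "$kind $found\n";
```

A caller would still need to decide what to do with a tag, for example resolving it to a commit with git in the dependency's own repository.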

Perl (get off my lawn) is uniquely great at cobbling together small utilities like this into a working solution. It's so natural to transition from command line experimentation directly to a full-blown product without even consciously deciding to.

[Image: chad runaway prototype]

GitHub RPCs work well and are fast, but like any API you spend most of your development time understanding the data model and behavior through a combination of reading documentation (tedious, sometimes apocryphal) and reverse engineering (virtuous, thrilling). The details are mostly boring, and it's easy to read the script code to see the RPCs being made. We fetch pull requests, issues, and releases and then iterate over the results. There were some surprises in there, like the fact that the issues RPC also returns pull requests, because the two share one numbering namespace. But mostly making the RPCs was the easy part of this project, and GitHub deserves credit for a well-built API.
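Here is one way to deal with that surprise. This is a sketch, not necessarily how the script handles it. It relies on the GitHub REST API marking entries that are really pull requests with a `pull_request` key. The `only_issues` helper is a hypothetical name, and the JSON is a hand-trimmed stand-in for a real response, using two titles from the example release notes below.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Trimmed, hand-written stand-in for a response from the issues endpoint.
my $issues_json = <<'END';
[
  {"number": 1169, "title": "go/libraries/doltcore/sqle: Keyless tables don't have PK index -- fix describe panic", "pull_request": {}},
  {"number": 1161, "title": "Primary keyless tables seem to break DESCRIBE"}
]
END

# Hypothetical helper: keep only entries that are plain issues, not pull requests.
sub only_issues {
    my ($entries) = @_;
    return [ grep { !exists $_->{pull_request} } @$entries ];
}

my $issues = only_issues(decode_json($issues_json));
print "$_->{number}: $_->{title}\n" for @$issues;
```

Filtering on the key rather than on the URL or title keeps the check independent of how either one is formatted.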

Example release notes

Merged PRs

  • 1170: Updating to latest go-mysql-server
  • 1169: go/libraries/doltcore/sqle: Keyless tables don't have PK index -- fix describe panic
  • 1168: /.github/{scripts,workflows}: fix, pod to job, handle pod errors
  • 1167: C# test for alternate MySQL connector library, upgraded existing to u… …se dotnet 5 (up from 3)
  • 1165: /.github/workflows/ci-performance-benchmarks.yaml: fix id
  • 1163: Db/ci performance
  • 1162: unrolled decode varint decode loop 30% faster on the benchmark in this PR. BenchmarkUnrolledDecodeUVarint/binary.UVarint-8 1000000000 0.0372 ns/op BenchmarkUnrolledDecodeUVarint/unrolled-8 1000000000 0.0258 ns/op
  • 1147: Fixed indentiation in YAML syntax for Discord notifications
  • 1143: First cut of Discrod notifications This implements the following policy:
    • notify on cancellation or failure of any job
    • notify on release, including success
    Currently release notifications are broken by a shortcoming in GitHub Actions, namely that one workflow cannot kick of another when using GITHUB_TOKEN. We will devise a workaround.
  • 256: added describe queries for keyless tables
  • 255: This function implement an Naryfunction type. Allows you to define sqle functions that have multiple children.
  • 254: Fixed UNHEX/HEX roundtrip Simple fix but I ended up completely reevaluating our binary type implementation. Fixed a bug found in the cast package we were using to convert strings, and also changed UNHEX to return the proper SQL type.
  • 252: Added hash functions
  • 249: Alias bug fixes Fixes a number of buggy behaviors involving column indexes and table name resolution.
  • 248: additional tests add a table with multiple keys an index that has a subset of those keys in a different order a couple queries

Closed Issues

  • 1161: Primary keyless tables seem to break DESCRIBE
  • 1153: p.StopWithErr(err) is hanging on large imports

Conclusion

Hopefully you find the tool useful for your own release note generation. If you're in the same boat as us, managing multiple golang repositories, then the unique capability to include changes from dependencies might make it worthwhile to stray off the more well-traveled path and give it a chance.

Have something to tell us about this, or about Dolt? Come chat with us on Discord. We're always happy to hear from new customers!
