Using Docker to deal with cgo build complexity

We’re using Go to write Dolt, the world’s first and only version-controlled SQL database.


One of the things we like about Go is its ability to build binaries for different operating systems and architectures, regardless of the machine you’re building on. This means we can build the binaries for all our supported release targets on the same machine, with no need to spin up a machine of each target architecture for its own build. This simplifies our release and development process.

But our build story became a lot more complicated when we took our first cgo dependency. Since breaking that seal and ceasing to be a pure-Go project, we’ve added a couple more. Our daily build invocation looks pretty much the same as before for any particular developer, but everyone needs to download the appropriate dependencies for their platform (we have engineers working on both Mac and Windows, and we all use Linux as well). And everyone has to set environment variables to tell Go what cgo flags to use during builds. For me, on macOS, this looks something like the following lines in my .zshrc file.

export LDFLAGS="-L/opt/homebrew/opt/icu4c@77/lib"
export CGO_LDFLAGS="-L/opt/homebrew/opt/icu4c@77/lib"
export CPPFLAGS="-I/opt/homebrew/opt/icu4c@77/include"
export CGO_CPPFLAGS="-I/opt/homebrew/opt/icu4c@77/include"

This is all well and good for those of us working on the project as our full-time job. But we’re also an open source project, and we want our customers to be able to build from source without needing to understand the intricacies of our build logic.

In times past, the best most projects could offer was a tool like autoconf combined with a complex Makefile. Those have fallen out of favor for newer projects, due both to their inscrutability and the difficulty in debugging problems. Luckily, today we have a better option, and our customers already have it installed: Docker.

Writing a cross-platform build script to run in Docker

We want this script to work for any provided platform and architecture. So let’s begin by defining some constants for the flags required on each of these platforms.

declare -A platform_cc
platform_cc["linux-arm64"]="aarch64-linux-musl-gcc"
platform_cc["linux-amd64"]="x86_64-linux-musl-gcc"
platform_cc["darwin-arm64"]="clang-19 --target=aarch64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0"
platform_cc["darwin-amd64"]="clang-19 --target=x86_64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0"
platform_cc["windows-amd64"]="x86_64-w64-mingw32-gcc"

These correspond to the CC environment variable we’ll provide to the go build command for each target platform. As you can read in the Go docs on this topic, this variable determines the compiler that will be used to build any required C code. The compiler to use depends on the platform and architecture we’re targeting.

When either cgo or SWIG is used, go build will pass any .c, .m, .s, .S or .sx files to the C compiler, and any .cc, .cpp, .cxx files to the C++ compiler. The CC or CXX environment variables may be set to determine the C or C++ compiler, respectively, to use.

The CXX variable provides the same thing, but for C++ code.

declare -A platform_cxx
platform_cxx["linux-arm64"]="aarch64-linux-musl-g++"
platform_cxx["linux-amd64"]="x86_64-linux-musl-g++"
platform_cxx["darwin-arm64"]="clang++-19 --target=aarch64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0 --stdlib=libc++"
platform_cxx["darwin-amd64"]="clang++-19 --target=x86_64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0 --stdlib=libc++"
platform_cxx["windows-amd64"]="x86_64-w64-mingw32-g++"
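One thing to watch with lookup tables like these: an os-arch tuple that has no entry expands to an empty string rather than an error, so a typo in the tuple could go unnoticed until the build misbehaves. Here is a minimal sketch of a guard, assuming a table shaped like platform_cc above. The function name require_supported_tuple is hypothetical and is not part of our script.

```shell
#!/usr/bin/env bash

# A two-entry excerpt of the compiler table, for illustration only.
declare -A platform_cc
platform_cc["linux-amd64"]="x86_64-linux-musl-gcc"
platform_cc["windows-amd64"]="x86_64-w64-mingw32-gcc"

# Hypothetical helper: succeed only if the tuple has a compiler entry.
require_supported_tuple() {
  local tuple="$1"
  # ${array[key]+x} expands to "x" only when the key is set,
  # so an empty result means the tuple is unknown.
  if [ -z "${platform_cc[$tuple]+x}" ]; then
    echo "unsupported os-arch tuple: $tuple" >&2
    return 1
  fi
}
```

Calling this at the top of the build loop would turn an unknown tuple into an immediate failure under set -e.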

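Next we need to provide the same info for the linker, which is what these two ld variables do. The first holds the value of the -ldflags option passed to the Go linker, and the second holds the CGO_LDFLAGS value used when linking the C code.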

declare -A platform_go_ldflags
platform_go_ldflags["linux-arm64"]="-linkmode external -s -w"
platform_go_ldflags["linux-amd64"]="-linkmode external -s -w"
platform_go_ldflags["darwin-arm64"]="-s -w -compressdwarf=false -extldflags -Wl,-platform_version,macos,12.0,14.4"
platform_go_ldflags["darwin-amd64"]="-s -w -compressdwarf=false -extldflags -Wl,-platform_version,macos,12.0,14.4"
platform_go_ldflags["windows-amd64"]="-s -w"

declare -A platform_cgo_ldflags
platform_cgo_ldflags["linux-arm64"]="-static -s"
platform_cgo_ldflags["linux-amd64"]="-static -s"
platform_cgo_ldflags["darwin-arm64"]=""
platform_cgo_ldflags["darwin-amd64"]=""
# Stack smash protection lib is built into clang for unix platforms,
# but on Windows we need to pull in the separate ssp library
platform_cgo_ldflags["windows-amd64"]="-static-libgcc -static-libstdc++ -Wl,-Bstatic -lssp"

Lastly, we need the same information for the assembler, which is not used by the Go build process directly, but by certain compilers in some situations.

declare -A platform_as
platform_as["linux-arm64"]="aarch64-linux-musl-as"
platform_as["linux-amd64"]="x86_64-linux-musl-as"
platform_as["darwin-arm64"]="clang-19 --target=aarch64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0"
platform_as["darwin-amd64"]="clang-19 --target=x86_64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0"
platform_as["windows-amd64"]="x86_64-w64-mingw32-as"

Once we have all this info, we can write our go build command like this.

OS_ARCH_TUPLES="windows-amd64 linux-amd64 linux-arm64 darwin-amd64 darwin-arm64"

for tuple in $OS_ARCH_TUPLES; do
  os=`echo $tuple | sed 's/-.*//'`
  arch=`echo $tuple | sed 's/.*-//'`
  o="out/doltgresql-$os-$arch"
  mkdir -p "$o/bin"
  mkdir -p "$o/licenses"
  cp -r ./licenses "$o/licenses"
  cp LICENSE "$o/licenses"
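  # bin is not set anywhere else in this script; it is assumed here to be
  # the name of the binary built from ./cmd/doltgres
  bin="doltgres"
  echo Building "$o/$bin"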
  obin="$bin"
  if [ "$os" = windows ]; then
      obin="$bin.exe"
  fi
  CGO_ENABLED=1 \
      GOOS="$os" \
      GOARCH="$arch" \
      CC="${platform_cc[${tuple}]}" \
      CXX="${platform_cxx[${tuple}]}" \
      AS="${platform_as[${tuple}]}" \
      CGO_LDFLAGS="${platform_cgo_ldflags[${tuple}]}" \
      go build -buildvcs=false -trimpath \
      -ldflags="${platform_go_ldflags[${tuple}]}" \
      -tags icu_static -o "$o/bin/$obin" \
      ./cmd/doltgres
  if [ "$os" = windows ]; then
    (cd out && 7z a "doltgresql-$os-$arch.zip" "doltgresql-$os-$arch" && 7z a "doltgresql-$os-$arch.7z" "doltgresql-$os-$arch")
  else
    tar cf - -C out "doltgresql-$os-$arch" | pigz -9 > "out/doltgresql-$os-$arch.tar.gz"
  fi
done
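The loop above splits each tuple by piping it through sed twice. The same split can be sketched with bash parameter expansion, which avoids spawning subprocesses. This is an alternative illustration, not what our script does, and the sample tuple is just an example value.

```shell
#!/usr/bin/env bash

tuple="linux-arm64"

# Remove the longest suffix that starts at the first "-" to get the OS.
os="${tuple%%-*}"
# Remove the longest prefix that ends at the last "-" to get the architecture.
arch="${tuple##*-}"

echo "os=$os arch=$arch"
```

Both forms agree for the tuples used here, since each contains exactly one hyphen.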

There are a couple of additional flags we use that have nothing to do with cgo and might not be appropriate for every binary:

-buildvcs=false: omit the git hash in the build
-trimpath: omit the full path of source files in symbol info, make them relative to the project root
-tags icu_static: found in our ICU integration library, instructs the build process to skip certain files. We also support dynamic linking of the ICU library, but don’t do this for production builds.

Finally, there’s the question of dependencies, such as the compilers themselves. We get those as system dependencies at the top of the script. Some of them come from the distro-provided apt sources and some of them are toolchain dependencies which we have to maintain for our target architectures.

apt-get update && apt-get install -y p7zip-full pigz curl xz-utils mingw-w64 clang-19

cd /
curl -o optcross.tar.xz https://dolthub-tools.s3.us-west-2.amazonaws.com/optcross/"$(uname -m)"-linux_20250327_0.0.3_trixie.tar.xz
tar Jxf optcross.tar.xz
curl -o icustatic.tar.xz https://dolthub-tools.s3.us-west-2.amazonaws.com/icustatic/20250327_0.0.3_trixie.tar.xz
tar Jxf icustatic.tar.xz
export PATH=/opt/cross/bin:"$PATH"
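A plain curl -o writes whatever the server returns, including an error page, and the failure then surfaces later as a confusing tar error. A minimal sketch of a stricter download step follows, assuming GNU tar and a curl that supports --retry. The helper name fetch_and_unpack and its two arguments are hypothetical and not part of our script.

```shell
#!/usr/bin/env bash

# Hypothetical helper: download an archive and unpack it under a directory.
#   $1 = archive URL, $2 = directory to extract into
fetch_and_unpack() {
  local url="$1" dest="$2"
  local archive
  archive="$(mktemp)"
  # -f makes curl exit non-zero on HTTP errors instead of saving the error body;
  # -sS hides the progress meter but still prints errors; -L follows redirects.
  curl -fsSL --retry 3 -o "$archive" "$url" || { rm -f "$archive"; return 1; }
  # GNU tar detects the compression format automatically when extracting.
  tar -xf "$archive" -C "$dest" || { rm -f "$archive"; return 1; }
  rm -f "$archive"
}
```

In the script above this would be called once per archive with / as the destination, for example fetch_and_unpack "$url" /.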

Putting it all together

Now that we understand the pieces, we can look at the entire script from start to finish.

#!/bin/bash

# build_binaries.sh
#
# Builds doltgres binaries with os-arch tuples provided as arguments, e.g. windows-amd64 linux-amd64
#
# This script is intended to be run in a docker environment via build.sh or
# build_all_binaries.sh. You can also use it to build locally, but it needs to run apt-get and other
# commands which modify the system.
#
# To build doltgres for the OS / arch of this machine, use build.sh.

set -e
set -o pipefail
apt-get update && apt-get install -y p7zip-full pigz curl xz-utils mingw-w64 clang-19

cd /
curl -o optcross.tar.xz https://dolthub-tools.s3.us-west-2.amazonaws.com/optcross/"$(uname -m)"-linux_20250327_0.0.3_trixie.tar.xz
tar Jxf optcross.tar.xz
curl -o icustatic.tar.xz https://dolthub-tools.s3.us-west-2.amazonaws.com/icustatic/20250327_0.0.3_trixie.tar.xz
tar Jxf icustatic.tar.xz
export PATH=/opt/cross/bin:"$PATH"

cd /src

OS_ARCH_TUPLES="$*"

declare -A platform_cc
platform_cc["linux-arm64"]="aarch64-linux-musl-gcc"
platform_cc["linux-amd64"]="x86_64-linux-musl-gcc"
platform_cc["darwin-arm64"]="clang-19 --target=aarch64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0"
platform_cc["darwin-amd64"]="clang-19 --target=x86_64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0"
platform_cc["windows-amd64"]="x86_64-w64-mingw32-gcc"

declare -A platform_cxx
platform_cxx["linux-arm64"]="aarch64-linux-musl-g++"
platform_cxx["linux-amd64"]="x86_64-linux-musl-g++"
platform_cxx["darwin-arm64"]="clang++-19 --target=aarch64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0 --stdlib=libc++"
platform_cxx["darwin-amd64"]="clang++-19 --target=x86_64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0 --stdlib=libc++"
platform_cxx["windows-amd64"]="x86_64-w64-mingw32-g++"

declare -A platform_as
platform_as["linux-arm64"]="aarch64-linux-musl-as"
platform_as["linux-amd64"]="x86_64-linux-musl-as"
platform_as["darwin-arm64"]="clang-19 --target=aarch64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0"
platform_as["darwin-amd64"]="clang-19 --target=x86_64-darwin --sysroot=/opt/cross/darwin-sysroot -mmacosx-version-min=12.0"
platform_as["windows-amd64"]="x86_64-w64-mingw32-as"

declare -A platform_go_ldflags
platform_go_ldflags["linux-arm64"]="-linkmode external -s -w"
platform_go_ldflags["linux-amd64"]="-linkmode external -s -w"
platform_go_ldflags["darwin-arm64"]="-s -w -compressdwarf=false -extldflags -Wl,-platform_version,macos,12.0,14.4"
platform_go_ldflags["darwin-amd64"]="-s -w -compressdwarf=false -extldflags -Wl,-platform_version,macos,12.0,14.4"
platform_go_ldflags["windows-amd64"]="-s -w"

declare -A platform_cgo_ldflags
platform_cgo_ldflags["linux-arm64"]="-static -s"
platform_cgo_ldflags["linux-amd64"]="-static -s"
platform_cgo_ldflags["darwin-arm64"]=""
platform_cgo_ldflags["darwin-amd64"]=""
# Stack smash protection lib is built into clang for unix platforms,
# but on Windows we need to pull in the separate ssp library
platform_cgo_ldflags["windows-amd64"]="-static-libgcc -static-libstdc++ -Wl,-Bstatic -lssp"

for tuple in $OS_ARCH_TUPLES; do
  os=`echo $tuple | sed 's/-.*//'`
  arch=`echo $tuple | sed 's/.*-//'`
  o="out/doltgresql-$os-$arch"
  mkdir -p "$o/bin"
  mkdir -p "$o/licenses"
  cp -r ./licenses "$o/licenses"
  cp LICENSE "$o/licenses"
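  # bin is not set anywhere else in this script; it is assumed here to be
  # the name of the binary built from ./cmd/doltgres
  bin="doltgres"
  echo Building "$o/$bin"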
  obin="$bin"
  if [ "$os" = windows ]; then
      obin="$bin.exe"
  fi
  CGO_ENABLED=1 \
      GOOS="$os" \
      GOARCH="$arch" \
      CC="${platform_cc[${tuple}]}" \
      CXX="${platform_cxx[${tuple}]}" \
      AS="${platform_as[${tuple}]}" \
      CGO_LDFLAGS="${platform_cgo_ldflags[${tuple}]}" \
      go build -buildvcs=false -trimpath \
      -ldflags="${platform_go_ldflags[${tuple}]}" \
      -tags icu_static -o "$o/bin/$obin" \
      ./cmd/doltgres
  if [ "$os" = windows ]; then
    (cd out && 7z a "doltgresql-$os-$arch.zip" "doltgresql-$os-$arch" && 7z a "doltgresql-$os-$arch.7z" "doltgresql-$os-$arch")
  else
    tar cf - -C out "doltgresql-$os-$arch" | pigz -9 > "out/doltgresql-$os-$arch.tar.gz"
  fi
done

Running in Docker

This build script is designed to be run with Docker, which we know from experience almost all of our customers already have installed (it’s how most of them deploy the database server). We do this without a Dockerfile, just telling Docker to run the script directly.

#!/bin/bash

# build.sh
#
# This script builds doltgres from source for this machine's OS and architecture.
# Requires a locally running docker server.

set -e
set -o pipefail

script_dir=$(dirname "$0")
cd $script_dir/..

go_version=`go version | cut -d" " -f 3 | sed -e 's|go||' | sed -e 's|\.[0-9]$||'`
os=`go version | cut -d" " -f 4 | sed "s|/.*||"`
arch=`go version | cut -d" " -f 4 | sed "s|.*/||"`

echo "os is $os"
echo "arch is $arch"
echo "go version is $go_version"

# Run the build script in docker, using the current working directory as the docker src
# directory. Packaged binaries will be placed in out/
docker run --rm \
       -v `pwd`:/src \
       golang:"$go_version"-trixie \
       /src/scripts/build_binaries.sh "$os-$arch"

This docker command runs the official Go docker container, giving it the script above to execute. There are a couple of interesting things happening here.

There’s no Dockerfile. We’re just using a pre-built image and running a bash script on it.
--rm tells docker that it should discard the container’s storage upon exit
-v `pwd`:/src mounts the current directory (the root of the project) as /src in the container
The path out/ in the build script is relative to the project root as well, so the packaged binaries land in the mounted directory on the host and aren’t deleted on container exit.
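build.sh asks the locally installed go toolchain for the version, which means Go has to be present on the host. As a sketch of one alternative, the major.minor version could be read from the go directive in go.mod instead. This assumes go.mod contains a line such as "go 1.24.3", and note that the directive states the minimum language version, which may differ from the toolchain we use for releases. The function name go_minor_from_gomod is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the major.minor Go version declared in a go.mod file.
go_minor_from_gomod() {
  local gomod="$1"
  # Find the two-field "go X.Y[.Z]" directive and keep only the first two components.
  awk '$1 == "go" && NF == 2 { split($2, v, "."); print v[1] "." v[2]; exit }' "$gomod"
}
```

The result could then be substituted for go_version when choosing the golang image tag.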

Pros and cons of this approach

Using Docker for local builds like this has a few distinct pros and cons.

Pros:

Almost all of our customers already have Docker installed
Casual developers don’t need to install any dependencies or configure their environment
Guaranteed to get the same result we use for production builds

Cons:

It’s slower for repeat builds than a non-docker solution (although you can easily change that by getting rid of --rm)
Doesn’t work for older Go versions that don’t have an image on Dockerhub (currently only 1.24 and 1.25 are maintained there, and it’s a moving target)
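One possible way to soften the repeat-build cost while keeping --rm is to mount named volumes for the Go module cache and build cache, so they outlive each container. The sketch below only prints the docker command rather than running it. The volume names doltgres-gomod and doltgres-gocache are made up for illustration, and the cache paths are assumptions about the official golang image (GOPATH at /go, running as root). The apt-get and toolchain download steps would still rerun on every build.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print (not run) a docker command with cache volumes.
#   $1 = Go major.minor version, $2 = os-arch tuple
docker_build_cmd() {
  local go_version="$1" tuple="$2"
  # The two named volumes are assumptions; they persist across --rm containers.
  local cmd=(
    docker run --rm
    -v "$(pwd)":/src
    -v doltgres-gomod:/go/pkg/mod
    -v doltgres-gocache:/root/.cache/go-build
    "golang:${go_version}-trixie"
    /src/scripts/build_binaries.sh "$tuple"
  )
  # %q quotes each argument so the printed line can be pasted into a shell.
  printf '%q ' "${cmd[@]}"
  printf '\n'
}

docker_build_cmd 1.24 linux-amd64
```

Printing the command first makes it easy to inspect the mounts before deciding whether to adopt them.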

We think this is a reasonable set of trade-offs to give customers the ability to build from source on an unreleased branch or their own fork. Advanced users will be able to figure out the dependency requirements and set the appropriate build flags themselves, just like our engineers do. But casual users will still be able to get the same production build process we use for releases entirely locally, with no set-up work.

Conclusion

Want to talk about cgo builds? Or maybe you’re curious about Dolt or Doltgres, the world’s first version-controlled SQL databases? Come by our Discord to talk to our engineering team and meet other Dolt users.
