Remote Authentication Explained

8 min read

Dolt is the world’s first SQL Database with Version Control features. It’s like MySQL and Git had a baby.

Git lacks authentication because there is no server that is Git-specific. All of the things we use to host Git repositories, GitHub, GitLab, BitBucket, and so on, all intercept traffic in the middle of git push and gate users at that stage.

Dolt deviated from Git in this regard. Our data-sharing strategy included the creation of Dolthub.com, which required users to create credentials to push and pull. Later, when we decided to support dolt sql-server, we had to enable authentication to be a credible drop-in replacement for MySQL. The two authentication mechanisms differ and sometimes collide. That’s what we are going to discuss today.

Authentication Intro

Authentication?#

Always good to make sure we’re on the same page with terms. When we need to perform an action, such as updating the data of your database, we want to ensure that the individual or process is allowed to do so. This is called Authorization. Authorization is impossible if you cannot determine who is attempting to make the change in question. The only Authorization questions you can answer if you don’t know who you are talking to are: “World Writable?” and “World Readable?”

We generally never want anything to be World Writable. Even within the confines of a corporate intranet where security concerns seem low, you will want to know the name of the person who dropped all of your tables by mistake.

Authentication is the art of “Who” in this equation. The identity of the user or process is a necessary first step before you can determine if they can make the action in question.

There are many ways to authenticate users in general. Username/Password is the simplest—the user knows a secret. OAuth is an open standard for delegated authentication where a third party tells you who the user is. SAML is similar and possibly a more corporate solution. There are SSH Keys, Client Side Certificates, Tokens, and many other Authentication mechanisms.

DoltHub and DoltLab#

DoltHub.com was the first way we enabled people to share Dolt databases. The landscape was pretty simple: users were required to have the Dolt CLI installed, and they pushed and pulled against Dolthub.com. Anyone familiar with GitHub will be comfortable in this mode.

DoltLab too! DoltLab is like taking Dolthub.com, wrapping it up with a bow, and deploying it to your on-premise hardware. While DoltLab didn’t exist in the early days, everything I say here pertains to DoltLab today. It’s the same code. Anywhere we say “Dolthub.com” here, you can swap in DoltLab if that is what you’re working with.

The Dolt CLI has two commands that pertain to authentication against Dolthub.com: dolt login and dolt creds. The login command was recently discussed in another blog. Honestly, it’s just sugar on top of the dolt creds new command, but I digress. The point is that these two commands are all about Authenticating to DoltHub and DoltLab instances.

The result of dolt login or dolt creds new is to place a secret—a token—on your local machine, which serves as your identity. When you set it up, you will be directed to a webpage where you specify the Token ID you want to associate with your account. This page is familiar if you’ve run dolt login before.

DoltHub New Login

It’s worth calling out that the webpage you connect to has authenticated you as well. On Dolthub.com, you can assert your identity using Google, GitHub, or an email/password. For DoltLab we can also authenticate with OIDC. The point is that the trust that Dolthub has that you are the owner of that token is possible because you were authenticated to the website at the time you added your key.

Once you have created your personal token, dolt clone, dolt pull, and dolt push will use that token whenever it talks to a remote endpoint with the correct host. The token is stored in a location in your home directory, so it’s not associated with your local database copies.

With all of that, the invocation of commands like dolt clone will simply grab the secret from your local disk drive and authenticate with the remote using that token.

Simple Token

`sql-server` As A Go Between#

There is another use of this token which is important to highlight: dolt sql-server will use your tokens as well. dolt sql-server is how we start our MySQL compatible server. You can read about configuring and running your server in our documentation. sql-server has the ability to clone and push just like on the CLI, but with the use of SQL stored procedures. When you run a sql-server instance and connect to it, then attempt to clone with the following SQL:

CALL DOLT_CLONE('timsehn/media_wiki');

The sql-server instance will recognize that timsehn/media_wiki is a shorthand for Dolthub.com, and when it clones the database, it will use the token that it would have used if you had simply used the CLI dolt clone command.

Indirect Token

Danger! sql-server will use your tokens for anyone who can connect to the process. If you run a sql-server instance on any host that has credentials created by dolt login, you are effectively giving your users your keys. Well, the ability to use the keys even if they can’t see them directly.

Bad Actor with Token

There are legitimate situations where having a sql-server as a go between makes sense. There are some Dolt users who run a single user sql-server locally so that their custom application has all the capabilities of a SQL server at their disposal. Only after they are done with their local changes do they push to a common shared database for review.

`sql-server` and Hosted Instances#

The second type of Authentication is Username/Password against a running instance of dolt sql-server. Everything said here goes for Hosted Dolt Instances, too. Hosted instances are simply instances of dolt sql-server that we manage on your behalf. You are welcome to run a sql-server locally, but we’ll discuss here like it’s always running on another host.

dolt sql-server, similar to MySQL, has the ability to track users and their access permissions. As an example, if you connect to your server and run the following:

CREATE USER 'alice'@'%' IDENTIFIED BY 'ApplesAndBananas';

A new user will be created. alice will be able to connect to the host and provide ApplesAndBananas as their password. They can’t really do anything because they have no permissions yet. If we just want to make Alice super powerful, we can grant all:

GRANT ALL PRIVILEGES ON mydb.* TO 'alice'@'%';

When using Dolt’s SQL shell, Alice connects using a command like the following:

$ dolt --host "yourdb.hosted.doltdb.com" --user "alice" --password "ApplesAndBananas" sql

This will open a TLS connection to the host and connect Alice to the host. This is probably the simplest model of them all. Especially since there isn’t really any ability to push and pull to another source. You can add remotes, but it’s only possible to use remotes which require no Authentication (eg, World readable). In this model, you are using all the branching and merging features of Dolt, but they are contained entirely within the server instance. A perfectly reasonable approach - it’s what we did before we could push and pull!

Simple PWD

What if you need to push and pull though? To push and pull from the sql-server instance to DoltHub/DoltLab requires you to start the instance on a host where you have credentials setup. We covered this above. To push and pull to another sql-server instance, using the first as the go between requires a few steps.

First, consider you have two hosts running dolt sql-server. The first is yourdb and the second is theirdb. As just described you connect to yourdb with the userid and password, just like you would a regular MySQL server. The trick to getting yourdb to talk to theirdb is you need to start yourdb with a special environment variable which specifies the password the process will use to connect to theirdb

$ export DOLT_REMOTE_PASSWORD=<pwd to theirdb>

The userid that will be used to connect to theirdb is the same userid as you connected to yourdb. Currently there is no way to override that, but it’s a very reasonable request. File an feature request if you’d like that. We go into more detail on how to set up this scenario here.

PWD Passthrough

Danger! The same risk as above exists here. If you have multiple people connecting to yourdb, and they have the permissions to call dolt_push(), then when they push the password they will be using is the one set with the DOLT_REMOTE_PASSWORD environment variable.

Similar to above, there are legitimate use cases to do this. Running a sql-server instance locally that needs to push to another sql-server is reasonable, but take care to ensure that yourdb is yours alone - not a multi-user server.

Local sql-server#

There is a final way to access a Dolt server. When you have access to the data files managed by Dolt, you effectively have super user access. If you attempt to connect to your sql-server instance while in the directory of your data, you will automatically connect with a super user. This directly relates to how we expect people to use Dolt servers to talk to other servers. The concerns about yourdb being yours alone is pretty easy when you are using an instance on your local machine. It fits with the single user approach because there is only one user with access to the token or password to talk to outside instances.

Local Server

To pull it all together, to push/pull to DoltHub.com and DoltLab instances, use dolt login to create a token. To push/pull to a dolt sql-server or Hosted instance, use the DOLT_REMOTE_PASSWORD to specify your password. You will get best results if you use a local sql-server where you are the sole user.

Conclusion#

Dolt is a little different from other databases. Your data doesn’t just sit in one machine—it can be pushed and pulled to other locations. Our Authentication story to do so is simple and has room for expansion. We like building things for real users with real problems, so come to our Discord server and tell us what you’d like to see!

Blog

Authentication?#

DoltHub and DoltLab#

sql-server As A Go Between#

sql-server and Hosted Instances#

Local sql-server#

Conclusion#

Get started with Dolt

`sql-server` As A Go Between#

`sql-server` and Hosted Instances#