When to make a Dolt Commit

REFERENCE
5 min read

When should you commit? It's a question as old as time, dating back to the original init commit.

Proposal

But when should you make a Dolt commit? That's a much lighter weight decision. This blog will give you some advice on when to make a Dolt commit.

What is a Dolt Commit?

A Dolt commit signals to Dolt that you would like to save the state of the current database permanently for future reference. Making a Dolt commit is a fast, lightweight operation.

In practice, all the state and data needed to make a commit already exists, the act of making commit just writes commit metadata to a log. This stores the root hash (or reference) of the database in a graph of all the commits with a link to its parent commit. If the commit is a merge commit, the commit will have multiple parents.

Dolt Commit Graph

Commit hashes are SHA-256 encoded hashes of the entire database. Commit hashes look like t5d5inj4bpc1fltrdm9uoscjdsgebaih. These are abbreviations of the entire hash that Dolt understands. When referring to a specific commit, this is the identifier you use.

A Dolt commit is different from a standard SQL transaction commit. Dolt supports both which can be a bit confusing.

Note, if you never make a Dolt commit, Dolt will look and work exactly like a traditional MySQL database. As you makes writes using SQL transactions, those changes are persisted to disk in what is called the "working set". You query the working set as you would a MySQL database. No additional versioning metadata is ever stored. So, another way to think about a Dolt commit is that a traditional database like MySQL or Postgres is just Dolt without commits.

What do you use Dolt Commits for?

Dolt uses commits as the basis of comparison between two versions of a database. You can ask Dolt to do things like:

  • Show me the differences between these two commits
  • Give me the common ancestor of these two commits
  • Make my current database look like this commit
  • Show me the difference between my current database and the last commit
  • Show me all the commits since this database was created
  • Show me who created this commit and the message they left when he or she made it
  • Revert or undo the changes made in a specific commit
  • Start a new branch from this version
  • Apply the changes made in this commit to another branch
  • Prepare this branch to be merged into another branch
  • Push this version of the database to a remote

You should make a commit when you want to be able to use the current version of the database to do one of the above things.

How do you create a Dolt Commit?

You can create a Dolt commit manually or on every SQL transaction commit.

Manually

You can make a Dolt commit from the command line or in SQL. The options follow the Git commit standard.

dolt commit -am "This is a commit"

In SQL, you use the dolt_commit() procedure. The arguments are passed in the command line fashion. Just replace the spaces with commas.

call dolt_commit("-am", "This is a commit");

You must dolt add new tables you create individually or by using the . or -A options for the tables to be tracked by Dolt.

On Every Transaction Commit

Additionally, in the SQL server context, you can set the dolt_transaction_commit variable and every SQL COMMIT will become a a Dolt commit.

SET PERSIST sqlserver.global.dolt_transaction_commit=1

After making a commit, you can see you get a generic commit message by the user who is running the database.

$ dolt log -n 1
commit u4shvua2st16btub8mimdd2lj7iv4sdu (HEAD -> main) 
Author: Tim Sehn <tim@dolthub.com>
Date:  Mon Jul 11 15:54:22 -0700 2022

        Transaction commit

When should you create a Dolt Commit?

When using Dolt as Git for Data, it's fairly obvious when to create a Dolt commit. Whenever you would make a Git commit, like when you want to push or when you want to annotate your changes, you similarly make a Dolt commit.

When using Dolt as a version controlled database, when to create a Dolt commit is not as obvious. No analogous SQL concept exists. There are transaction COMMITs but those are not quite the same as Dolt commits. The closest analogy in traditional databases is backups. Whenever you would make a backup in a traditional SQL database, you should definitely make a Dolt commit. But commits are far lighter weight than backups so you probably want to make Dolt commits much more frequently.

There are a few ways to think about Dolt commit creation in this context:

  1. Administrator Actions
  2. Application Defined
  3. Every Transaction
  4. On a Schedule

Administrator Actions

If you are manually modifying the database, say for a schema migration, it makes sense to make a commit before and after your changes. Also, you may want to do these actions on a separate branch and merge your changes into the production branch when you're sure you've done them correctly. This means you can rollback to the point before the changes or use the commit to cherrypick into other branches.

This is the classic case for database version control. Your main production branch can work exactly like a traditional MySQL database and you can use the version control features to prevent human error on administrative actions. As you become more sophisticated, you'll start to notice that more and more operations, like updating domain tables, can take advantage of version control using commits.

Pros

  1. Protection from human error
  2. You application code doesn't need to know about Dolt commits

Cons

  1. Version control won't protect against bad application logic

Application Defined

For some applications, the application flow generates natural times to create a commit. For instance, if a user finishes a flow and wants to "save their changes". This is a natural time to create a commit. This is usually less frequently than every committed transaction to the database. The commit contains a large, segmented batch of writes.

Here are some examples of application defined commit points:

  • new user or account creation
  • the completion of an order
  • a user makes or accepts a friend request
  • a user deletes something
  • a user creates a piece of content

As you can see, commits logically fit after large batches of write operations to the database. Completion of these tasks all act as logical comparison points or revert points which are the purpose of a commit.

This option also allows the application to annotate commits with a commit message to help figure out what has happened later.

Pros

  1. Creates a logical commit graph
  2. Maps most cleanly to source code version control

Cons

  1. You need to modify your application code to understand Dolt commits
  2. You lose some change granularity because transactions are grouped into commits

Every Transaction

As seen above, Dolt offers the option to make a Dolt commit on every transaction COMMIT. This is the most safe option as every transaction is going to be available in the Dolt commit history. Unfortunately, you lose the ability to annotate commits. Every commit is labeled "Transaction commit". This will make your commit log less usable for humans. But if you only intend to access commits programmatically, this is a good option.

Pros

  1. You application code doesn't need to know about Dolt commits
  2. You can rollback to or cherrypick any transaction in history

Cons

  1. You lose the ability to annotate commits with a custom message
  2. There will be a lot of commits, potentially making the commit graph hard to use

On a Schedule

You can also create a commit on a schedule like every evening or every hour. This treats commits more like backups and less like a concept accessible via the application. You can also set up commits to be created after a certain number of transactions.

You could mix this strategy with also creating commits when performing administrator actions as discussed above. Make a commit before an administrator starts making changes to the running database, just in case.

Pros

  1. Maps cleanly to traditional database backups
  2. Your application code doesn't need to know about commits

Cons

  1. Version control won't protect against bad application logic
  2. Your commit history is very course, not granular

Conclusion

If you like it then you should have put a ring on it. - Beyoncé

If you like it, make a Dolt commit. Lock that moment in immutable history.

Ready to commit to Dolt? Come chat with us on our Discord.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.