Automatic Dolt Commits with @@dolt_transaction_commit

SQL
10 min read

Dolt is the world's first fully-versioned SQL relational database. It enables you to branch, fork, diff, and merge your tables, and to time travel to any point in your data's history and examine how your data and schema have changed over time. Customers come to Dolt because they want to take advantage of the versioning features Dolt provides. Some concrete examples of what customers are doing with those versioning features include developing video games, providing data/configuration versioning as an application feature, managing ML data for model training, and much, much more.

But... today I actually want to talk about how Dolt supports applications that don't care about Dolt's versioning features!

Wait... what?!

Hear me out... in the use cases I mentioned above, all of those applications are "version-aware". That means they know that Dolt is creating versions of their data and they use Dolt's data versioning features directly. For example, they execute SQL queries like call dolt_commit(), or select * from dolt_diff(...) as part of the application code. However, there's another large class of applications that just need to create data and read the current data. These types of applications don't actually need to directly use Dolt's versioning features, but they can still benefit from having a recorded history of all the data and schema changes happening in the database.

For example, imagine if your Metabase or Wordpress install was using a Dolt database to store all of its configuration data. Even without those applications being "version-aware", they can work with Dolt just like any other MySQL database, and Dolt can automatically create Dolt commits as the data changes. From the application's perspective, Dolt looks just like any other MySQL database, but underneath, Dolt is diligently recording how all the data is changing. If someone accidentally messes up important application data, you can examine the history, see exactly what changed, and take steps to restore your data. Or, maybe you want to run some manual, ad-hoc analytic queries on your database to understand what configuration data is changing the most, or audit who changed what in a certain time period. Dolt makes all of this possible, without any changes to your application code to make it "version-aware".

Introducing @@dolt_transaction_commit

This behavior of implicitly creating Dolt commits can be easily enabled by turning on the @@dolt_transaction_commit system variable. By default, Dolt assumes that you want to explicitly control when Dolt commits are created – you stage your changes with dolt_add(), then call dolt_commit() when you're ready to record the current state as a new Dolt commit in your database's history. This should sound familiar, since it's the same model used by Git. However... this requires that your application is aware of these Dolt features and uses the commands above to explicitly create Dolt commits. Dolt provides another mode where Dolt commits are created implicitly when you enable @@dolt_transaction_commit. In this mode, whenever a SQL commit is processed, a Dolt commit is automatically created. There are a lot of advantages for explicitly controlling the creation of Dolt commits (e.g. controlling the exact commit message and scope of the commit), but in cases where you can't easily update the application code to be "version-aware", the @@dolt_transaction_commit mode is a great way to start integrating with Dolt and getting the benefits of having all your data and schema changes versioned, tracked, and auditable.

Using @@dolt_transaction_commit

Now that you know what the @@dolt_transaction_commit system variable does, let's show a few easy ways to turn it on...

Explicitly setting @@dolt_transaction_commit

First up is the explicit and obvious route... since @@dolt_transaction_commit is a system variable, you can turn it on or off directly with SET @@dolt_transaction_commit=1; just like any other system variable. Like many system variables, @@dolt_transaction_commit is available both at a global level and at the individual session level. You can change the global setting with SET @@global.dolt_transaction_commit=1;, or you can omit global and set it for each session you want to use it in. Note that if you set the global setting, only new sessions started after changing the global variable will have the new setting – existing sessions will not be updated.

set @@dolt_transaction_commit=1;
select @@global.dolt_transaction_commit, @@session.dolt_transaction_commit;
+----------------------------------+-----------------------------------+
| @@GLOBAL.dolt_transaction_commit | @@SESSION.dolt_transaction_commit |
+----------------------------------+-----------------------------------+
| 0                                | 1                                 |
+----------------------------------+-----------------------------------+
1 row in set (0.00 sec)

It's also worth noting that global variables are not persisted across server restarts, so when you restart your sql-server you'll need to apply the setting again. If you're running and managing your own Dolt sql-server and want to always use this setting for all connections, we recommend setting it in your sql-server configuration file so that you don't have to remember to turn it back on each time you start up your sql-server. The example below shows a sample dolt sql-server configuration file with the dolt_transaction_commit behavior enabled.

log_level: info

behavior:
    dolt_transaction_commit: true

user:
    name: "account1"

listener:
    host: localhost
    port: 3306
    max_connections: 100

Configuring in Hosted Dolt

If you're running in Hosted Dolt, there's a very easy behavior setting that controls @@dolt_transaction_commit globally. This approach is equivalent to setting the system variable globally in your sql-server configuration file, as described above, but the advantage is that it's in the Hosted Dolt UI and provides a really easy way to enable this behavior. The same note above about this value only affecting new sessions applies here, too, so be sure to restart any sessions after turning this on the first time. The screenshot below shows where to find the @@dolt_transaction_commit setting on the "Configuration" tab of a Hosted Dolt deployment:

Hosted Dolt Configuration: @@dolt_transaction_commit behavior

Configuring via JDBC connection properties

This last one is Java-specific, but it's a pretty slick feature of the MySQL Connector/J JDBC driver so I thought it was worth calling out. If you are using the MySQL Connector/J JDBC driver to connect to Dolt, the sessionVariables connection property is another really handy way to enable @@dolt_transaction_commit (or any other session variable for that matter!). The big advantage here is that you can embed this setting into the JDBC URL you give your application so that every connection it makes will always have @@dolt_transaction_commit enabled, but other sessions not using that JDBC URL will not have @@dolt_transaction_commit enabled. This is specific to Java applications that allow you to configure a JDBC URL, but it's particularly useful if one part of your system doesn't need to be "version-aware" and you have other manual or automated processes that do want to explicitly control when Dolt commits are created. The code snippet below shows how to set @@dolt_transaction_commit through the sessionVariables property when creating a connection through the MySQL Connector/J JDBC driver:

String url = "jdbc:mysql://127.0.0.1:3306/myDb?sessionVariables=dolt_transaction_commit=1";
conn = DriverManager.getConnection(url, user, password);

Demo

Now that we know what @@dolt_transaction_commit does, why it's useful, and a few ways to enable it, let's use it and take a look at the Dolt commits that it creates.

Show. Me. The. Commits!

For this demo, I've used Hosted Dolt to launch a new Dolt database and after it started up, I enabled the behavior setting for @@dolt_transaction_commit as described above. Then I grabbed the connection information from the "Connectivity" tab in Hosted Dolt, and used the MySQL client to connect to my Hosted Dolt SQL server. To start, let's confirm a few of our settings:

SELECT dolt_version(), @@autocommit, @@global.dolt_transaction_commit, @@session.dolt_transaction_commit;

+----------------+----------------------+----------------------------------+-----------------------------------+
| dolt_version() | @@SESSION.autocommit | @@GLOBAL.dolt_transaction_commit | @@SESSION.dolt_transaction_commit |
+----------------+----------------------+----------------------------------+-----------------------------------+
| 1.5.0          |                    1 |                                1 |                                 1 |
+----------------+----------------------+----------------------------------+-----------------------------------+
1 row in set (0.00 sec)

Perfect; we can see that the Hosted Dolt behavior setting we changed for the @@dolt_transaction_commit is working and this behavior is enabled globally, including in our current session. Let's check out the current Dolt commit log to verify that we only have the initial commit that all new Dolt databases start with:

SELECT * FROM dolt_log;

+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| commit_hash                      | committer | email                   | date                    | message                    |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| bml0np84r21bsmrl9eelpuniaanl3195 | jfulghum  | jason@dolthub.com       | 2023-06-20 23:50:33.623 | Initialize data repository |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
1 row in set (0.01 sec)

Great! A clean slate. Let's create a table and insert some data into it:

CREATE TABLE myTable (id INT PRIMARY KEY, name VARCHAR(255));
INSERT INTO  myTable VALUES (1, 'Jerry Maguire');

Now let's check out the Dolt commit log again and see what we've got:

SELECT * FROM dolt_log;

+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| commit_hash                      | committer | email                   | date                    | message                    |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| ovbl7vvobimer2f5ighrvgpmt9qks82b | jfulghum  | jason@dolthub.com       | 2023-06-20 23:57:58.647 | Transaction commit         |
| 3lst872g0hvc6g8107ntmd0rc3lqfi3f | jfulghum  | jason@dolthub.com       | 2023-06-20 23:57:52.797 | Transaction commit         |
| bml0np84r21bsmrl9eelpuniaanl3195 | jfulghum  | jason@dolthub.com       | 2023-06-20 23:50:33.623 | Initialize data repository |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
3 rows in set (0.00 sec)

Each of the two SQL statements we executed created their own Dolt commit. This is because we have the @@autocommit system variable enabled, so every statement is implicitly its own SQL transaction unless we explicitly use START TRANSACTION to create explicit transactions.

Let's quickly talk about how @@autocommit interacts with @@dolt_transaction_commit. These two system variables are closely related – @@autocommit controls whether SQL commits are automatically created for you after each statement you execute and @@dolt_transaction_commit controls whether a Dolt commit is automatically created for each SQL commit. By default, @@autocommit is enabled for each session, so if you turn on @@dolt_transaction_commit, then every statement you execute that edits data will create a SQL commit and also a Dolt commit. If @@autocommit is disabled, then just remember that you'll have to explicitly manage SQL commits by executing a SQL commit statement, and whenever you do, if @@dolt_transaction_commit is enabled, it'll automatically create a Dolt commit, too.

Let's turn off @@autocommit and show how explicit SQL transaction management works. Technically, we could still do this with @@autocommit enabled as long as we're explicitly using START TRANSACTION, but for the sake of this demo, let's just take @@autocommit completely out of the picture.

SET @@autocommit = 0;

SELECT @@autocommit;
+----------------------+
| @@SESSION.autocommit |
+----------------------+
|                    0 |
+----------------------+
1 row in set (0.00 sec)
    
START TRANSACTION;
INSERT INTO  myTable VALUES (2, 'Dorothy Boyd');
INSERT INTO  myTable VALUES (3, 'Rod Tidwell');
INSERT INTO  myTable VALUES (4, 'Bob Sugar');
COMMIT;

Now when we look at the Dolt commit log, we should only see one new Dolt commit, since we explicitly included all three of our insert statements into a single SQL transaction.

SELECT * FROM dolt_log;

+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| commit_hash                      | committer | email                   | date                    | message                    |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
| evnjujc7c9cgn8if43eiii76pvavj3fr | jfulghum  | jason@dolthub.com       | 2023-06-20 23:59:37.196 | Transaction commit         |
| ovbl7vvobimer2f5ighrvgpmt9qks82b | jfulghum  | jason@dolthub.com       | 2023-06-20 23:57:58.647 | Transaction commit         |
| 3lst872g0hvc6g8107ntmd0rc3lqfi3f | jfulghum  | jason@dolthub.com       | 2023-06-20 23:57:52.797 | Transaction commit         |
| bml0np84r21bsmrl9eelpuniaanl3195 | jfulghum  | jason@dolthub.com       | 2023-06-20 23:50:33.623 | Initialize data repository |
+----------------------------------+-----------+-------------------------+-------------------------+----------------------------+
4 rows in set (0.00 sec) 

And sure enough, because we grouped our three SQL statements into a single SQL transaction, @@dolt_transaction_commit created a single Dolt commit, too. We can take a quick look at the diff of that commit using the dolt_diff() table function and confirm that it contains all three of our new rows. dolt_diff() takes a starting and ending revision (a commit, branch, or tag) and a table name and returns a result set that describes the rows that changed in that revision range. Note that here, we're using an ancestor spec (~) to easily refer to the parent commit of the commit we want to diff:

SELECT * FROM dolt_diff('evnjujc7c9cgn8if43eiii76pvavj3fr~', 'evnjujc7c9cgn8if43eiii76pvavj3fr', 'myTable');
+-------+--------------+----------------------------------+-------------------------+---------+-----------+-----------------------------------+-------------------------+-----------+
| to_id | to_name      | to_commit                        | to_commit_date          | from_id | from_name | from_commit                       | from_commit_date        | diff_type |
+-------+--------------+----------------------------------+-------------------------+---------+-----------+-----------------------------------+-------------------------+-----------+
|     2 | Dorothy Boyd | evnjujc7c9cgn8if43eiii76pvavj3fr | 2023-06-20 23:59:37.196 |    NULL | NULL      | evnjujc7c9cgn8if43eiii76pvavj3fr~ | 2023-06-20 23:57:58.647 | added     |
|     3 | Rod Tidwell  | evnjujc7c9cgn8if43eiii76pvavj3fr | 2023-06-20 23:59:37.196 |    NULL | NULL      | evnjujc7c9cgn8if43eiii76pvavj3fr~ | 2023-06-20 23:57:58.647 | added     |
|     4 | Bob Sugar    | evnjujc7c9cgn8if43eiii76pvavj3fr | 2023-06-20 23:59:37.196 |    NULL | NULL      | evnjujc7c9cgn8if43eiii76pvavj3fr~ | 2023-06-20 23:57:58.647 | added     |
+-------+--------------+----------------------------------+-------------------------+---------+-----------+-----------------------------------+-------------------------+-----------+
3 rows in set (0.01 sec)

The results show that in that single commit (evnjujc7c9cgn8if43eiii76pvavj3fr), three new rows were added to the myTable table. The diff_type field shows us that each row was added (not deleted or modified) and the to_id and to_name fields show us the values in the new rows. Because these rows were all new, from_id and from_name are NULL, but if this had been a DELETE or UPDATE statement, those fields would have the values of the rows before they were deleted or updated.

Conclusion

@@dolt_transaction_commit is a handy system variable that automatically creates a Dolt commit for every SQL commit, either implicitly through @@autocommit or by explicitly executing a SQL commit statement. This enables applications to treat Dolt like any other MySQL database but still have Dolt build a versioned history of your data and schema.

With this setting, even if your system isn't "version-aware", you can still get a lot of benefit from using Dolt as your primary database. With @@dolt_transaction_commit, any existing application can automatically build a versioned history of its data and schema changes, without having to make any changes to the application code. All of that versioned history of your data is there for you when you need it. Need to log in and run some one-off queries to restore data that was damaged? That's easy with Dolt. Need to audit how all the data managed by a third-party application has changed over time? Just hook it up to Dolt and let Dolt version all the data and schema changes, and you can easily audit how every row of your data has changed since each one was first created.

If you haven't tried out Dolt yet, give it a shot! The @@dolt_transaction_commit system variable is an easy way to test Dolt with an existing application since you don't have to make any code changes to start building a versioned history. If you have any questions or hit any issues with Dolt or with Dolt's MySQL compatibility, come talk to us on Discord or shoot us an issue on GitHub and we'll be happy to help!

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.