Dolt Concurrent Transaction Example

REFERENCE
6 min read

Dolt is the world's first version controlled SQL database. Dolt is built on a novel storage engine that makes diffs and merges fast.

When committing a SQL transaction, Dolt uses the same merge logic you invoke on a dolt merge between branches. This produces some subtle differences in the execution of concurrent transactions than MySQL or Postgres. We go into great detail in Neil's excellent blog article on the topic.

Recently, we received a GitHub issue with a simple, salient example using a counter showing the difference. This article works through the example and explains why Dolt and MySQL produce different results.

Lost Update Poster

Example

We're going to create a table with two columns, a primary key id column and a counter column called value, both integers. We'll add a single row with values (0,0). Then, we'll attempt to increment the counter column for that row by one concurrently 10 times.

MySQL

First, let's look at MySQL. Feel free to follow along if you have MySQL on your machine. I have MySQL running on port 3306 on my machine.

I create a new database and table to the specification above. Then I insert a single row.

$ mysql -e "create database counter_example"
$ mysql counter_example -e "create table counter (id int primary key, value int)"
$ mysql counter_example -e "insert into counter values (0,0)"        

Now, I make a bash script that executes ten concurrent updates to the counter column.

$ cat mysql-update.sh           
set -e
set -o pipefail

(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0" &)
(mysql counter_example -e "update counter set value = value+1 where id=0")

Now, I run the script and observe the result.

$ bash mysql-update.sh
$ mysql counter_example -e "select * from counter"           
+------+-------+
| id   | value |
+------+-------+
|    0 |    10 |
+------+-------+

As you can see, MySQL manages to update the table ten times under concurrency producing the value 10. This is probably what most people expect.

Dolt

Now, let's look at what Dolt does. I use Dolt's CLI to initialize the same database as MySQL, a single table and single row. I make some Dolt commits, not to be confused with transaction commits, just in case I want to rollback to those points in the future. At the end, I create a user because that is the default MySQL user and I want to use the same script I used for MySQL.

$ mkdir counter_example
$ cd counter_example 
$ dolt init --fun
Successfully initialized dolt data repository.
$ dolt sql -q "create table counter (id int primary key, value int)"
$ dolt commit -Am "Created counter table"
commit 8gipo4m3sfhrar23m79jbh7kmil67aqb (HEAD -> main) 
Author: timsehn <tim@dolthub.com>
Date:  Tue Dec 12 13:37:35 -0800 2023

        Created counter table

$ dolt sql -q "insert into counter values (0,0)"
Query OK, 1 row affected (0.00 sec)
$ dolt commit -am "Initialized counter at 0"
commit cq8b560aq42bcdkd2u41s2kp19bic3oc (HEAD -> main) 
Author: timsehn <tim@dolthub.com>
Date:  Tue Dec 12 13:38:12 -0800 2023

        Initialized counter at 0

$ dolt sql -q "create user timsehn; grant all on *.* to timsehn"
$ dolt sql-server

Like MySQL, I run the concurrent update script and observe the results.

$ bash mysql-update.sh  
$ dolt sql -q "select * from counter"
+----+-------+
| id | value |
+----+-------+
| 0  | 2     |
+----+-------+

WTF! Dolt missed eight updates. Or did it? This is in fact, expected behavior. Let me explain.

WTF?

What's going on?

As I said in the opener, Dolt uses the same merge logic when committing transactions as it does when you invoke dolt merge between branches. What is this merge logic?

When merging, Dolt looks at the values of a row, column pair (ie. cell) identified by primary key. If two branches or transactions modify different cells, the merge succeeds without conflicts. If two branches or transactions modify the same cell to different values, there is a merge conflict and the merge fails. In the case of dolt merge the user has the chance to manually fix the merge. In the case of transactions, the commit fails and the transaction is rolled back. Finally, and importantly for this example, if the two branches modify the same cell to the same value, the merge succeeds without conflicts.

In MySQL, the default transaction isolation level is READ COMMITTED. Transaction isolation is implemented with row-level locks. If a transaction acquires a lock on a row, other transactions that want a lock on that row wait for the lock to be released and then execute. Which rows get locked based on the transaction gets complicated.

Let's dive into each of the three states described in dolt merge and see how these are different than MySQL.

If two branches/transactions modify different cells the merge succeeds without conflicts.

Dolt and MySQL produce the same results, though the mechanism is different.

For transactions that modify different rows, both engines can execute those updates in parallel. In Dolt, transactions that modify the same row but different columns in the row can be committed and will merge without conflicts. In MySQL, the second concurrent transaction on a row will queue and execute the second update on the new results.

If two branches/transactions modify the same cell to different values, there is a merge conflict and the merge fails.

Dolt and MySQL do something different here.

In Dolt, this second transaction fails because there are conflicting writes. In MySQL because of row-level locking, the second transaction overwrites the first one, in a "last write wins" fashion. To disable this behavior in MySQL, you use SELECT FOR UPDATE. If the row is locked by another transaction, your transaction will fail. Dolt does not implement SELECT FOR UPDATE because this is the default behavior.

If the two branches/transactions modify the same cell to the same value, the merge succeeds without conflicts.

Again, Dolt and MySQL produce the same results though the mechanism is different.

In Dolt, if two transactions update a cell to the same value, the transaction succeeds. This case is impossible in MySQL under row-level locking. If two transactions update the same cell to the same value in MySQL, two updates are performed, the second update is just a no-op.

Back to the Example

Now that you understand that MySQL transactions depend on row-level locking it is easy to see how MySQL was able to process every update. Each transaction queued until the previous one was done, read the current value, and incremented the counter.

What Dolt did is a bit more complicated. Let's inspect the trace logs for the example above. The trace logs are very verbose so they are linked here.

The sequence of events illustrated in the logs is as follows:

  1. Eight Connections (3 through 10 connection IDs) open immediately and issue their update queries.
  2. The Server opens the last two connections (IDs 11 and 12) before receiving all the queries from the first eight connections.
  3. Merges begin to happen for all connections, notably connection with IDs 9 and 10 happen before ID 7.
  4. Connection ID 7 fails with a merge conflict indicating its merge base value is 0, incremented to 1, but when it tried to commit the value was already 2. Note this is a warning in the logs, not a failure.
  5. Connection IDs 11 and 12 finish updating the value from 1 to 2.

As you can see, the final value of the value column is 2.

$ dolt sql -q "select * from counter"
+----+-------+
| id | value |
+----+-------+
| 0  | 2     |
+----+-------+

Lost Update Problem

Readers familiar with transaction isolation levels will know that the example used in this article is an acute case of the "Lost Update" problem. Dolt is more susceptible to lost updates because Dolt uses merge to handle transactions rather than locks.

Lost Update

Dolt is REPEATABLE READ not SERIALIZABLE.

Transaction Isolation Levels

Conclusion

As you can see from this simple concurrently updating counter example, Dolt has different transaction handling semantics than MySQL. Dolt uses dolt merge logic to process concurrent transactions rather than row-level locking. As such, Dolt is more susceptible to the "lost update" problem. It is on our roadmap to implement row-level locks so users can mimic MySQL transaction behavior if needed. Stay Tuned! Want to upvote row-level locks? Come chat with us on our Discord.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.