Database Performance: Dolt vs MySQL

December 4, 2020

5 min read

Dolt is a version controlled SQL database. Dolt's query interface is SQL, and it has Git-like version control features. Adding version control features to a SQL database has performance trade offs when comparing Dolt with traditional databases like MySQL. In particular relational databases use highly optimized storage layouts that are motivated query engine performance. Implementing version control features requires making compromises about the storage layout that impact query performance. We believe that the explosion of data centric business processes and applications justifies a database that makes this tradeoff.

The purpose of this post is to show how we measure those performance tradeoffs and set clear expectations on Dolt performance. Our goal is to get Dolt to the point where it is no more than 2-4 times slower than MySQL. We spend the balance of this post showing how we measuring ourselves against that goal.

Background

Database solutions, for example MySQL and Postgres, are designed to be application backing stores. More precisely they are optimized for online transaction processing (OLTP) use cases, defined loosely as as:

Processing in which the system responds immediately to user requests.

Dolt's version control features were designed with a different set of use-cases in mind, most of which are not OLTP. For example, we are currently onboarding a customer using Dolt as a backend for researcher output. They use Dolt's branching functionality to compare results submissions before combining those submissions to a master copy. We have other users who are using Dolt as an ETL ingestion point, ensuring that all third party data is robustly versioned. Dolt's commit graph acts as a "configurable boundary" for their organization, providing users with simple tools for controlling what data comes through that boundary.

Despite early adoption from non-OLTP use case, users still want fast query performance. Faster is obviously better. We think as Dolt gets faster, Dolt could make sense for some OLTP use cases that are willing to trade performance for version control features. We are really interested in supporting as many use cases as we can long term. Thus, we are committed to making Dolt as fast as possible.

Expectations

As stated at the outset, our goal is to get Dolt to the point where it is no more than 2-4 times slower than MySQL for sysbench's standard test suite of OLTP and non-OLTP tests. Additionally, we will implement custom tests that emphasize common use-cases for Dolt, which we expect to be no more than 2 times slower than MySQL. Finally, we consider anything that is 10 times slower than MySQL a bug that we will prioritize fixing.

Speed Multiple	Response
2-4	acceptable
4-10	not urgent, but needs work
10-20	bug
>20	urgent bug

Approach

We recently blogged about our approach to benchmarking Dolt, though our documentation is the place to go for the latest information. To recap, we use sysbench, an industry standard tool for benchmarking databases to evaluate both Dolt and MySQL using the same set of tests.

For example, our most recent release is 0.22.6, and we wanted to compare this to 0.22.2, the prior release. We ran the following command, noting that $DOLT_CHECKOUT is a checkout of Dolt's GitHub repository:

$ cd $DOLT_CHECKOUT/benchmark/perf_tools
$ ./run_benchmarks.sh all 100000 oscarbatori 0.22.2 0.22.6

At a high level, for each tag and MySQL this:

runs the database in a Docker container
runs sysbench in a separate Docker container
collects the results in a file

All the results, for MySQL and each Dolt tag, are associated with a single run ID which denotes an invocation of run_benchmarks.sh. This means that we can identify rows that were run on the same hardware at the same time, giving us a degree of hardware context isolation. For example, this run was associated with the unique run ID 7dffeeb22fb11efaec478ef3, which identifies the results:

$ ls -ltr output
-rw-r--r--  1 oscarbatori  staff  6846 Dec  4 11:13 7dffeeb22fb11efaec478ef3.csv

The all parameter denotes the following list of tests, though passing a comma separated list of tests also works for more targeted benchmarking:

bulk_insert
oltp_delete
oltp_insert
oltp_point_select
oltp_read_only
oltp_read_write
oltp_update_index
oltp_update_non_index
oltp_write_only
select_random_points
select_random_ranges

This is the complete list of standard sysbench tests. You can find these results posted to DoltHub, where you can run SQL against them.

Results

Here we present the results as they appear in the Dolt documentation:

Test	Dolt	MySQL	Multiple
bulk_insert	308783	2278717	7.4
oltp_delete	883	21847	24.7
oltp_insert	1302	6099	4.7
oltp_point_select	6401	32925	5.1
oltp_read_only	268	1894	7.1
oltp_read_write	90	1328	14.8
oltp_update_index	496	6199	12.5
oltp_update_non_index	803	5971	7.4
oltp_write_only	141	3335	23.7
select_random_points	457	4101	9
select_random_ranges	97	6109	63
mean			19.5

This analysis highlighted the select_random_ranges result as particularly bad, and it will in fact be fixed in the next release.

In the previous section we pointed at these results hosted on DoltHub. We can easily clone and analyze the results in SQL as follows to compare how performance has changed between the two most recent versions:

$ dolt clone dolthub/dolt-benchmarks && dolt-benchmarks
cloning https://doltremoteapi.dolthub.com/dolthub/dolt-benchmarks
185 of 185 chunks complete. 0 chunks being downloaded currently.
$ dolt sql
~/Documents/dolt-dbs/dolt-benchmarks/test/dolt-benchmarks|>>  dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
dolt_benchmarks> SELECT
              ->   dolt_0222.run_id as run_id,
              ->   dolt_0222.test_name as test_name,
              ->   dolt_0222.sql_transactions as dolt_0222,
              ->   dolt_0226.sql_transactions as dolt_0226,
              ->   mysql.sql_transactions as mysql,
              ->   ROUND((1.0 * mysql.sql_transactions) / dolt_0222.sql_transactions, 1) as dolt_0222_multiple,
              ->   ROUND((1.0 * mysql.sql_transactions) / dolt_0226.sql_transactions, 1) as dolt_0226_multiple
              -> FROM
              ->   sysbench_benchmark as dolt_0222
              ->   inner join sysbench_benchmark as dolt_0226
              ->     on dolt_0222.run_id = dolt_0226.run_id
              ->     and dolt_0222.test_name = dolt_0226.test_name
              ->   inner join sysbench_benchmark as mysql
              ->     on dolt_0222.run_id = mysql.run_id
              ->     and dolt_0222.test_name = mysql.test_name
              ->
              -> WHERE
              ->   dolt_0222.run_id = '7dffeeb22fb11efaec478ef3'
              ->   and dolt_0222.database = 'dolt'
              ->   and dolt_0222.committish = '0.22.2'
              ->   and dolt_0226.database = 'dolt'
              ->   and dolt_0226.committish = '0.22.6'
              ->   and mysql.database = 'mysql'
              -> ORDER BY
              ->   dolt_0222;
+-----------------------+-----------+-----------+---------+--------------------+--------------------+
| test_name             | dolt_0222 | dolt_0226 | mysql   | dolt_0222_multiple | dolt_0226_multiple |
+-----------------------+-----------+-----------+---------+--------------------+--------------------+
| oltp_insert           | 1302      | 1167      | 6099    | 4.7                | 5.2                |
| oltp_point_select     | 6401      | 4819      | 32925   | 5.1                | 6.8                |
| oltp_read_only        | 268       | 228       | 1894    | 7.1                | 8.3                |
| bulk_insert           | 308783    | 308783    | 2278717 | 7.4                | 7.4                |
| oltp_update_non_index | 803       | 767       | 5971    | 7.4                | 7.8                |
| select_random_points  | 457       | 400       | 4101    | 9                  | 10.3               |
| oltp_update_index     | 496       | 461       | 6199    | 12.5               | 13.4               |
| oltp_read_write       | 90        | 88        | 1328    | 14.8               | 15.1               |
| oltp_write_only       | 141       | 122       | 3335    | 23.7               | 27.3               |
| oltp_delete           | 883       | 789       | 21847   | 24.7               | 27.7               |
| select_random_ranges  | 97        | 93        | 6109    | 63                 | 65.7               |
+-----------------------+-----------+-----------+---------+--------------------+--------------------+

Conclusion

Our conclusion from this is that while Dolt performance has improved, there is still a long way to go. As we move further along with our implementation it will become progressively harder to find large performance improvements. This post laid out our performance commitments to our users and customers, present and future, and demonstrated our mechanism for repeatably measuring ourselves against that commitment with every release of Dolt.

Blog

Background

Expectations

Approach

Results

Conclusion

Get started with Dolt