How it Started
For those following along, we've been working on improving Dolt's performance with the goal of making Dolt no more than 2-4 times slower than MySQL. When we set out to measure Dolt's performance we chose Sysbench, a widely used open-source RDBMS performance tool.
Initially, we created a containerized way to compare Dolt's performance to MySQL's and began publishing the benchmarks to our documentation site. This was in response to a growing number of customer requests asking for Dolt performance improvements which were heavily aligned with the "Dolt as an application database" use case.
As part of our performance goals, we wanted to increase our team's output, velocity, and overall performance impact by providing them benchmarking reports on their pull requests. With a single "#benchmark" comment, our developers began receiving reports displaying how their proposed code changes affected Dolt's Sysbench numbers before their changes landed in our main branch.
In a recent post about this new internal feature, I pointed out that one of the challenges we've faced when measuring Dolt's performance against MySQL has been generating consistent results, even across the same version of a database.
When we first started using Sysbench, we looked at the number of transactions that occurred during a 10-second testing window in both Dolt and MySQL, but found that total transactions were too inconsistent in both databases to rely on as a performance baseline.
For example, despite running the same test with the same inputs at the same database version, two testing runs produced:
First Run: 1332 transactions
Second Run: 1438 transactions
The results above show an 8% difference between the first and second run, with some tests showing an even larger percentage difference.
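The 8% figure is the relative difference between the two transaction counts, taking the first run as the baseline. A minimal Python sketch of that arithmetic (the function name `percent_difference` is ours, for illustration only):

```python
def percent_difference(baseline: float, other: float) -> float:
    """Return how far `other` is from `baseline`, as a percentage of `baseline`."""
    return abs(other - baseline) / baseline * 100.0


first_run = 1332   # transactions in the first 10-second run
second_run = 1438  # transactions in the second 10-second run

print(f"{percent_difference(first_run, second_run):.1f}%")  # prints "8.0%"
```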
We also investigated using the average query latency (execution time / total queries) as a representative value of performance, but found this number to be unreliable as well. Although each test was configured to run for 10 seconds, the actual execution times differed slightly. Across two runs with identical inputs we saw:
First Run: 10.0013 seconds
Second Run: 9.9863 seconds
This meant we needed to revisit the Sysbench inputs to find a configuration that produced consistent, low-variance results.
After evaluating total transactions and average query latency as candidate signals for Dolt's performance and finding both unreliable, we succeeded in finding the right set of Sysbench inputs and the right metric to track: the median query latency. Finding this metric has significantly advanced our performance optimization efforts, and we now monitor it closely.
How it's Going
To generate reliable, repeatable benchmarking results we started by increasing the testing execution time window from 10 seconds to 2 minutes, drastically increasing our sample size of executed queries. Then we supplied the --percentile=50 argument to each test during our benchmarking runs. In Sysbench's own words:
Sysbench measures execution times for all
processed requests to display statistical
information like minimal, average and
maximum execution time. For most benchmarks
it is also useful to know a request execution
time value matching some percentile (e.g.
95% percentile means we should drop 5% of the
most long requests and choose the maximal
value from the remaining ones).
--percentile option allows you to specify a percentile
rank of query execution times to count.
For our purposes, these adjustments were crucial. Once we started monitoring the differences between query latency at the 50th percentile across runs, we saw the variance between runs drop to around 5% at worst, often seeing a 0% difference between runs!
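To illustrate why the 50th percentile is steadier than an average, here is a simplified Python sketch of the percentile definition quoted above: drop the slowest requests and take the largest remaining value. This is not Sysbench's implementation; the function name `percentile_latency` and the latency samples are made up for illustration.

```python
import math
from statistics import mean


def percentile_latency(latencies_ms: list[float], percentile: float) -> float:
    """Drop the slowest (100 - percentile)% of requests and return the
    largest latency among those that remain (nearest-rank method)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(latencies_ms)
    keep = max(1, math.ceil(len(ordered) * percentile / 100))
    return ordered[keep - 1]


# Made-up latencies in milliseconds; run_b contains one slow outlier.
run_a = [1.9, 2.0, 2.0, 2.1, 2.1, 2.2, 2.2, 2.3, 2.4, 2.5]
run_b = [1.9, 2.0, 2.0, 2.1, 2.1, 2.2, 2.2, 2.3, 2.4, 40.0]

print("p50:", percentile_latency(run_a, 50), percentile_latency(run_b, 50))
print("mean:", round(mean(run_a), 2), round(mean(run_b), 2))
```

In this toy data the single outlier moves the mean considerably while the 50th percentile is identical for both runs, which is the kind of stability we were after.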
This helps massively with reliably measuring one aspect of the relative performance between Dolt and MySQL, and it's what we're most excited to measure currently, although we acknowledge this isn't the whole performance story.
For example, we're explicitly not measuring differences in the variance of performance between MySQL and Dolt. If there are large differences there, we will miss them. We realize that's something our users would be interested in, and we hope to expand our coverage of relative performance measurements in the future.
Since solidifying our new standard, we also did some work to control for additional conditions that might affect benchmarking results.
At the time of this writing, all published benchmarks are generated on m5a.2xlarge AWS EC2 instances using the base Golang 1.15.x container environment.
Each instance is provisioned on-demand whenever a benchmarking job is scheduled and only a single benchmarking job runs on each instance at a time, ensuring no other jobs consume instance resources.
We initially benchmarked Dolt against MySQL 5.7 but have started benchmarking against MySQL 8.0.22. We connect to MySQL in server mode via a Unix socket and we benchmark against Dolt in server mode (dolt sql-server) connecting via TCP.
Each Sysbench test is run with the following arguments and follows a
We use a collection of Sysbench's built-in OLTP tests as we optimize for the application database use case, but have also added custom tests that measure Dolt's table scanning performance, with and without indexes, as well as Dolt's covering index performance.
You can find the benchmarking results of the current Dolt release here, but even in the short period of time since we began focusing on improving performance, we can see the value of our developers' hard work.
Let's take a look at the benchmarking results for Dolt at version 0.22.2 versus Dolt's latest numbers with our current release, version 0.22.13:
As you can see from the above table, we've made some great performance improvements, indicated by the preceding minus sign in the "Percent Change" column. covering_index_scan has gotten 30% faster and table_scan 47% faster.
You can also see we've had a couple regressions, indicated by the preceding plus sign in the "Percent Change" column.
scan_random_ranges both got slower.
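The sign convention in the "Percent Change" column follows from measuring latency: a lower median latency in the newer release gives a negative change, meaning faster. A minimal Python sketch, where the function name `percent_change` and the latency values are hypothetical and not taken from our benchmark results:

```python
def percent_change(old_latency_ms: float, new_latency_ms: float) -> float:
    """Signed change in latency relative to the old release.

    Negative means the new release is faster; positive means a regression.
    """
    return (new_latency_ms - old_latency_ms) / old_latency_ms * 100.0


# Hypothetical median latencies in milliseconds, for illustration only.
print(f"{percent_change(2.0, 1.5):+.1f}%")  # prints "-25.0%" (improvement)
print(f"{percent_change(2.0, 2.5):+.1f}%")  # prints "+25.0%" (regression)
```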
But what's important here, and what most excites us, is that we now have the infrastructure in place to better track and measure both performance improvements and regressions. Tracking them helps us identify problems and attack them head-on. We are well on our way to reaching Dolt's SQL performance goals and hope to make Dolt the best choice for versioning your data.
Curious about Dolt, DoltHub and the versioned database of the future? There's no better place to get started than DoltHub.com where you can download Dolt, host your own public and private repositories, or just clone some amazing public repositories you won't find anywhere else.
Questions, comments, or looking to start publishing your data in Dolt? Get in touch with our team here!