Branch Activity is Slow - Opt In!

Here at DoltHub.com, we care a lot about the performance of the Dolt database. Which is why I’m here to tell you, with great sadness, that we disabled a feature I shared with you a few weeks ago. You can optionally enable it, so not all is lost. Read more below.

The dolt_branch_activity table provides information to administrators to determine how often a branch is being used on a given server. It allows for cleaning up dead work, managing connection pools, and whatever makes sense for your system. It was also announced as “always on” because it seemed like it would be useful to people.

Not anymore! After the feature landed, our nightly performance testing jobs went to work, and in the morning, we had a 3% regression in write query performance. Well, it’s not quite that simple. James has been grinding on performance of Dolt for a while now, and the goal is to get us as fast as MySQL for both reads and writes. We are getting close. This is all to say that James pays very close attention to the multiple performance tests we do regularly, and when he thought some of his improvements were going to deliver the goods, my change wiped out his hard work. After pondering why the gods hate him, he brought up my change as a possible culprit. I wasn’t surprised. Let’s discuss why.

Hot Code Path

The dolt_branch_activity table provides the timestamp of the last read and write to each branch. This is information that needs to be gathered on hot code paths. It’s kind of common sense - if you want to track the time of the last write, you should add code which gets hit on every single write. Less obvious is that this data is not on the critical path. If the timestamp of branch usage is a little delayed or if you miss an event, your queries will still be correct. To account for these two facts, I decided to use a Go channel as a receiver of these events. The code roughly looks like this:

	// Buffered channel: holds up to 64 pending events.
	activityChan := make(chan branchActivityEvent, 64)

...

	select {
	case t.activityChan <- branchActivityEvent{
		database:  database,
		branch:    branch,
		timestamp: time.Now(),
		eventType: WRITE,
	}:
	default:
		// Lots of traffic. drop the event
	}

Many read and write operations can be happening at the same time, and when they complete they drop a branchActivityEvent in the channel. A single goroutine reads from the channel, and the work it does is basically hash table reads and updates, so it’s easy to keep pace. In the unlikely situation where it can’t keep up and 64 events are already queued, the writers quickly skip without inserting anything.
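To make the pattern concrete, here is a minimal, self-contained sketch of a non-blocking producer paired with a single consumer. It is not Dolt's actual implementation: the tracker type, its methods, the field types, and the map key format are all assumptions made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

type eventType uint8

const (
	READ eventType = iota
	WRITE
)

// branchActivityEvent uses the field names from the post; the field types are assumptions.
type branchActivityEvent struct {
	database  string
	branch    string
	timestamp time.Time
	eventType eventType
}

// tracker is a hypothetical wrapper around the channel and its single consumer.
type tracker struct {
	activityChan chan branchActivityEvent
	lastWrite    map[string]time.Time // touched only by consume until close returns
	done         chan struct{}
}

func newTracker() *tracker {
	tr := &tracker{
		activityChan: make(chan branchActivityEvent, 64),
		lastWrite:    make(map[string]time.Time),
		done:         make(chan struct{}),
	}
	go tr.consume()
	return tr
}

// consume is the single reader: it drains the channel and updates the map.
func (tr *tracker) consume() {
	defer close(tr.done)
	for ev := range tr.activityChan {
		if ev.eventType == WRITE {
			tr.lastWrite[ev.database+"/"+ev.branch] = ev.timestamp
		}
	}
}

// recordWrite never blocks. It reports whether the event was queued.
func (tr *tracker) recordWrite(database, branch string) bool {
	select {
	case tr.activityChan <- branchActivityEvent{
		database:  database,
		branch:    branch,
		timestamp: time.Now(),
		eventType: WRITE,
	}:
		return true
	default:
		return false // buffer is full: drop the event
	}
}

// close stops the consumer and waits for it to finish draining the channel.
func (tr *tracker) close() {
	close(tr.activityChan)
	<-tr.done
}

func main() {
	tr := newTracker()
	queued := tr.recordWrite("mydb", "main")
	tr.close()
	_, recorded := tr.lastWrite["mydb/main"]
	fmt.Println("queued:", queued, "recorded:", recorded)
}
```

The default case is what keeps the hot path from ever waiting on the consumer: a full buffer costs a dropped event, not a stalled query.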

This was all done in order to avoid affecting performance. Nevertheless, the data shows that when this change is tested in isolation, it adds 2%-3% to the write query times. Unacceptable!

Why the Slowdown?

The amount of code added on these hot code paths was not algorithmically complex, so there must be something else going on.

The first thing to rule out was contention. When you have multiple writers to a channel, they each need to acquire a lock before they can proceed. If there were dozens of concurrent operations happening, then inserting into the channel would be slow. One of the reasons I used a fixed-size channel was to keep this lock acquisition to a minimum. Alas, this root cause was easy to eliminate: the single-threaded performance test harness doesn’t have any contention. Ok, we can rule that out.

The second possibility to rule out was whether the branchActivityEvent was large enough to matter. If so, copying the bytes from the stack into the channel may be the culprit. The database and branch fields are strings, typically under 20 bytes of data. The eventType is just a single byte. The timestamp value was the last one, and just to be sure it wasn’t too big, I ran this piece of code:

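package main

import ("fmt"; "time"; "unsafe")

func main() {
	var t time.Time // zero value is enough; only the type's size matters
	fmt.Printf("Size of time.Time: %d bytes\n", unsafe.Sizeof(t))
}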

And when I ran it, I saw that it is 24 bytes:

Size of time.Time: 24 bytes

OK, so even if we had really long branch names or database names, the total amount of data copied during the channel insert is under 100 bytes. I feel pretty confident that’s not the problem. Just to be sure though, James tested using a pointer to avoid the memory copy and it made no difference. Ok, we can rule that out too.
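The same check can be extended to the whole event. The sketch below assumes the field types (the post only names the fields) and prints the struct size for short and long names. A Go string value is a small pointer-and-length header, so sending the struct by value copies the headers and not the name bytes. Both lines should therefore report the same size.

```go
package main

import (
	"fmt"
	"strings"
	"time"
	"unsafe"
)

// branchActivityEvent uses the field names from the post; the field types are assumptions.
type branchActivityEvent struct {
	database  string
	branch    string
	timestamp time.Time
	eventType uint8
}

// eventSize returns how many bytes are copied when ev is sent by value on a channel.
// The result depends only on the struct type, not on the contents of the strings.
func eventSize(ev branchActivityEvent) uintptr {
	return unsafe.Sizeof(ev)
}

func main() {
	short := branchActivityEvent{database: "mydb", branch: "main"}
	long := branchActivityEvent{
		database: strings.Repeat("d", 1000),
		branch:   strings.Repeat("b", 1000),
	}
	fmt.Println("short names:", eventSize(short), "bytes")
	fmt.Println("long names: ", eventSize(long), "bytes")
}
```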

Which brings us to inserting into a channel. This is really the third and last option for where the performance penalty is coming from, and we can’t actually work around it in any way. It’s pretty foundational to the language, and I chose channels specifically because they are supposed to help with this kind of access pattern.

Without ever looking at the code, I assumed there was at least a mutex acquired and some other overhead to insert into a channel. Then I looked at the code and it’s actually a lot more complicated than that. This shouldn’t really come as a surprise. Channels aren’t just queues - they also need to sleep threads and wake them up when appropriate. They are also a critical part of the Go language, and many smart people have been hammering on that thing for years.
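One way to get a rough feel for the per-event cost is a micro-benchmark along these lines. This is only a sketch under assumptions: the event type and both benchmark functions are made up for illustration, it is not the nightly test harness mentioned above, and the numbers will vary by machine and Go version. It compares building an event and sending it without blocking against building the same event without sending it.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

// branchActivityEvent uses the field names from the post; the field types are assumptions.
type branchActivityEvent struct {
	database  string
	branch    string
	timestamp time.Time
	eventType uint8
}

// sink keeps the compiler from optimizing the baseline loop away.
var sink branchActivityEvent

func newEvent() branchActivityEvent {
	return branchActivityEvent{database: "mydb", branch: "main", timestamp: time.Now(), eventType: 1}
}

// benchSend performs one non-blocking send per iteration while a single
// goroutine drains the channel.
func benchSend(b *testing.B) {
	ch := make(chan branchActivityEvent, 64)
	done := make(chan struct{})
	go func() {
		defer close(done)
		for range ch {
		}
	}()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		select {
		case ch <- newEvent():
		default:
			// Buffer full: drop the event.
		}
	}
	b.StopTimer()
	close(ch)
	<-done
}

// benchBaseline builds the same event, including time.Now(), without sending it.
func benchBaseline(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = newEvent()
	}
}

func main() {
	send := testing.Benchmark(benchSend)
	base := testing.Benchmark(benchBaseline)
	fmt.Printf("build + send: %d ns/op\n", send.NsPerOp())
	fmt.Printf("build only:   %d ns/op\n", base.NsPerOp())
}
```

The difference between the two lines approximates what the channel operation adds on top of constructing the event. Whether that difference matters depends on how cheap the surrounding query path already is.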

I’m left with the takeaway that I can’t use channels on Dolt’s primary query path. Doing so will hit our performance more than we are willing to accept. This is especially true when you consider that contention on the channel will be a concern with a server running real traffic. That 3% performance regression could balloon to bring your server to its knees if we aren’t careful. So we decided to turn this behavior off and users are now forced to turn it on if they want it.

You Want Branch Activity Back?

If you attempt to use the dolt_branch_activity table in the latest version of Dolt, you’ll get an error:

mydb/main> select * from dolt_branch_activity;
Error 1105 (HY000): branch activity tracking is not enabled; enable it in the server config with 'behavior.branch_activity_tracking: true'

If you aren’t desperate for every cycle on your server and you want this behavior, you can enable it in the server config. There is a new behavior field:

behavior:
  branch_activity_tracking: true

This field doesn’t show up in the default generated config.yaml, so you’ll need to add it. When you’ve updated the file, start your server with:

dolt sql-server --config config.yaml

Then run the same query again:

mydb/main> select * from dolt_branch_activity;
+--------+---------------------+------------+-----------------+---------------------+
| branch | last_read           | last_write | active_sessions | system_start_time   |
+--------+---------------------+------------+-----------------+---------------------+
| main   | 2025-11-18 20:38:19 | NULL       | 1               | 2025-11-18 20:38:15 |
+--------+---------------------+------------+-----------------+---------------------+

Conclusion

We are very close to having performance parity with MySQL, and we aren’t going to sacrifice that for nice-to-have bells and whistles. While features are important and I love creating them for you, for this particular one you’ll need to opt in. We’ll continue to keep the bar high, and that’s why you can depend on Dolt.

Come tell us what you’re building on our Discord server!
