Maxwell's Daemon Works with Dolt
Dolt is the world's first version-controlled relational database. Dolt lets you branch, fork, clone, merge, and diff your relational data, in all the same ways Git lets you work with your files. Because Dolt implements MySQL's dialect and wire protocol (without actually including any MySQL code), you can use Dolt with any tool that works with MySQL. In today's blog, we're looking at a popular tool for working with MySQL replication streams and showing how it works with Dolt.
Maxwell’s Daemon, often referred to simply as "Maxwell", is a change-data-capture (CDC) tool developed by Zendesk that reads from a MySQL replication stream and converts the data changes into structured JSON events. These JSON events are then delivered to streaming platforms such as Kafka, Amazon Kinesis, RabbitMQ, Google Pub/Sub, Redis, and many more. Designed for efficient and real-time data processing, Maxwell’s Daemon is useful for audit logging, cache invalidation, search index updates, ETL pipelines, inter-service communication, and many other scenarios. It often functions as a convenient adapter between a relational data store and a streaming data processing system.
In the rest of this blog post, we show how to set up Maxwell's Daemon with Dolt and show some data passing through the system.
Demo
Configure a Dolt SQL Server
The first thing we need to do is get a Dolt SQL server set up for Maxwell's Daemon to connect to it. If you don't have Dolt installed on your system yet, you can follow these instructions for your platform.
Once you've got the dolt
tool install, create a new directory and start up a Dolt SQL server in it:
mkdir dolt && cd dolt
dolt sql-server --port 11229 --loglevel DEBUG
In a new terminal session, make sure you cd
to the same dolt
directory where we started the dolt sql-server
above, then start a SQL shell:
dolt sql
Once you're connected to the SQL shell, run the following commands:
CREATE USER 'maxwell'@'%' IDENTIFIED BY 'XXXXXX';
CREATE USER 'maxwell'@'localhost' IDENTIFIED BY 'XXXXXX';
GRANT ALL ON maxwell.* TO 'maxwell'@'%';
GRANT ALL ON maxwell.* TO 'maxwell'@'localhost';
GRANT CREATE, SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'%';
GRANT CREATE, SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'maxwell'@'localhost';
SET @@PERSIST.gtid_mode='ON';
SET @@PERSIST.enforce_gtid_consistency='ON';
SET @@PERSIST.log_bin=1;
SET @@PERSIST.binlog_ignore_dbs="maxwell";
The commands above create a new user account, named maxwell
, and grants it privileges to create tables, run select queries, and request a replication stream. Note that two separate user accounts are created: maxwell@%
and maxwell@localhost
. MySQL treats the %
symbol as a wildcard that matches any host, except for localhost
, which is why we also create the maxwell@localhost
user account. If we only created the maxwell@%
user, then we wouldn't be able to connect from localhost.
The last four statements configure system variables to enable and configure replication. gtid_mode
, enforce_gtid_consistency
, and log_bin
must all be enabled for the Dolt SQL server to stream replication changes.
Note that we're setting all these system variables with SET PERSIST
so that the values are persisted to disk and will be present when the SQL server is restarted. Without setting these persistently, when we restart the SQL server, these values would not be present and replication would not be enabled.
Now that we've configured our SQL server to allow Maxwell's Daemon to connect, we need to restart the SQL server so that the replication settings are picked up and replication becomes enabled. Back in the terminal running dolt sql-server
use CTRL-C to stop the SQL server and then restart the SQL server:
dolt sql-server --port 11229 --loglevel DEBUG
You should see output similar to the output below. In particular, look for the "Enabling binary logging for branch main" INFO message to confirm that binary logging and replication are enabled.
Starting server with Config HP="localhost:11229"|T="28800000"|R="false"|L="debug"|S="/tmp/mysql.sock"
DEBU[0000] Loading events
INFO[0000] Enabling binary logging for branch main
DEBU[0000] no available GTIDs found in logs, setting gtid_purged to:
DEBU[0000] setting availableGtids to after removing purgedGtids
DEBU[0000] privileges.db already exists, not creating root superuser
INFO[0000] Server ready. Accepting connections.
WARN[0000] secure_file_priv is set to "", which is insecure.
WARN[0000] Any user with GRANT FILE privileges will be able to read any file which the sql-server process can read.
WARN[0000] Please consider restarting the server with secure_file_priv set to a safe (or non-existent) directory.
Once the SQL server starts up, go back to your other terminal with the SQL shell connection and run the following command to confirm that the correct settings have been configured:
SHOW VARIABLES WHERE Variable_Name LIKE '%gtid_mode' OR Variable_Name LIKE '%enforce_gtid_consistency' OR Variable_Name LIKE '%binlog_format' OR Variable_Name LIKE 'server_id' OR Variable_Name LIKE 'log_bin' OR Variable_Name LIKE 'binlog_ignore_dbs';
+--------------------------+---------+
| Variable_name | Value |
+--------------------------+---------+
| binlog_format | ROW |
| enforce_gtid_consistency | ON |
| gtid_mode | ON |
| log_bin | ON |
| binlog_ignore_dbs | maxwell |
| server_id | 1 |
+--------------------------+---------+
If your settings match the values above, then you're ready to move on to the next section and start up Maxwell's Daemon.
Start Maxwell's Daemon
Once your Dolt SQL server is configured and replication is enabled, you're ready to start up Maxwell's Daemon. Download Maxwell's Daemon and unzip it into a location on your disk. In that unzipped directory, you'll find the maxwell
command in the bin
directory. You can add the bin
directory to your PATH
environment variable, or just use the full path when you invoke maxwell
. On my system, I added it to my PATH
with the following command:
SET PATH=$PATH:/Users/jason/Downloads/maxwell-1.41.2/bin
When you're ready, run this command in a new terminal shell to start Maxwell's Daemon:
maxwell --user='maxwell' --password='XXXXXX' --host='127.0.0.1' --port=11229 --producer=stdout --gtid_mode
Let's walk through the options we specify for maxwell
:
--user='maxwell'
and--password='XXXXXX'
– this is the account name and password used to authenticate to our SQL server.--host='127.0.0.1'
and--port=11229
– this tells Maxwell's Daemon to connect to localhost on port 11229 to look for our SQL server.--producer=stdout
– this specifies which "producer" to use. Here we use the producer that dumps data change events out to stdout, but many, many other producers exist.--gtid_mode
– this parameter tells Maxwell's Daemon to use GTIDs to position itself in the replication stream. This argument is required, because Dolt only supports GTID-based positioning for replication.
If Maxwell starts up correctly, you should see output similar to this:
2025-08-26 14:06:32 INFO Maxwell - Starting Maxwell. maxMemory: 17179869184 bufferMemoryUsage: 0.25
2025-08-26 14:06:32 INFO SchemaStoreSchema - Creating maxwell database
2025-08-26 14:06:32 INFO Maxwell - Maxwell v1.41.2 is booting (StdoutProducer), starting at Position[BinlogPosition[], lastHeartbeat=0]
2025-08-26 14:06:32 INFO AbstractSchemaStore - Maxwell is capturing initial schema
2025-08-26 14:06:32 INFO BinlogConnectorReplicator - Setting initial gtid to:
2025-08-26 14:06:32 INFO BinaryLogClient - Connected to 127.0.0.1:11229 at (sid:6379, cid:17)
2025-08-26 14:06:32 INFO BinlogConnectorReplicator - Binlog connected.
Testing with Data
Now that we've got a replication connection between our Dolt SQL server and the maxwell
tool, let's make some data changes and see how they get logged in the terminal running maxwell
. Remember that in our simple demo, we just used the stdout
producer, which dumps all change events to stdout so we can see them in the terminal.
Go back to the terminal with the SQL shell to our Dolt SQL server and create a new database for us to test with:
CREATE DATABASE myReplicatedDb;
USE myReplicatedDb;
Back in the maxwell
terminal, we see one new line of output logged:
2025-08-26 14:13:13 INFO AbstractSchemaStore - storing schema @Position[BinlogPosition[8e66e4f4-955a-4844-909a-33d79f78ddba:1-1], lastHeartbeat=0] after applying "create database `myReplicatedDb`;" to myReplicatedDb, new schema id is 2
Now let's go back to the terminal with our Dolt SQL shell and create some data:
CREATE TABLE famousPhysicists(pk int primary key, name varchar(200));
INSERT INTO famousPhysicists VALUES (1, 'Maxwell'), (2, 'Faraday'), (3, 'Hertz');
CALL dolt_commit('-Am', 'adding initial famous physicists');
Back in the maxwell
terminal, we see new output describing the changes that just happened:
2025-08-26 14:23:55 INFO AbstractSchemaStore - storing schema @Position[BinlogPosition[8e66e4f4-955a-4844-909a-33d79f78ddba:1-2], lastHeartbeat=0] after applying "CREATE TABLE `famousPhysicists` ( `pk` int NOT NULL, `name` varchar(200), PRIMARY KEY (`pk`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;" to myReplicatedDb, new schema id is 3
{"database":"myReplicatedDb","table":"famousPhysicists","type":"insert","ts":1756242216,"xid":0,"xoffset":0,"data":{"pk":1,"name":"Maxwell"}}
{"database":"myReplicatedDb","table":"famousPhysicists","type":"insert","ts":1756242216,"xid":0,"xoffset":1,"data":{"pk":2,"name":"Faraday"}}
{"database":"myReplicatedDb","table":"famousPhysicists","type":"insert","ts":1756242216,"xid":0,"commit":true,"data":{"pk":3,"name":"Hertz"}}
The first line indicates a DDL or schema change and includes the CREATE TABLE
statement we executed on Dolt. The next three lines show the data we inserted – one JSON entry for each row inserted. Looking closer at the data in these JSON structures, we see the top level fields describe metadata, such as which database and which table were changed, as well as what type of change it is (i.e. insert
, update
, or delete
). The data
field shows us the exact values in the row that were inserted.
Let's jump back to the terminal with our Dolt SQL shell and try modifying some data. This time, we'll add a new column and update the rows to insert its value:
ALTER TABLE famousPhysicists ADD COLUMN birthdate date;
UPDATE famousPhysicists SET birthdate='1831-06-13' where name='Maxwell';
UPDATE famousPhysicists SET birthdate='1857-02-22' where name='Hertz';
UPDATE famousPhysicists SET birthdate='1791-09-22' where name='Faraday';
CALL dolt_commit('-am', 'adding birthdates');
Now, back in the maxwell
terminal, we see four new lines of output describing the changes we just made:
2025-08-26 15:00:48 INFO AbstractSchemaStore - storing schema @Position[BinlogPosition[8e66e4f4-955a-4844-909a-33d79f78ddba:1-8], lastHeartbeat=0] after applying "ALTER TABLE `famousPhysicists` ADD `birthdate` date;" to myReplicatedDb, new schema id is 8
{"database":"myReplicatedDb","table":"famousPhysicists","type":"update","ts":1756242216,"xid":0,"commit":true,"data":{"pk":1,"name":"Maxwell","birthdate":"1831-06-13"},"old":{"birthdate":null}}
{"database":"myReplicatedDb","table":"famousPhysicists","type":"update","ts":1756242216,"xid":0,"commit":true,"data":{"pk":3,"name":"Hertz","birthdate":"1857-02-22"},"old":{"birthdate":null}}
{"database":"myReplicatedDb","table":"famousPhysicists","type":"update","ts":1756242216,"xid":0,"commit":true,"data":{"pk":2,"name":"Faraday","birthdate":"1791-09-22"},"old":{"birthdate":null}}
The first line describes the schema change and includes the ALTER TABLE
statement we executed. The next three lines are JSON structures that describe each row that changed. Notice that this time, the type is marked as an update
and we see the new values for all the fields in the row, as well as a new old
field that shows the old value for any fields that were updated. You could imagine a processing system receiving these events, looking at the changes, and taking an action, such as populating a cache, storing an audit log, or even firing an alarm for some types of data changes.
Let's go back to the terminal with our Dolt SQL shell one more time and see what happens when we use some of Dolt's version control features to change our data and see how those changes appear to maxwell
. Dolt's version control features mirror Git's version control features, so you have stored procedures that look like: dolt_reset()
, dolt_revert()
, dolt_merge()
, dolt_branch()
, dolt_push()
, dolt_pull()
, etc. In our example, we're going to look at our commit log, then use dolt_reset()
to move our commit pointer back one commit.
select commit_hash, message from dolt_log;
+----------------------------------+----------------------------------+
| commit_hash | message |
+----------------------------------+----------------------------------+
| 94818emjsfargamndnmuclvc526e50h6 | adding birthdates |
| 41l6jcn8gt231tkuttq1po4dgb4lea75 | adding initial famous physicists |
| ljgtpfqjp83l9pjrc9nkrttsa15op8g2 | Initialize data repository |
+----------------------------------+----------------------------------+
3 rows in set (0.00 sec)
Now we'll use dolt_reset()
to move the HEAD
pointer for the main branch back to the previous commit, 41l6jcn8gt231tkuttq1po4dgb4lea75
, where we first created the table and inserted some sample data, but before where we added birthdates.
call dolt_reset('--hard', '41l6jcn8gt231tkuttq1po4dgb4lea75');
+--------+
| status |
+--------+
| 0 |
+--------+
1 row in set (0.02 sec)
Just to sanity check, let's look at the famousPhysicists
table and confirm the data looks like what we expect:
select * from famousPhysicists;
+----+---------+
| pk | name |
+----+---------+
| 1 | Maxwell |
| 2 | Faraday |
| 3 | Hertz |
+----+---------+
3 rows in set (0.00 sec)
Yup, this is what our table looked like before we added the birthdate column. Let's jump back over to our maxwell
terminal and see what it logged:
2025-08-26 15:07:58 INFO AbstractSchemaStore - storing schema @Position[BinlogPosition[8e66e4f4-955a-4844-909a-33d79f78ddba:1-12], lastHeartbeat=0] after applying "ALTER TABLE `famousPhysicists` DROP `birthdate`;" to myReplicatedDb, new schema id is 9
{"database":"myReplicatedDb","table":"famousPhysicists","type":"update","ts":1756242216,"xid":0,"xoffset":0,"data":{"pk":1,"name":"Maxwell"}}
{"database":"myReplicatedDb","table":"famousPhysicists","type":"update","ts":1756242216,"xid":0,"xoffset":1,"data":{"pk":2,"name":"Faraday"}}
{"database":"myReplicatedDb","table":"famousPhysicists","type":"update","ts":1756242216,"xid":0,"commit":true,"data":{"pk":3,"name":"Hertz"}}
Similar to the last change events we looked at, the first line describes the SQL ALTER TABLE
statement that modifies our table to the current schema. Note that we didn't explicitly execute this SQL command – it's just the SQL equivalent to describe the result of the dolt_reset()
command we executed. The following three lines show the values of the rows in our table after the birthdate
column has been removed.
Conclusion
Maxwell's Daemon is an easy-to-use adapter that connects to a MySQL replication and sends the data change events to a large number of other systems, such as: Amazon Kinesis, RabitMQ, Kafka, Google Pubsub, Amazon SQS, Redis, and more. In our simple example, we used the stdout
provider, but it's easy to configure a different producer. Because Dolt supports MySQL binary log replication, it's easy to use Maxwell's Daemon with a Dolt SQL server to add change data capture features to your system.
If you're curious about using a version-controlled relational database, or if you want to talk more about change data capture use cases, come by our Discord and say hello.