We're writing Dolt, the world's first version controlled SQL database. Dolt
advertises itself as a drop-in replacement for MySQL, so if your current application works on MySQL
you can swap in Dolt on the backend with no code changes. It should just work, out of the box.
And we take that promise of compatibility very seriously! So seriously that we're willing to
reproduce some very surprising MySQL behavior that some people might call bugs, but we call
Today we'll be discussing a bug we discovered in MySQL that MySQL disputes is actually a bug, but we
decided to fix in Dolt anyway. This decision to deviate from MySQL's behavior was an exception to
our general rule, but we think it was warranted. Keep reading to see if you agree.
But first, to give you a taste of how tricky it can be to decide if strange MySQL behavior should be
faithfully replicated for compatibility, let's take a look at some of the tougher decisions we've
had to make over the last few years.
Andy wrote about MySQL's especially limber type
system, where almost any type is silently
coerced into almost any other type, sometimes with very surprising results. Take a look and marvel.
CREATE TABLE types_table (
pk int primary key, s varchar(20), b bool
INSERT INTO types_table VALUES
(0, "abc", true),
(1, "xyz", false);
SELECT * FROM types_table WHERE b = "abc"
| pk | s | b |
| 1 | xyz | 0 |
1 row in set, 1 warning (0.00 sec)
Here MySQL sees that you have requested to compare a
tinyint(1) with a string literal, and
helpfully casts the string to the integer value
0, giving the above result.
Does this do what you meant to do? Probably not, but whether you meant it or not, Dolt behaves the
same way. Drop-in replacement.
Earlier this year Daylon wrapped up a large chunk of work
related to regular expression matching. He did a heroic job getting ICU, the C regular expression
library MySQL uses, compiled down to WASM so
we could build it into our Go binary. This gave us bug-for-bug MySQL compatibility with regular
expressions, but during testing he discovered something strange:
CREATE TABLE test (v1 VARCHAR(255) COLLATE utf8mb4_0900_as_ci);
INSERT INTO test VALUES ('、');
SELECT * FROM test WHERE v1 = '﹑';
SELECT * FROM test WHERE v1 LIKE '﹑';
SELECT * FROM test WHERE v1 REGEXP '﹑';
The above characters, which look very similar, are the same UTF letter in different cases. Equality
LIKE comparisons correctly fold the case because the column has a case-insensitive
REGEXP doesn't, because MySQL regexes don't support collations.
Is this a bug, or a feature? Well, if you're writing a new application you probably consider it a
bug. But if you're porting an existing one and expect a drop-in replacement, it might be a
feature. So for now, this behavior stays.
Result set naming
Very early in our development process we encountered a customer trying to integrate their
application with Dolt, and it was not behaving as expected. Specifically, the result set column he
was looking for didn't appear to be in the result set. When I wrote about this two years ago I
called the process of fixing this customer bug "Copying all of MySQL's dumbest
decisions", and while I
basically still stand by that title there are honestly way dumber decisions we've copied for the
sake of compatibility.
Specifically, MySQL has very particular behavior about the schema of result sets, exactly echoing
the input string. And some customers hard-code the names of these expected result columns in their
mysql> select CURRENT_user( );
| CURRENT_user( ) |
| zachmu@% |
1 row in set (0.00 sec)
mysql> select 1 + 1 + 2;
| 1 + 1 + 2 |
| 4 |
1 row in set (0.01 sec)
Getting this to work correctly took a bunch of work and there are still edge cases where we get it
wrong. But on the positive side, this work inadvertently exposed a bunch of errors in our handling
of aliases, so it was worth it for that reason alone. And more importantly: drop-in replacement.
An actual bug in MySQL
The above examples are all ambiguous in terms of whether you consider them to actually be bugs. If
you expect strange and borderline indefensible behavior as a long-time user of a tool, is it still a
We grapple with this issue on a fairly regular basis, and there is rarely a straightforward
answer. Our default is always, "copy MySQL, drop-in replacement", which ends most arguments.
But today, the unexpected happened. I found a bug in MySQL.
I was grinding away to implement virtual columns, a long-requested feature on our
roadmap. The feature has a lot in common with column
defaults, so I was busy adding a lot of tests for those, when I discovered a sequence of statements
that didn't work. It looked like this:
CREATE TABLE t1 (a int default 1, b int default (a+1));
INSERT INTO t1 (b,a) values (DEFAULT, 3);
On my Dolt development branch, this resulted in a
NULL value being inserted for the
column. This was obviously incorrect, but I paused and became curious what MySQL does in this
case. Is the statement even legal? It ought to be, but MySQL puts a lot of constraints on the
ordering of columns whose default values reference one another. Maybe it's illegal in an
But when I tried this in MySQL, I got back a completely indefensible result.
mysql> select * from t1;
| a | b |
| 3 | 2 |
1 row in set (0.00 sec)
2? It should be
4. Or produce an error, or something. But not
2. There's no
universe in which that result could be considered correct, it's just wrong. But how could it be that
this bug in a simple
INSERT statement exists in 30 year old software, that it went undetected for
MySQL has some famously elderly bugs, like this one that became eligible to vote earlier this
year. Surely somebody had reported this buggy behavior
before, it must be one of those. But nope. There are 45 active
DEFAULT bugs in the MySQL bug
and this wasn't one of them.
It is now. Friends, I created an Oracle account and gave
them my home address so I could file this bug. I told Larry Ellison his product has a defect and
told him where I live.
Then, awash in the glow of my good Samaritan vibes, I started scheming. How cool would it be if Dolt
was able to fix this bug the same day that I filed it against MySQL? Just yesterday we published a
blog post about how we now have 4 9's of correctness against
sqllogictest, the 5.7 million SQL
query test suite developed by SQLite3. 4 9's of correctness is pretty cool, but what's even cooler
is being more correct than MySQL, even in some tiny way. So that's what I did, I was already in
there anyway. Here's the PR that fixes it.
On my development branch of Dolt, this query now works correctly, where it does not in MySQL.
% dolt sql
defaults> CREATE TABLE t1 (a int default 1, b int default (a+1));
defaults> INSERT INTO t1 (b,a) values (DEFAULT, 3);
Query OK, 1 row affected (0.00 sec)
defaults> select * from t1;
| a | b |
| 3 | 4 |
1 row in set (0.00 sec)
But does any of this matter?
You might well ask: what sane person would use this weird syntax for an insert? This is how you're
supposed to write an insert if you want the
DEFAULT value for some column, and it works fine.
INSERT INTO t1 (a) values (3);
That's it, there's no need to use the
DEFAULT keyword in this statement at all. You just don't
provide a value for that column to give it the default one, that's why it's called the default.
The problem, as we've learned in excruciating detail these last 3 years chasing compatibility, is
that hardly anybody writes artisanal SQL queries by hand these days. They're all using libraries
that generate the queries for them, and most of the time they have no idea what the tool is doing in
the background. They only care when it breaks.
Friends, I've seen queries generated by these things that would make your hair turn shock
white. I've seen esoteric SQL syntax and use of the
information_schema tables that would put you
in bed for a week. I've seen queries they produce that have made a grown man weep for mercy.
This is what we're up against, and it's what makes compatibility so, so important. Although,
ironically enough, the fact that the bug I just fixed exists in MySQL insulates us somewhat in this
case -- they probably tested it against MySQL and discovered it didn't work. But then again, maybe
they didn't! SQLAlchemy in particular is developed at an absolutely furious, churning pace. It
wouldn't surprise me in the least to discover five years from now that SQLAlchemy has bug reports
asking them to not use this syntax. And the lead developer will ask, confused,
But it works in Dolt, why doesn't it work in MySQL?
Update: but wait, is this even a bug?
Publication day update: MySQL closed my bug as "Not a
Bug". Here's their rationale:
Hi Mr. Musgrave,
Thank you for your bug report.
However, this is not a bug.
If column B has a default that is an expression that depends on the value of column A, then column
A must precede the list of columns in the INSERT statement. According to SQL standard, columns are
evaluated in the order that they are given.
Not a bug.
Now, I'm a software engineer by trade. I have a fine, one might say aristocratic appreciation of
all the nuanced, many-splendored ways one can tell a customer to go kick rocks to keep your bug
queue manageable, and appealing to the authority of a standard is one of the absolute best ways to
do this. That said, I gave Larry Ellison my home address to submit this bug and I demand
satisfaction. I have some objections here.
First of all, I cannot download "the SQL standard" to examine the truth value of this claim, because
a) no such single document exists, and b) the dozens of publications from ISO spanning five decades
that comprise the actual standard are proprietary. I learned this when I tried to read the standard
to implement the precise syntax for snapshot queries (
FOR SYSTEM_TIME AS OF ...). I had to resort
to learning about the standard second-hand from academic publications discussing the changes to
it. Here's a partial listing of the documents comprising the standard:
Each of them runs into the thousands of pages, and you can buy them for the low, low price of 208
Swiss Francs apiece:
But second, and maybe more importantly, MySQL has a well-earned reputation for running roughshod
over the SQL standards whenever it suits them. They have done things to the standard that would make
a sailor blush. They're hardly the only database vendor to do this -- database behavior is
notoriously vendor-specific despite the existence of the ISO standards. But MySQL in particular was
a running joke for decades (toy database!) in large part because of their lax adherence to
I decided to see how other database vendors handled column defaults in this scenario. First, SQLite3:
So SQLite3 doesn't even allow column references in defaults. But that's understandable, they're not
really an OLTP production database, let's try Postgres.
OK that's a little surprising, I guess Postgres doesn't allow this either. Maybe Microsoft wanted to
differentiate themselves on this point, let's try SQL Server.
Nope, no support from Microsoft either. I'm starting to feel like I'm taking crazy pills. One more:
the granddaddy of enterprise SQL databases, Oracle.
OK, so let's review:
- Of the five most popular SQL database vendors, only MySQL supports using column references in
default values at all.
- MySQL's implementation silently produces nonsensical results when the column values are inserted
out of order.
- This behavior is actually part of the SQL standard.
So is this a bug, or is it, as MySQL claims, working as intended according to the SQL standard?
I'll let you decide, but we're considering it a bug and making it just work correctly in Dolt.
We hope this blog gives you a taste of the challenges we face trying to fulfill our promise of being
a drop-in replacement for MySQL, and join us in celebrating this one brief shining moment when we
were able, with a clear conscience, to become more correct than MySQL in one small, beautiful
way. The hunt continues.
Have questions about Dolt or SQL testing? Join us on Discord to
talk to our engineering team and meet other Dolt users.