Want to know a cool secret about database engines? They’re literally the same thing as compilers.
At my previous job, I worked on Google’s internal Java compiler. Now, I’m one of the developers of Dolt, the first version-controlled SQL database, as well as go-mysql-server, the SQL engine that Dolt depends on.
It turns out that the skills and techniques that I first learned while writing a compiler are the exact same techniques that Dolt uses in its database engine. And that’s because when you break it down, database engines are just a domain-specific compiler:
- General-purpose compilers turn human-readable source code into machine-readable programs that manipulate variables and produce side-effects.
- SQL engines turn human-readable SQL queries into machine-readable execution plans that manipulate table columns and produce a stream of rows.
Most compiler concepts map directly onto database engines. Some database engines like SQLite even work by producing bytecode that executes in a special-purpose VM.. Dolt doesn’t do that, but it does create an abstract syntax tree almost exactly like one you would see in any other interpreted language.
The main difference is that in most languages, evaluating a syntax tree produces a return value and side effects. In Dolt, evaluating a syntax tree produces an iterator. This also means that each intermediate node in the tree is an iterator defined in terms of one or more child iterators. This approach is often called the volcano model or iterator model of database engine design.
This means that database engines can have the same types of bugs as traditional languages and a lot of the same thorns. We recently fixed a correctness issue in Dolt that does a good job demonstrating this. You can find the writeup and the fix here.
To trigger the incorrect behavior, you needed a query that looked like this.
CREATE TABLE ab (a INT PRIMARY KEY, b INT);
CREATE TABLE three_pk (pk1 TINYINT, pk2 TINYINT, pk3 TINYINT, col TINYINT, PRIMARY KEY (pk1, pk2, pk3));
SELECT * FROM ab AS ab1 WHERE EXISTS (
SELECT * FROM ab AS ab2 WHERE EXISTS (
SELECT * FROM ab AS ab3 WHERE EXISTS (
SELECT * FROM three_pk WHERE pk1 = ab1.a and pk2 = ab2.a and pk3 = ab3.a
)
)
);
Running this query on older versions of Dolt would trigger a panic. To understand why, let’s talk about scopes.
Scopes#
Scopes are something that every programmer has to deal with even if they don’t realize it. It’s how programs resolve symbols to the things that are actually being referenced.
SQL queries reference tables and columns by name, and Dolt needs to map those names onto the tables and columns they represent. This is not as simple as it looks, because the same name can refer to different tables based on where it appears in the statement. The following is a valid SQL query:
SELECT * FROM test_table WHERE EXISTS (SELECT * FROM test_table where test_table.id = 1) and test_table.id = 2;
In this query, there are two tables named test_table and two filter conditions that name test_table. Which condition refers to which child iterator? If Dolt resolves these names incorrectly, it will produce incorrect output.
Two names can also refer to the same table, again depending on where the names appear. Consider this simple query with a two-part primary key:
CREATE TABLE test_table(pk1 INT, pk2 INT, PRIMARY KEY (pk1, pk2));
SELECT * FROM (SELECT * FROM test_table WHERE test_table.pk2 = 1) AS table_alias where table_alias.pk1 = 2;
This query can be optimized into a simple table lookup, but only if Dolt can detect that the two WHERE clauses refer to the same table, even though the two clauses use different table names.
Thus, it does not suffice for Dolt to simply keep a global dictionary that maps names onto tables, because the rules for resolving references are not global. Different parts of the query introduce their own namespace, which changes how names are resolved. These namespaces are scopes.
So how does a database engine keep these scopes straight? How are table references actually represented in the abstract syntax tree?
Scopes at Analysis Time#
The most naive approach is to simply store the same names in the AST as they appear in the query. Then, whenever the engine wants to analyze, optimize, or execute part of the tree, it resolves the name using scope rules. This works, but it’s incredibly brittle:
- Any optimization that transforms the AST needs to be very careful. If a node contains a reference, moving that node to another part of the tree could change the meaning of that reference, and change the behavior of the query. Whenever the engine transforms the tree, it would need to update any references.
- In fringe cases, a transformation may result in an impossible tree, where we need to reference a table but it’s impossible to do so because that table’s name is being shadowed by another.
- Needing to resolve references repeatedly is slow and wasteful.
This is actually how Dolt used to operate years ago, and it was the source of subtle bugs. So we switched to a better approach: when analyzing a query, we assign incrementing globally unique IDs to tables and columns. Every reference is resolved once, and then the name gets replaced with the ID. Since each ID always refers to the same table or column, we can safely modify the AST without any risk of changing the query’s meaning.
But this is only half of the situation of scopes.
Scopes at Runtime#
It’s not enough to just be able to resolve references to the table or column they represent, we also need to track those values while the query is running. Operations that produce column values need to be able to send those values to the operations that read them.
In general-purpose languages, this might be accomplished with registers, or by writing to values in memory. But SQL queries are declarative and functional and don’t have state: when evaluating a node in the syntax tree, the iterator it produces is often a pure function of:
- Columns inherited from parent nodes
- Example: In
SELECT id FROM a WHERE EXISTS (SELECT id FROM b WHERE a.id = b.id), the inner query references a column from the outer query.
- Example: In
- Columns returned from child nodes
- Example: In
SELECT id FROM (SELECT id FROM a) AS a_alias where a_alias.id = 1, the outer query references a column from the inner query.
- Example: In
- Columns defined in sibling nodes
- Example: In
SELECT id FROM a JOIN LATERAL (SELECT id FROM b WHERE a.id = b.id), the right side of the join references a column from the left side of the join.
- Example: In
Each of these columns can have multiple values over the lifetime of the query, but only one at a time. A SQL engine needs a strategy to represent this internally.
A naive approach might be to maintain a mapping from each column name to its current value. Except as we already saw, names don’t map one-to-one onto columns. In fact, it’s perfectly valid MySQL to for a table alias to have multiple columns with the same name. For example, the below query produces a table alias with two columns both named pk.
CREATE TABLE test_table(pk INT PRIMARY KEY);
SELECT * FROM (SELECT * FROM test_table JOIN test_table) AS table_alias;
Even more problematic is that a table alias column might have no name:
SELECT * FROM (SELECT 1+1) AS table_alias;
In either case, it’s not possible to reference these columns in filters, but they can still impact the results of the query if the alias is used in a SELECT *.
So if we can’t track values by their column name, another approach might be to use the unique column IDs we discussed in the previous section. But there could be many such IDs, and each node in the AST only cares about a small number of them. Managing lots of small maps is also not very performant, and we care about performance.
Fortunately, there’s an approach to that gives us both clarity and performance. Every scope in a SQL query always has the same number of columns. So the number of columns referenceable from any node in the AST is a constant value that can be determined by statically analyzing the query. The number of columns in that node’s iterator is also a constant value.
Examples:
- A table reference with N columns produces an iterator that returns a list of five values.
- A
SELECT col1, col2, ... colN WHERE EXISTS (...)construct creates N referenceable columns for every node within the subquery.
Each node can have an array where we store the value of each of these columns. Each element in the array corresponds to a different column. Before we evaluate the query, we can analyze the AST to determine which column each array element represents. Now if we want to represent an expression that reads a column, we don’t need to store the name of the column in the AST, and we don’t need to store that column’s globally unique ID either: all we need to store is the offset within that node’s own array corresponding to that column.
This is the approach that Dolt uses: it analyzes the tree, determines the exact set of columns visible to each node, and replaces column references with the correct offset into that node’s array that will contain that column at runtime.
We can illustrate this process with some diagrams. In each case, we show the nodes in the AST, and each node has both an ordered sequence of columns it produces (the output schema), and an ordered lists of columns it can reference (the input schema). The color of each cell indicates the node that originally produced the value. Note how nodes can contain column references from children, siblings, or parents, but in each case the number of columns can be statically determined.
A node referencing its child:
CREATE TABLE test_table(a int, b int, c int);
SELECT c+1 AS d FROM test_table;
Would result in an AST that looks like this:
A node referencing its parent:
CREATE TABLE test_table(a int, b int, c int);
SELECT a FROM test_table AS t1 WHERE EXISTS (SELECT b FROM test_table AS t2 WHERE t1.a = t2.a)
Would result in an AST that looks like this:
In order for this optimization to work correctly, every node must agree on how many columns it receives from each of their children, and how many columns are visible from parent and sibling scopes. If these numbers don’t agree, it could lead to situations where Dolt accesses the wrong value at runtime, or accesses the array out-of-bounds and panics.
So with all this in mind, let’s look back at the query that triggered the bug:
CREATE TABLE ab (a INT PRIMARY KEY, b INT);
CREATE TABLE three_pk (pk1 TINYINT, pk2 TINYINT, pk3 TINYINT, col TINYINT, PRIMARY KEY (pk1, pk2, pk3));
SELECT * FROM ab AS ab1 WHERE EXISTS (
SELECT * FROM ab AS ab2 WHERE EXISTS (
SELECT * FROM ab AS ab3 WHERE EXISTS (
SELECT * FROM three_pk WHERE pk1 = ab1.a and pk2 = ab2.a and pk3 = ab3.a
)
)
);
The simplest explanation for the root cause was this:
- When Dolt optimized this query, it would transform it into multiple nested join nodes.
- Each of these join nodes introduced a new table alias that could be referenced by its children.
- If a join condition referenced a column, then Dolt would need to compute the offset of that column in the join node’s array of column references. This required knowing how many of the columns in the array came from parent scopes, how many came from sibling scopes, and how many were from that node’s children.
- The logic for computing how many columns in the input schema came from outer scopes did not consider the fact scopes could be introduced in the middle of a nested join. This made it impossible to correctly calculate this for every node in the AST.
There were some attempts made to account for this issue, but in adjusting the calculations for common queries, it broke the calculations for less common queries. Ultimately, these adjustments made the logic harder to reason about. In the end, we fixed the issue by completely rewriting how we generated iterators during join in order to make it simpler to analyze them.
The full scope of the issue and the fix are more complicated, but this was the general idea.
I hope this illuminates what actually goes on inside a database engine.
As always, if you have any thoughts about database design, or if you’re curious how a version-controlled database can benefit you, then you should hop into our Discord and we’d love to discuss it with you.