Hitting 99% Correctness for our SQL Database

2 min read

Dolt is a SQL database with Git-style versioning. One of our biggest priorities is ensuring that Dolt is a drop in replacement for any MySQL database. That means any query that can be run on a MySQL database must run correctly on a Dolt as well.

To help measure this product objective, we adopted the sqllogictest package about a year and a half ago. It has about 6 million queries (along with the expected results) that were originally written to ensure the correctness of SQLite. 2 months later we hit 89% correctness.

Today we are proud to announce that we have hit 99% correctness on the sqllogictest package. Let's highlight a few SQL features that helped get our score up! Each of these features increased our score by about 2% (~100,000 test cases).

SQL Views

In February 2020, Aaron implemented SQL Views. SQL views are used to save commonly repeated queries. Dolt even versions them for you!

For example, creating a view looks like this:

CREATE VIEW getnames
AS
SELECT first_name, last_name FROM contacts;

To use it simply do

SELECT * from getnames;

Secondary Indexes

Back in May 2020, Daylon implemented Secondary indexes. Indexes are essentially maps that allow for faster retrievals of certain keys.

Here's an example of a UNIQUE secondary index used so no duplicate emails exists.

CREATE TABLE `contacts` (
    id int PRIMARY KEY,
    first_name varchar(20),
    last_name varachar(20),
    email varchar(20) UNIQUE,
)

Window Functions

In February 2021, Zach implemented window functions. Window functions allow for more expressive computations on top of aggregations. The most popular one is ROW_NUMBER()

SELECT
  id,
  ROW_NUMBER() OVER w AS 'row_number',
FROM contact
WINDOW w AS (ORDER BY id);

Distinct Expressions for functions

Last month, I implemented DISTINCT uniqueness for functions. This syntax is useful for computations on de-duplicated data. For example, if we wanted to count all of the different first names we have, we can do

SELECT COUNT(DISTINCT *) FROM contacts

Conclusion

Our pursuit of 99% correctness has further validated that Dolt is a database. We will continue to ship more MySQL compliant features to hit 100% correctness along with much better performance. If you are curious about Dolt please join our discord here and learn more!

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt