Hitting 99% Correctness for our SQL Database

SQL
1 min read

Dolt is a SQL database with Git-style versioning. One of our biggest priorities is ensuring that Dolt is a drop in replacement for any MySQL database. That means any query that can be run on a MySQL database must run correctly on a Dolt as well.

To help measure this product objective, we adopted the sqllogictest package about a year and a half ago. It has about 6 million queries (along with the expected results) that were originally written to ensure the correctness of SQLite. 2 months later we hit 89% correctness.

Today we are proud to announce that we have hit 99% correctness on the sqllogictest package. Let's highlight a few SQL features that helped get our score up! Each of these features increased our score by about 2% (~100,000 test cases).

SQL Views

In February 2020, Aaron implemented SQL Views. SQL views are used to save commonly repeated queries. Dolt even versions them for you!

For example, creating a view looks like this:

CREATE VIEW getnames
AS
SELECT first_name, last_name FROM contacts;

To use it simply do

SELECT * from getnames;

Secondary Indexes

Back in May 2020, Daylon implemented Secondary indexes. Indexes are essentially maps that allow for faster retrievals of certain keys.

Here's an example of a UNIQUE secondary index used so no duplicate emails exists.

CREATE TABLE `contacts` (
    id int PRIMARY KEY,
    first_name varchar(20),
    last_name varachar(20),
    email varchar(20) UNIQUE,
)

Window Functions

In February 2021, Zach implemented window functions. Window functions allow for more expressive computations on top of aggregations. The most popular one is ROW_NUMBER()

SELECT
  id,
  ROW_NUMBER() OVER w AS 'row_number',
FROM contact
WINDOW w AS (ORDER BY id);

Distinct Expressions for functions

Last month, I implemented DISTINCT uniqueness for functions. This syntax is useful for computations on de-duplicated data. For example, if we wanted to count all of the different first names we have, we can do

SELECT COUNT(DISTINCT *) FROM contacts

Conclusion

Our pursuit of 99% correctness has further validated that Dolt is a database. We will continue to ship more MySQL compliant features to hit 100% correctness along with much better performance. If you are curious about Dolt please join our discord here and learn more!

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.