MySQL Information Schema Compatibility

7 min read

Dolt is a version controlled MySQL-compatible database, written from the ground up in Go. We're always working to close the remaining gap on full compatibility with MySQL. In this blog post, we're talking about the work we've done to make Dolt compatible with MySQL's information_schema tables.

The first time we worked on Information Schema support was almost 3 years ago when we added support for DataGrip. SQL editors like DataGrip run many queries on Information Schema to get necessary metadata to display in their UIs. The second major pass we took on Information Schema was for our First Hour Experience project. From that project, we learned that Information Schema plays an important role for connecting SQL editors and ORMs to a Dolt SQL server. Each editor and many ORMs require different information to be present and run queries on different tables.

Today, we want to share what we did for our third iteration on Information Schema support. For the first two passes, we focused on making SQL editors connect to Dolt SQL servers without any issues. This time, we wanted to go broad and add support for all the tables that we have information on. These tables are not necessarily used widely by SQL editors, but they are useful for various SQL tools and to enable customers to better explore the metadata of their databases. The goal was to improve currently supported tables that were missing some information and add all available information that Dolt has now to tables that were not supported before. This allowed us to double our coverage for Dolt's Information Schema support, roughly going from 30% up to 60%.

Information Schema Project

To tackle this project, we started by auditing the Information Schema tables to double-check each table's schema against the latest schema in MySQL. We grouped all the tables into a few different categories to prioritize some tables over others, since not all tables are equally important and some aren't even applicable to Dolt databases. For example, we chose not to support any of the InnoDB_ tables from MySQL server, since Dolt uses a different storage engine.

Our highest priority was filling in missing columns for tables that were already supported in Dolt, because we know those tables are commonly used by SQL editors and other SQL tools. These tables were implemented in our previous Information Schema projects, but some were missing columns or had data that didn't exactly match MySQL. After that, we focused on tables that expose data from features Dolt recently added support for but hadn't been added to Information Schema yet. For example, Dolt added support for users and privileges last year, and we needed to get that information populated in the corresponding Information Schema tables.

Next, there are tables for features that are not supported in Dolt. These tables need to be present, even if they're just empty, so that tools can execute queries against them. For example, Dolt doesn't support PARTITIONS or PROFILING data, but we needed the tables to exist because tools may assume they are present.

Last, but not least, as part of this work we also updated and expanded the Information Schema Support table in Dolt's documentation. You can find information on which tables are missing columns or not supported yet in the notes and limitations column.

Fun experiment using Dolt!

After auditing all the tables and adding more support, we did a fun experiment using Dolt's diff features to get the diff summary between MySQL and Dolt on Information Schema data based on MySQL's Sakila Sample Database. This allows us to measure our progress on making Dolt's Information Schema tables compatible with MySQL's support.

The database created from collecting the Information Schema data from MySQL and Dolt is available on DoltHub. Here are the results of a diff on tables that have some information before any table specific improvements:

> select table_name, rows_added, rows_deleted, rows_modified, cells_added, cells_deleted, cells_modified from dolt_diff_summary('mysql1','dolt0');
+---------------------------------------+------------+--------------+---------------+-------------+---------------+----------------+
| table_name                            | rows_added | rows_deleted | rows_modified | cells_added | cells_deleted | cells_modified |
+---------------------------------------+------------+--------------+---------------+-------------+---------------+----------------+
| applicable_roles                      | 0          | 1            | 0             | 0           | 9             | 0              |
| character_sets                        | 0          | 40           | 0             | 0           | 160           | 0              |
| check_constraints                     | 0          | 0            | 1             | 0           | 0             | 1              |
| collation_character_set_applicability | 0          | 14           | 0             | 0           | 28            | 0              |
| collations                            | 0          | 14           | 0             | 0           | 98            | 0              |
| column_privileges                     | 0          | 1            | 0             | 0           | 7             | 0              |
| column_statistics                     | 0          | 3            | 0             | 0           | 12            | 0              |
| columns                               | 7          | 42           | 94            | 154         | 924           | 249            |
| columns_extensions                    | 0          | 136          | 0             | 0           | 816           | 0              |
| engines                               | 0          | 8            | 0             | 0           | 48            | 0              |
| keywords                              | 0          | 747          | 0             | 0           | 1494          | 0              |
| parameters                            | 0          | 3            | 0             | 0           | 48            | 0              |
| partitions                            | 0          | 24           | 0             | 0           | 600           | 0              |
| plugins                               | 0          | 45           | 0             | 0           | 495           | 0              |
| processlist                           | 1          | 2            | 0             | 8           | 16            | 0              |
| referential_constraints               | 0          | 22           | 0             | 0           | 242           | 0              |
| routines                              | 0          | 0            | 1             | 0           | 0             | 8              |
| schema_privileges                     | 0          | 1            | 0             | 0           | 5             | 0              |
| schemata                              | 0          | 1            | 0             | 0           | 6             | 0              |
| schemata_extensions                   | 0          | 1            | 0             | 0           | 3             | 0              |
| st_geometry_columns                   | 0          | 1            | 0             | 0           | 7             | 0              |
| st_spatial_reference_systems          | 0          | 5152         | 0             | 0           | 30912         | 0              |
| st_units_of_measure                   | 0          | 47           | 0             | 0           | 188           | 0              |
| statistics                            | 0          | 0            | 21            | 0           | 0             | 21             |
| table_constraints_extensions          | 0          | 41           | 0             | 0           | 246           | 0              |
| table_privileges                      | 0          | 1            | 0             | 0           | 6             | 0              |
| tables                                | 0          | 24           | 0             | 0           | 504           | 0              |
| tables_extensions                     | 0          | 24           | 0             | 0           | 120           | 0              |
| tablespaces_extensions                | 0          | 92           | 0             | 0           | 184           | 0              |
| triggers                              | 0          | 0            | 6             | 0           | 0             | 32             |
| user_attributes                       | 0          | 5            | 0             | 0           | 15            | 0              |
| user_privileges                       | 0          | 149          | 0             | 0           | 596           | 0              |
| view_table_usage                      | 0          | 37           | 0             | 0           | 222           | 0              |
| views                                 | 0          | 0            | 7             | 0           | 0             | 27             |
+---------------------------------------+------------+--------------+---------------+-------------+---------------+----------------+

Here is the result of diff on tables that have some information after this project.

> select table_name, rows_added, rows_deleted, rows_modified, cells_added, cells_deleted, cells_modified from dolt_diff_summary('mysql1','dolt1');
+------------------------------+------------+--------------+---------------+-------------+---------------+----------------+
| table_name                   | rows_added | rows_deleted | rows_modified | cells_added | cells_deleted | cells_modified |
+------------------------------+------------+--------------+---------------+-------------+---------------+----------------+
| applicable_roles             | 0          | 1            | 0             | 0           | 9             | 0              |
| character_sets               | 0          | 40           | 0             | 0           | 160           | 0              |
| check_constraints            | 0          | 0            | 1             | 0           | 0             | 1              |
| column_privileges            | 0          | 1            | 0             | 0           | 7             | 0              |
| column_statistics            | 66         | 1            | 2             | 264         | 4             | 2              |
| columns                      | 7          | 42           | 94            | 154         | 924           | 114            |
| columns_extensions           | 0          | 42           | 0             | 0           | 252           | 0              |
| engines                      | 0          | 8            | 0             | 0           | 48            | 0              |
| partitions                   | 0          | 24           | 0             | 0           | 600           | 0              |
| plugins                      | 0          | 45           | 0             | 0           | 495           | 0              |
| processlist                  | 1          | 2            | 0             | 8           | 16            | 0              |
| routines                     | 0          | 0            | 1             | 0           | 0             | 7              |
| schema_privileges            | 1          | 1            | 0             | 5           | 5             | 0              |
| st_spatial_reference_systems | 0          | 5150         | 0             | 0           | 30900         | 0              |
| statistics                   | 0          | 0            | 45            | 0           | 0             | 45             |
| table_constraints_extensions | 21         | 20           | 0             | 126         | 120           | 0              |
| table_privileges             | 1          | 1            | 0             | 6           | 6             | 0              |
| tables                       | 0          | 0            | 24            | 0           | 0             | 91             |
| tables_extensions            | 0          | 7            | 0             | 0           | 35            | 0              |
| tablespaces_extensions       | 0          | 92           | 0             | 0           | 184           | 0              |
| triggers                     | 0          | 0            | 6             | 0           | 0             | 32             |
| user_attributes              | 1          | 4            | 0             | 3           | 12            | 0              |
| user_privileges              | 30         | 119          | 0             | 120         | 476           | 0              |
| view_table_usage             | 0          | 37           | 0             | 0           | 222           | 0              |
| views                        | 0          | 0            | 7             | 0           | 0             | 14             |
+------------------------------+------------+--------------+---------------+-------------+---------------+----------------+

It looks like there is still a big gap between Dolt and MySQL from this diff result, but a lot of these are either tables that are not supported or differences on create and update time and sql mode which could not have the same values. The rest of the tables are simply missing few columns which are noted on our documentation page.

A very simple analysis on these diff results would be on the number of tables that we do not have any diff on. This means there were 14 tables from before and 23 tables after improvements that did not have any diff. This shows us that our support increased from roughly 30% to 50%, which might not be the best representation of the progress.

Another way we can get progress analysis is to use the cells changed information from dolt_diff_summary() table function. There are total of 44,436 cells in the MySQL Information Schema data on the Sakila Database. However, from those we are excluding a few tables that represent features Dolt doesn't support yet. These include unsupported spatial reference systems, partitions, plugins, and tablespaces extension information. This makes the total number of cells across all tables 12,257. The total number of cells that are different than MySQL across all tables is 6,332 and 3,288 for Dolt before and after the updates, respectively. This tells us that the overall compatibility of Dolt and MySQL's Information Schema data (for the DB features Dolt supports) went from 48.4% up to 73.2%.

Conclusion

Here is the list of Information Schema tables and their support level from before and after these latest Information Schema improvements. As you can see, we've made a lot of progress on these tables, but there's still more to fill in.

❌ - not supported; empty

🟠 - partially supported

✅ - fully supported

We can get some progress analysis on this table as well. If we count unsupported as 0 point, partially supported as 1 point and fully supported as 2 points, and there are 48 tables. We increased our Information Schema support from 30% to 60% so far. This reflects pretty close to the analysis we did on cell changes above.

+---------------------------------------+--------+-------+
| Table                                 | Before | After |
+---------------------------------------+--------+-------+
| ADMINISTRABLE_ROLE_AUTHORIZATIONS     | ❌     | ❌    |
| APPLICABLE_ROLES                      | ❌     | ❌    |
| CHARACTER_SETS                        | ✅     | ✅    |
| CHECK_CONSTRAINTS                     | ✅     | ✅    |
| COLLATION_CHARACTER_SET_APPLICABILITY | ✅     | ✅    |
| COLLATIONS                            | ✅     | ✅    |
| COLUMN_PRIVILEGES                     | ❌     | 🟠️    |
| COLUMN_STATISTICS                     | ❌     | 🟠️    |
| COLUMNS                               | 🟠️     | 🟠️    |
| COLUMNS_EXTENSIONS                    | 🟠     | 🟠    |
| ENABLED_ROLES                         | ❌     | ❌    |
| ENGINES                               | ✅     | ✅    |
| EVENTS                                | ❌     | ❌    |
| FILES                                 | ❌     | ❌    |
| KEY_COLUMN_USAGE                      | ✅     | ✅    |
| KEYWORDS                              | ❌     | ✅    |
| OPTIMIZER_TRACE                       | ❌     | ❌    |
| PARAMETERS                            | ❌     | ✅    |
| PARTITIONS                            | ❌     | ❌    |
| PLUGINS                               | 🟠️     | 🟠️    |
| PROCESSLIST                           | ❌     | ✅    |
| PROFILING                             | ❌     | ❌    |
| REFERENTIAL_CONSTRAINTS               | ❌     | ✅    |
| RESOURCE_GROUPS                       | ❌     | ❌    |
| ROLE_COLUMN_GRANTS                    | ❌     | 🟠    |
| ROLE_ROUTINE_GRANTS                   | ❌     | ❌    |
| ROLE_TABLE_GRANTS                     | ❌     | ❌    |
| ROUTINES                              | 🟠️     | 🟠️    |
| SCHEMA_PRIVILEGES                     | ❌     | ✅    |
| SCHEMATA                              | ✅     | ✅    |
| SCHEMATA_EXTENSIONS                   | ❌     | ✅    |
| ST_GEOMETRY_COLUMNS                   | ❌     | ✅    |
| ST_SPATIAL_REFERENCE_SYSTEMS          | ❌     | ✅    |
| ST_UNITS_OF_MEASURE                   | ❌     | ✅    |
| STATISTICS                            | ✅     | 🟠    |
| TABLE_CONSTRAINTS                     | ✅     | ✅    |
| TABLE_CONSTRAINTS_EXTENSIONS          | ❌     | ✅    |
| TABLE_PRIVILEGES                      | ❌     | ✅    |
| TABLES                                | 🟠️     | 🟠️    |
| TABLES_EXTENSIONS                     | ❌     | ✅    |
| TABLESPACES                           | ✅     | ✅    |
| TABLESPACES_EXTENSIONS                | ❌     | ❌    |
| TRIGGERS                              | ✅     | ✅    |
| USER_ATTRIBUTES                       | ❌     | 🟠    |
| USER_PRIVILEGES                       | ❌     | ✅    |
| VIEW_ROUTINE_USAGE                    | ❌     | ❌    |
| VIEW_TABLE_USAGE                      | ❌     | ❌    |
| VIEWS                                 | ✅     | 🟠    |
+---------------------------------------+--------+-------+

If you need any specific tables or columns supported for your use case, we are always happy to get it sorted out and implement them for you. Any suggestions or bug reports are always welcome, too. We use GitHub Issues for user bug reports and track them as they get resolved. If you are interested in learning more about Dolt, we are always available to answer any questions on our Discord channel.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.