Using Dolt with the JetBrains DataGrip SQL Workbench

INTEGRATION
6 min read

Dolt has been rapidly expanding its capabilities as a SQL server recently. We've done a lot of work to get the dolt sql-server command to be a stable peer to the built-in SQL shell, with all the same capabilities. In the last month we've expanded the SQL server to support writes to the working set, as well as support for the @autocommit session variable to control transaction semantics in the server. And just earlier this week we demonstrated how to join tables in multiple repositories with the use of the newly introduced --multi-db-dir parameter to dolt sql and dolt sql-server. While it's still more common for our customers to interact with their databases via the built-in SQL shell or with the various import commands, we keep hearing from customers who want to use the sql-server command to connect their existing applications, and this is a very compelling feature of the tool for those who need it.

Today we're going to demonstrate using JetBrains' excellent DataGrip Database IDE to work with Dolt, and discuss what it took to achieve this. Here it is in action.

Dolt tables in DataGrip

DataGrip at a glance

DataGrip has a 30-day free trial, so you can download it at no cost to follow along with this post. Like most JetBrains products, all of DataGrip's features are available in other JetBrains IDEs as plugins. In our case, we're using GoLand with the bundled "Database Tools and SQL" plugin, which has all of the features of DataGrip.

GoLand plugins settings

On a side note, I use GoLand to develop Dolt and go-mysql-server, and I can't recommend it highly enough. It's well worth the license fee.

To get started connecting GoLand to Dolt, we need to start up the SQL server. I have a lot of dolt repositories in a common directory, so I'm going to start the server from that directory with the --multi-db-dir parameter to include all of them as databases:

% dolt sql-server --multi-db-dir ./ -l trace

I'm using the TRACE log level, which will cause all queries and results to be logged to the console. Leave it off if you don't want to see these.

Next, I can add a new database in the Database view of GoLand. As far as GoLand is concerned, we're connecting to a standard MySQL server, so I configure it exactly the same way.

New database dialog

That's it! Now I should be able to see all of my Dolt repositories as databases in the GoLand Database view:

GoLand database view

Double-clicking on a table in this view will load rows for that table in an editor. From there, I can make changes to the table just like editing a spreadsheet. Here I've edited the cases table of the coronavirus repository to change the value of the confirmed_count column for two rows:

Edited rows in table

If I commit these changes back to the database with Ctrl-Enter, I can immediately see them reflected on the dolt command line:

Diff in working set

Of course, this is Dolt, and Dolt is Git for Data. That means that any changes I make are only in my working set, and aren't committed to the repository until I add and commit the tables I've changed. And if I make a mistake and ruin some of my data, rolling back that change is as simple as checking out an earlier commit with dolt checkout. This ability to undo changes I've made, even after committing them to the database, gives me a ton of freedom to experiment and move quickly. I can always get back to a known good state with a couple commands, no matter how badly I screw up.

DataGrip has lots of great features for interacting with your database, and we've only scratched the surface here. But just being able to edit your tables with a fantastic graphical editor is very convenient.

Getting this to work: reverse-engineering the IDE

The journey to support DataGrip began when one our intrepid early adopters mentioned it wasn't working for him. At the time, we didn't have any plans to support SQL workbenches, but since a customer was asking for it we figured there must be some value there, and dug in. There were a lot of problems that prevented DataGrip from using Dolt.

SQL workbenches, like DataGrip or MySQL Workbench, really exercise your compliance with not only the SQL standard, but all the functions and system tables of the database you're emulating. In our case, we're declaring to DataGrip that we're a MySQL server, so DataGrip expects everything to work the same. In particular, it really wants the tables in the information_schema database to exist and have meaningful contents.

To figure out what DataGrip didn't like about Dolt, I enabled trace logging and connected the IDE to my server. This lets me see all the queries that the IDE is executing to see which ones it was choking on. Here is a sample of them:

 /* mysql-connector-java-8.0.15 (Revision: 79a4336f
 140499bd22dd07f02b708e163844e3d5) */SELECT  @@session.auto_increment_increment AS auto_increment_increment, @@chara
 cter_set_client AS character_set_client, @@character_set_connection AS character_set_connection, @@character_set_re
 sults AS character_set_results, @@character_set_server AS character_set_server, @@collation_server AS collation_ser
 ver, @@collation_connection AS collation_connection, @@init_connect AS init_connect, @@interactive_timeout AS inter
 active_timeout, @@license AS license, @@lower_case_table_names AS lower_case_table_names, @@max_allowed_packet AS m
 ax_allowed_packet, @@net_write_timeout AS net_write_timeout, @@query_cache_size AS query_cache_size, @@query_cache_
 type AS query_cache_type, @@sql_mode AS sql_mode, @@system_time_zone AS system_time_zone, @@time_zone AS time_zone,
  @@tx_isolation AS transaction_isolation, @@wait_timeout AS wait_timeout;
SET NAMES utf8mb4;
SET character_set_results = utf8;
SET autocommit=1;
SET sql_mode='STRICT_TRANS_TABLES';
select version(), @@version_comment, database();
/* ApplicationName=GoLand 2020.1.2 */ set session transaction read write;

So right off the bat, DataGrip begins by asking about and then setting a lot of session and system variables. This is so that it knows how to interpret the results being returned from queries, as well as what format the database will expect other queries in.

At this point in the query log, DataGrip had already run into several fatal errors talking to Dolt. Dolt didn't support the timezone variable correctly (it returned an unknown one), nor did it understand how to communicate information on its character sets or collations. It couldn't understand that the bare word utf-8 was a string (legal syntax for setting variables in the MySQL dialect). And the problems kept coming. This query alone gave Dolt like 9 different headaches:

select database(), schema(), left(user(),instr(concat(user(),'@'),'@')-1), @@version_comment;

The long nested function call is a clever way of getting the user name before first '@'. Unfortunately, Dolt didn't support any of the functions being used. So I had to implement them, not too hard. Then fire up the server again and find the next problem query. Pretty soon I ran into a thornier issue: Dolt's implementation of the information_schema database was very incomplete.

select table_name from information_schema.tables
    where table_schema collate utf8_general_ci = 'information_schema'
    and table_name collate utf8_general_ci = 'parameters';
select collation_name, character_set_name,
    is_default collate utf8_general_ci = 'Yes' as is_default\n
    from information_schema.collations;

This went on for many, many commits until I finally had something mostly working. It was a fascinating but maddening experience in reverse engineering.

Datagrip PR

We can now, tentatively, say that Dolt supports DataGrip, or at least the parts that we've tried ourselves. We are 100% certain that there are pieces missing or broken. And of course, because JetBrains are constantly updating and improving their IDEs, this is a moving target. As I sat down today to write this blog post, I realized that an update had broken the server, due to the new use of an information_schema table that wasn't implemented. So I had to fix it before I could begin writing. (It's worth noting that if you're reading this on the publication data, the latest DataGrip will have issues connecting to Dolt, unless you're building from source. That will be fixed in the next release, on Monday.)

Future work

In the long term, we are committed to making Dolt indistinguishable from MySQL for most applications, including any workbench you choose to connect to it. But in the short term, as we hope this blog makes clear, there are lots of gaps to fill. These gaps will vary between different workbenches, and we'll prioritize support for the ones our customers tell us they need. We're starting with DataGrip (beta) and MySQL Workbench (work in progress).

Do you have a favorite SQL workbench you need supported? File an issue and tell us about your use case! Dolt is and will always be open-source, and we welcome any issues and PRs to improve it.

Conclusion

We're very excited to offer support for the excellent DataGrip SQL IDE, and hope it brings a lot of value to our customers. We'll keep supporting DataGrip and an increasing number of SQL workbenches going forward. If yours isn't supported, tell us about it!

Download Dolt today to try it out yourself!

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.