Recovering Dropped Databases with dolt_undrop()

FEATURE RELEASE
6 min read

DoltDB is the world's first fully-versioned relational database. It gives you the versioning semantics of Git with all the power and expressiveness of SQL.

One of Dolt's greatest strengths is keeping your data safe. You can trace how every bit of your data has changed, who changed it, and exactly when it changed. You can efficiently diff your relational data at any two points in time and see exactly what changed. Did a buggy application update screw up some data? Just check out the history and revert the changes, or update join from Dolt's history to restore the data. Did an intern accidentally truncate a table!? No biggie, that's easy to fix with Dolt. Since all the historical versions of your data are accessible, Dolt makes it really, really hard to lose your data.

However... there was one exception to this that always bothered us. Since the commit graph for all your history is contained inside the database, dropping a database was an unrecoverable action. We even called it out in our README:

the only unrecoverable SQL statement in Dolt is drop database

We're excited to announce that we've changed the way drop database works and it's now possible to easily restore a dropped database by calling the new dolt_undrop() stored procedure. This new procedure makes it really easy to undo a potentially disastrous and painful mistake. When you really do want to permanently delete your dropped databases, for example to reclaim the disk space, you can use the new dolt_purge_dropped_databases() stored procedure.

These two new stored procedures are really easy to use. Let's take a look at a quick example...

dolt_undrop()

For this demo, we're going to start with an existing database from DoltHub that we'll clone locally. We'll drop the database, then show how it can be recovered, using the dolt_undrop() stored procedure.

If you don't already have dolt on your system, go ahead and install Dolt using the method of your choice. One of the great things about Dolt is that it ships as a single binary, so it's got a very simple and clean installation story.

Let's start off by cloning the dolthub/employees database from DoltHub so that we've got a sample database to play with. This is a test database that has a few tables that describe employees, departments, salaries, and titles at a fictitious company.

mkdir undrop_test && cd undrop_test
dolt clone dolthub/employees
cd employees

In the employees directory, let's take a quick look around the filesystem and see what files Dolt is managing:

ls -laht
total 0
drwxr-xr-x  6 jason  wheel   192B Oct  9 11:54 .dolt
drwxr-xr-x  3 jason  wheel    96B Oct  9 11:54 .
drwxr-xr-x  3 jason  wheel    96B Oct  9 11:54 ..

The .dolt directory is where all our data and version history is stored. If we look a little deeper, we see a few subdirectories:

find . -type d 
.
./.dolt
./.dolt/noms
./.dolt/noms/oldgen
./.dolt/temptf

Next, we're going to start up a Dolt SQL shell. We could also start up a Dolt SQL-server and connect via a compatible SQL tool like the mysql client or DataGrip, but it's super easy to use the SQL shell that's built into Dolt.

dolt sql

Now that we're in a SQL shell, let's take a look at what databases are present:

# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
employees> show databases;
+--------------------+
| Database           |
+--------------------+
| employees          |
| information_schema |
| mysql              |
+--------------------+
3 rows in set (0.00 sec)

We see the employees database that we just cloned, as well as the standard mysql and information_schema databases. If you haven't used them before, the mysql database tracks system level information, such as privileges, and the information_schema database contains lots of metadata about all the user databases, such as what indexes/tables/columns are present.

Before we drop the employees database, let's take a quick look at what tables exist:

employees> show tables;
+----------------------+
| Tables_in_employees  |
+----------------------+
| current_dept_emp     |
| departments          |
| dept_emp             |
| dept_emp_latest_date |
| dept_manager         |
| employees            |
| salaries             |
| titles               |
+----------------------+
8 rows in set (0.00 sec)

Now let's drop the employees database and see what happens!

employees> drop database employees;

All our employees are gone! Let's run show databases; again to confirm:

> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
+--------------------+
2 rows in set (0.00 sec)

The employees database is no longer accessible. If we try to use it or run a query against it, our statements will fail:

> use employees;
database not found: employees

Now let's try to get it back, using the dolt_undrop() stored procedure. All we have to do is supply the name of the dropped database:

> call dolt_undrop('employees');
+--------+
| status |
+--------+
| 0      |
+--------+
1 row in set (0.05 sec)

Success! Let's see if it shows up in show databases now:

> show databases;
+--------------------+
| Database           |
+--------------------+
| employees          |
| information_schema |
| mysql              |
+--------------------+
3 rows in set (0.00 sec)

Just to make sure our data is still there, let's use the employees database and see what tables are present:

> use employees;
Database changed
employees> show tables;
+----------------------+
| Tables_in_employees  |
+----------------------+
| current_dept_emp     |
| departments          |
| dept_emp             |
| dept_emp_latest_date |
| dept_manager         |
| employees            |
| salaries             |
| titles               |
+----------------------+
8 rows in set (0.00 sec)

Sure enough, those are the same tables as before. 🎉

When we restore a database with dolt_undrop() we also get our complete, versioned history restored, too. Let's check out the commit log by selecting from dolt_log and verify that it's the same commit log shown on DoltHub:

employees> select * from dolt_log limit 3;
+----------------------------------+-----------+-----------------+---------------------+-------------------------------------+
| commit_hash                      | committer | email           | date                | message                             |
+----------------------------------+-----------+-----------------+---------------------+-------------------------------------+
| l2dqemamag9oq28aeam6323sgc4317sj | timsehn   | tim@dolthub.com | 2023-02-02 22:49:26 | Initial import of employees test db |
| v5447vgsu4alnm8v701ogkfbcvs227qo | timsehn   | tim@dolthub.com | 2023-02-02 22:45:36 | Initialize data repository          |
+----------------------------------+-----------+-----------------+---------------------+-------------------------------------+
2 rows in set (0.00 sec)

Yup, our database and all of our versioned history is back!

dolt_purge_dropped_databases()

Now that we know how to restore a dropped database using dolt_undrop(), let's take a closer look at how Dolt is able to recover dropped databases and how to permanently delete a dropped database using dolt_purge_dropped_databases().

Let's go ahead and drop the employees database again, then exit the Dolt SQL shell:

employees> drop database employees;
> exit;

Now we're back in the employees directory that we previously cloned. Let's take a look around and see what's different.

ls -lal
total 8
drwxr-xr-x  4 jason  wheel  128 Oct  9 11:36 .
drwxr-xr-x  3 jason  wheel   96 Oct  9 11:11 ..
drwxr-xr-x  3 jason  wheel   96 Oct  9 11:36 .dolt_dropped_databases
-rw-r--r--  1 jason  wheel  269 Oct  9 11:36 .sqlhistory

Notice that the .dolt directory where Dolt stores all our database data and version history is gone! Instead, we have a new .dolt_dropped_databases directory. If we look inside that directory, we'll see that when we dropped our database, Dolt moved our data into an employees subdirectory there, instead of actually deleting it.

find . -type d
.
./.dolt_dropped_databases
./.dolt_dropped_databases/employees
./.dolt_dropped_databases/employees/.dolt
./.dolt_dropped_databases/employees/.dolt/noms
./.dolt_dropped_databases/employees/.dolt/temptf

Just for fun, let's take a look at how much disk space that's using:

du -sh * 
387M	.dolt_dropped_databases

So, now we know that when you use drop database, Dolt will move your database aside so that it can be restored with dolt_undrop(). How about when we want to permanently delete it so that we can reclaim the disk space? That's easy with the dolt_purge_dropped_databases() stored procedure. Let's try it out by using the -q flag on dolt sql to specify the query to run:

dolt sql -q "call dolt_purge_dropped_databases();"
+--------+
| status |
+--------+
| 0      |
+--------+

That return status code of zero means success, so our dropped database should be permanently removed now. We can test that by trying to run dolt_undrop():

dolt sql -q "call dolt_undrop('employees');"
error on line 1 for query call dolt_undrop('employees'): no database named 'employees' found to undrop. there are no databases currently available to be undropped

And finally, let's also check out our disk usage and make sure the .dolt_dropped_databases directory has been properly cleaned up:

du -sh * 
  0B	.dolt_dropped_databases

Boom! The .dolt_dropped_databases directory has zero bytes now, just like we expected after calling dolt_purge_dropped_databases().

Wrap Up

Thanks for reading about how we've made it possible to easily recover from drop database statements! This wasn't a particularly large change to the codebase, but we think it goes a long way to providing even more safety for Dolt customers. Dolt's versioning, diff, and data audit tools make Dolt one of the safest places for your most important data, and this new change to make it possible to recover from accidental drop database statements makes Dolt even safer.

If you have comments, questions, or feature request ideas, please swing by the DoltHub Discord and say hello! Our dev team hangs out on Discord all day while we're working and we're always happy to talk about databases, versioning, and data safety! 🤓

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.