Improving Stored Procedure Support
Here at DoltHub, our centerpiece is Dolt, which fuses a MySQL-compatible database with Git-style versioning capabilities. After you install Dolt, all it takes are a few commands to have a running server:
mkdir demo cd demo dolt init dolt sql-server
Your MySQL-backed applications will require no change, as we aim to be completely compatible with MySQL. However, you now have access to the full power that versioning provides, and as Git is the most popular versioning software ever, you can apply that same knowledge to Dolt; there is practically zero learning curve.
We're continually working on our MySQL-compatibility story, as we're building our own implementation from the ground up. In the case of stored procedures, that means that we're building our own interpreter that operates on the syntax of stored procedures, which may be viewed as a scripting language for SQL databases. As Dolt matures, our support for stored procedures grows ever closer to completion, and we've recently added a few new exciting additions: variables, loops, and cursors!
What are stored procedures?
Last year, we released the first iteration of stored procedure support, but the blog didn't quite explain what stored procedures are, and why they're useful. I mentioned it in the first section, but stored procedure syntax may be viewed as a scripting language, extending the standard SQL syntax by adding explicit variables and control flow (just like you'd expect to find in any scripting language). This, in turn, creates a sort of script that lives on the database, allowing clients to execute these scripts and work with the resulting data. Without stored procedures, clients would need to execute the necessary queries manually, which would generally result in an application being designed to specifically handle a particular workflow.
This is one advantage of stored procedures, as it allows users to move their database-interfacing logic to a script on the database, enforcing a separation-of-concerns. Another benefit of stored procedures deals with the communication between the database and the client. The database can access its local data far quicker than it can send it to the client, so doing specific tasks over large datasets are exponentially faster when the proper stored procedures are created.
Another rather large benefit comes from the database's security model. Stored procedures may specify that they execute under a specific user's credentials, meaning the stored procedure has access to portions of the database that a client's user may not. This is not something that can be replicated by externally building a workflow.
The last benefit that I'll mention is granted by Dolt's marriage of SQL and Git. Naturally, when a table's schema is updated, any affected stored procedures are also updated. In Dolt, this translates to versioning your stored procedure logic along with the relevant schema and data. This greatly aids in cases where reproducibility is paramount, which is far more difficult when having to sync the state of an external application with a database.
So what's new for Dolt?
With the first iteration of stored procedures, we only implemented the very basics:
- Control flow via
This new release greatly expands that functionality by adding:
- Local variables
LOOP/LEAVE/ITERATEfor control flow
- Partial support for handlers
I'm sure some will be shocked that stored procedures were ever released without local variables, but using parameters as variables worked for the majority of our customers in the meantime.
You may have spotted it, but the common theme with the added functionality is proper scope resolution.
Using only parameters meant that all variables were global (including user variables using the
By adding local variables, we also had to resolve each usage of a variable to the proper scope, as variable names may shadow other variables.
This also lead to adding support for cursors, since they're generally used with local variables, and
LOOP so that one may continually pull data from a cursor.
Let's go over each of these additions with examples.
I assume a bit of familiarity with SQL in the examples, as stored procedures are considered an advanced use case. In the previous blog post, I go through a couple of simpler stored procedures. These examples also assume that you've connected to a running Dolt server via a client (I'm using MySQL's official client since it works great with Dolt). You can run a Dolt server by following the small example at the very beginning of the blog.
In this example, we'll declare a few local variables.
We'll also declare a variable with the same name but in an inner scope to show variable shadowing.
First thing you'll notice the
When using a client, by default statements are assumed to end when a semicolon
; is reached.
This isn't quite true for stored procedures, so we change the statement terminator (called the
DELIMITER) to something that won't be seen within the stored procedure.
This can be set to anything, and I'm using
// for personal taste.
mysql> DELIMITER // mysql> CREATE PROCEDURE example() BEGIN DECLARE a, b, c INT DEFAULT 0; SELECT 1, 2, 3 INTO a, b, c; BEGIN DECLARE a INT; SET a = 4; END; SELECT a + b + c; END// Query OK, 0 rows affected (0.00 sec) mysql> DELIMITER ; mysql> CALL example(); +---------------+ | ((a + b) + c) | +---------------+ | 6 | +---------------+ 1 row in set (0.00 sec)
We declare three variables:
We then assign them
Within a new
BEGIN/END block, we declare a new variable named
This new block is also a new scope, so the new
a shadows the previous
As soon as we leave the scope, we're back to the original
a, and therefore the last
SELECT statement adds
1 + 2 + 3.
Cursors are a great way of incrementally iterating over table data.
They execute a
SELECT statement, and return the result one row at a time.
In this example, we'll show how to declare a cursor, how to open and close them, and also we'll use the cursor to fetch two rows from a table.
mysql> CREATE TABLE example_table (pk BIGINT PRIMARY KEY, v1 BIGINT); Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO example_table VALUES (1, 2), (4, 7); Query OK, 2 rows affected (0.00 sec) mysql> DELIMITER // mysql> CREATE PROCEDURE example() BEGIN DECLARE temp_a, temp_b BIGINT DEFAULT 0; DECLARE a, b BIGINT DEFAULT 0; DECLARE cur CURSOR FOR SELECT pk, v1 FROM example_table; OPEN cur; FETCH cur INTO temp_a, temp_b; SELECT a + temp_a, b + temp_b INTO a, b; FETCH cur INTO temp_a, temp_b; SELECT a + temp_a, b + temp_b INTO a, b; CLOSE cur; SELECT a, b; END// Query OK, 0 rows affected (0.00 sec) mysql> DELIMITER ; mysql> CALL example(); +------+------+ | a | b | +------+------+ | 5 | 9 | +------+------+ 1 row in set (0.00 sec)
We fetch our table's rows into
temp_b, and then add them to
This gives us
a = 1 + 4 = 5 and
b = 2 + 7 = 9.
Loops are a great way to repeat some logic.
Without using handlers, loops should be assigned labels so that
ITERATE may control its execution.
LEAVE causes the loop to terminate, while
ITERATE skips all remaining logic in the loop and starts the loop over.
mysql> DELIMITER // mysql> CREATE PROCEDURE example() BEGIN DECLARE a BIGINT DEFAULT 0; loop_name: LOOP SET a = a + 1; IF a < 10 THEN ITERATE loop_name; ELSE LEAVE loop_name; END IF; SET a = a * 1000; END LOOP; SELECT a; END// Query OK, 0 rows affected (0.00 sec) mysql> DELIMITER ; mysql> CALL example(); +------+ | a | +------+ | 10 | +------+ 1 row in set (0.00 sec)
SET a = a * 1000; is never reached, as
ITERATE starts the loop over, and
LEAVE exits the loop.
Be careful when using loops, as it is possible to create an infinite loop that never terminates.
Handlers define an action to take when their condition is reached.
They also provide a section to execute additional code, which is generally used for cleanup.
There are different kinds of conditions, but we only support the
NOT FOUND variant for now.
This is enough to allow looping over a cursor, which is what we'll do in this example.
mysql> CREATE TABLE example_table (pk BIGINT PRIMARY KEY); Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO example_table VALUES (1), (2), (3), (4), (5), (6), (7), (8), (9); Query OK, 9 rows affected (0.00 sec) mysql> DELIMITER // mysql> CREATE PROCEDURE example(OUT param BIGINT) BEGIN DECLARE temp BIGINT DEFAULT 0; DECLARE cur CURSOR FOR SELECT pk FROM example_table; DECLARE EXIT HANDLER FOR NOT FOUND SET param = param * 100; SET param = 0; OPEN cur; LOOP FETCH cur INTO temp; SET param = param + temp; END LOOP; END// Query OK, 0 rows affected (0.00 sec) mysql> DELIMITER ; mysql> CALL example(@param); Query OK, 0 rows affected (0.00 sec) mysql> SELECT @param; +--------+ | @param | +--------+ | 4500 | +--------+ 1 row in set (0.00 sec)
You'll notice that our loop is unlabeled.
This is because we rely on the handler to exit the
BEGIN/END block as soon as the
NOT FOUND condition is raised, which
FETCH raises when it exhausts
example_table of rows.
In addition, we do not explicitly close our cursor.
It is not necessary to close all cursors before a stored procedure ends, but it is good practice.
SET param = param * 100; executes once the handler sees the
NOT FOUND condition, which is why our parameter has a value of
4500 rather than just
Stored procedures are getting quite close to having full feature-parity with MySQL! We hope that you'll join us on the ride as we continue to work on Dolt. You can keep up to date with us through Twitter, or you can chat directly with us through Discord.