Announcing ROW() Constructor Support in Doltgres
We've been working hard building more features for Doltgres, our Postgres-compatible version-controlled database. Just last month we hit a major milestone and officially moved Doltgres into Beta. 🎉 We still have a lot more features to build, and we're making steady progress. We're also delighted that we're starting to get some customer feedback and feature requests on GitHub. In this blog post, we're announcing initial support for the ROW()
constructor syntax, which came in as a feature request from a customer on GitHub. If there are Postgres features you want us to prioritize in Doltgres, please give us a shout on GitHub and let us know – we love getting feature requests from customers to help us decide what to build next.
🚀 Announcing ROW()
Constructor Support in Doltgres
We're excited to announce that Doltgres now supports using the ROW()
constructor syntax to create anonymous records. This feature improves Doltgres' compatibility with PostgreSQL and opens up new use cases for working with composite data. In the sections below, we'll explore the new support, see ways you can use the ROW()
constructor, and also talk about a few gaps in the initial implementation that aren't supported yet.
What is the ROW()
constructor?
From the official Postgres docs:
A row constructor is an expression that builds a row value (also called a composite value) using values for its member fields. A row constructor consists of the key word
ROW
, a left parenthesis, zero or more expressions (separated by commas) for the row field values, and finally a right parenthesis.
Here's a quick example:
postgres=# SELECT ROW(1,2.5,'this is a test');
row
--------------------------
(1,2.5,"this is a test")
(1 row)
The returned value is an anonymous record instance, containing the literals we specified. Records are a schema-less composite type that allows you to work with row-like values whose column types and count may not be fixed in advance. In fact, records are the first composite type that Doltgres supports. If you'd like us to prioritize support for user-defined composite types, feel free to comment on this tracking issue and let us know what you need.
What can you do with a record?
Creating records
The most basic thing you can do with records is... create them! Here are some examples of using the ROW()
constructor to create records that are supported in Doltgres.
SELECT ROW(1, 2, 3) as myRow;
This simple example creates a record with three numeric values:
myrow
---------
(1,2,3)
(1 row)
Note that the ROW
keyword is actually optional, as long as you specify multiple values in the parentheses:
SELECT (4, 5, 6) as anotherRow;
This yields:
anotherrow
------------
(4,5,6)
(1 row)
NULL values are fair game in records:
SELECT (NULL, 'foo', NULL) as myRow;
Note that they are encoded on output as empty values:
myrow
---------
(,foo,)
(1 row)
Records may also include the results of other expressions:
SELECT ((1+2+3+4+5), (1 > 0), (select max(pk) from users)) as r;
This returns:
r
----------
(15,t,1)
(1 row)
Records can also contain other records:
SELECT ROW(ROW(1, 'x'), true) as nestedRecords;
This returns a record with two fields: the first is another record, and the second field is a boolean value:
nestedrecords
---------------
("(1,x)",t)
(1 row)
Comparing records
Once you have a record, one of the most common things to do with it is compare it to another record. This is useful for custom sorting, conditional checks, and equality semantics involving multiple fields. The comparison rules for records require that two records being compared have the same number of fields. Comparisons start by comparing the left most field of the two records and will continue comparing fields until they reach a definitive result.
An easy example to start off with:
SELECT ROW(1, 2) < ROW(2, 3) as comparison;
In this example, the comparison starts by comparing the left-most field of each record: 1 < 2
. In this case, the very first field tells us definitively that the first record is less than the second record, without having to look at any other fields in the records. Doltgres returns:
comparison
------------
true
(1 row)
Instead, if we run:
SELECT ROW(2, 2) < ROW(2, 3) as comparison;
Again, the leftmost field in each record is compared with 2 < 2
and we see that the less than comparison is not true
, however, because these two field values are equal, Doltgres continues looking at additional fields, always working from leftmost to rightmost. Because the next two fields return true (2 < 3
), Doltgres definitively knows that the first record is indeed less than the second record and returns:
comparison
------------
true
(1 row)
However, if those first two field values were NOT equal and NOT less than (e.g. 4 < 2
), then Doltgres would immediately know that the first record is not less than the second record and would return false
without having to look at any other fields.
How about cases where there is not a definitive answer? Here's an easy one:
SELECT ROW(NULL, NULL, NULL) = ROW(NULL, NULL, NULL) as nullComparison;
At first glance, one might assume that these two records are equal, since they each contain three NULL
values, however, remember that in SQL NULL
means the value is unknown, and NOT just an empty value. Therefore, there is not enough information in the records to determine if they really are equal (or less than, greater than, etc.). In this case, Doltgres returns NULL
as the comparison result because it is unknown whether the records actually are equal:
nullcomparison
----------------
(1 row)
The rules for NULL
values in records can be slightly tricky, but as long as you remember that NULL
means the absence of a definitive value (and not just "no value"), then they'll make sense. Here's another example:
SELECT ROW(2, 1) > ROW(1, NULL) as comp;
In the example above, the NULL
value never even needs to be compared. The first comparison of 2 > 1
is true, so Doltgres immediately knows that the first record is greater than the second record.
How about this one?
SELECT ROW(NULL, 1) >= ROW(2, 1) as comp;
Here, as before, we start by comparing the leftmost field of each record. The result of NULL >= 2
is undefined, so immediately Doltgres knows that we won't be able to determine if the first record is greater than or equal to the second record – there's just too much unknown since the most significant field of the records has an undefined comparison result. In this case, Doltgres also returns a NULL
value for the comparison:
comp
------
(1 row)
When type coercion is required to compare field values, Doltgres follows the same rules as Postgres. In the example below, Doltgres will try to compare 2 = 'two'
and because 'two'
cannot be turned into a numeric type, it will return the same error as Postgres: invalid input syntax for type int4: "two"
.
SELECT ROW(1, 2) = ROW(1, 'two');
When a value can be coerced to the reuqired type, such as with SELECT ROW(1, 2) = ROW(1, '2');
, Doltgres will coerce the type and in this case, return:
(1, 2) = (1, '2')
-------------------
true
(1 row)
Filtering results
Similar to the record comparisons above, Doltgres also supports using records in WHERE
and HAVING
clauses to filter results.
The example below shows the use of ROW
expressions in a WHERE
clause to filter results. This simple case could also be written as WHERE name = 'John' AND email = 'j@a.com'
, but you may find the ROW
expressions easier to read, especially in larger, more complex queries.
SELECT id FROM users WHERE ROW(name, email) = ROW('John', 'j@a.com');
Running that query correctly returns the ID for that user:
id
----
1
(1 row)
This next example demonstrates the use of ROW
expressions in a HAVING
clause and also shows a case where using ROW
expressions makes the query much more succinct. In this example, we only want rows where name
is "John" and email
is 'j@a.com
OR where name
is "Jane" and email
is "jane@a.com". If we had just filtered on HAVING name IN ('John', 'Jane') AND email IN ('j@a.com', 'jane@a.com')
then we could potentially get a result where name
is John
and email
is jane@a.com
, which is not the same results as the query with ROW
expressions. The same query could be written without ROW
expressions of course, but would be more verbose, especially if the filtering conditions become more complex.
SELECT ROW(name, email) as row FROM users
HAVING row IN (ROW('John', 'j@a.com'), ROW('Jane', 'jane@a.com'));
This query returns:
row
----------------
(John,j@a.com)
(1 row)
What doesn't work yet?
Our initial support for ROW()
and the record
type still has a few gaps where you can't yet do things in Doltgres that you can do in Postgres. We'll be filling in these gaps over time, but if there's a feature you need, just let us know on GitHub and we'll be happy to move that feature to the front of the line for you. The sections below describe a few gaps we know about and are planning on implementing.
Wrapping table rows
In Postgres, you can pass in a table alias to the ROW()
constructor to turn the rows in that table into a list of records. For example:
select row(p) from users p;
This will return one column with a record type, with one row for every row in the table users
:
row
-------------------------
("(1,John,j@a.com)")
("(2,Joe,joe@joe.com)")
(2 rows)
Note that we see the same data from the table users
, but the individual columns from users
are now packaged up as a single record in each row.
When used this way, one convenient way to access fields in each record is:
SELECT (u).name FROM users u;
This will extract the name
field from each record and return a result set with just the name column from the users
table:
name
------
John
Joe
(2 rows)
User-defined functions using records
A common use of records is to return multiple values from a user-defined function, as shown in this very simple example:
CREATE FUNCTION get_user_info(integer) RETURNS record AS $$
BEGIN
RETURN ROW('Alice', 'alice@example.com', 42);
END;
$$ LANGUAGE plpgsql;
The example above creates a record using the ROW()
constructor, but currently Doltgres doesn't support returning record types, so trying to create this user-defined function will return an error. If you need this support, let us know and we'll be happy to prioritize it for you.
Once support for this is implemented, you'll be able to call this function just as in Postgres, for example: select get_user_info(100);
and get back the result:
get_user_info
------------------------------
(Alice,alice@example.com,42)
(1 row)
Related to returning a record, Doltgres will also support declaring variables of type record
in PL/pgSQL functions.
Additional record comparisons
There are still two record comparisons that we know we need to implement. The first is support for IS NOT DISTINCT FROM
, such as in this example:
SELECT ROW(1, NULL) IS NOT DISTINCT FROM ROW(1, NULL);
In Postgres, this returns: true
. IS NOT DISTINCT FROM
works as a NULL-safe comparison operator where NULL
is considered equal to NULL
. We don't currently support any use of IS NOT DISTINCT FROM
in Doltgres. This is likely one we'll need to implement soon.
Another gap with comparisons is support for IS NULL
. This is one we're currently looking into fixing, since testing for NULL
is such a common operation. In Postgres, the IS NULL
operator returns true
when all elements of a record are NULL
values. For example:
SELECT ROW(NULL, NULL) IS NULL;
Since all fields in the record created by the ROW()
expression above are NULL
, Postgres returns true
for this expression. To fix this in Doltgres, Doltgres needs to let the SQL engine know that testing a record with the IS NULL
means to look at the fields within the record, not just if the record itself is NULL
. Because the record
type is declared at a higher level than the SQL engine, it's a slightly more involved change than simply having the SQL engine look for records and then iterate over the fields.
Final Thoughts
There is a lot of surface area in Postgres' feature set and SQL syntax, and we're making steady progress getting Doltgres to work with more of these features. With the addition of support for the ROW()
constructor syntax and support for the record
type, we've made another big step towards this goal. We prioritized support for these features from a customer request on GitHub, so if you have any features you want to use that aren't supported yet, just cut us an issue on GitHub and we'll be more than happy to dig in and work on them.
If you're curious about data versioning, or just want to learn more about Doltgres, come join us on Discord, where our dev team and our customers hang out, and we'll be happy to chat.