Immutable databases offer history and audit to your database. Immutable databases are becoming a popular alternative to blockchain. No survey of the category exists. This blog attempts to explain what an immutable database is, why you might want to use one, and what products exist.
Furthermore, this article attempts to differentiate between three types of immutable databases: ledgers, blockchains, and version controlled databases.
What is an immutable database?
Immutable databases are append-only databases – data can only ever be added. If you want to change or delete an entry, a new entry is created and the old entry is marked as belonging to a previous version. Thus, all history of how a database became the present state is preserved. This makes auditing data changes possible because the audit log is tamper resistant.
In order to make this possible most immutable databases rely on some form of two technologies: cryptographic signatures and Merkle Directed Acyclic Graphs (DAGs). These technologies prevent data from changing without the database knowing and allow data to be shared between multiple versions efficiently and quickly accessed.
What did we used to do?
Before immutable databases, if you wanted to build a database with an audit log, you accomplished the task with schema. The term of art for this is Slowly Changing Dimension. This is not true immutability. You build a schema using columns that indicate inactive or active or "active from" to "active to" dates. Version number or timestamp columns on rows can also be used. Then, your application uses these columns to construct present and historical views of the data. Your application never deletes, just changes state from active to inactive. I don't think anyone would consider this type of database truly immutable as the application or an operator can still make a mistake and modify or delete a record. But from the application's perspective, you get an audit log and history of every record in the database.
There is also SQL "as of" syntax. This is in the SQL standard as of 2011 (pun intended) and supported by Oracle, Microsoft SQL Server, and MariaDB. With this syntax you can configure a table to be versioned and then query a version using
as of <timestamp>. Tables configured this way are called "temporal tables". Temporal tables generally comes with scale limitations, hence the need for a new category of database to handle this use case.
Ledgers vs Blockchains vs Version Controlled Databases
The immutable database is a relatively new category. Immutable databases are inspired by the technology and popularity of Blockchain databases like those used to power cryptocurrencies, Bitcoin and Ethereum.
Ledgers are simplified, centralized versions of blockchains: blockchains without consensus. Ledgers offer the immutability of a blockchain but for use cases where there is full trust in the writers. In this Is Dolt a Blockchain? article I make the case that decentralization and automated consensus are the main features that turn a ledger into a Blockchain.
A new category is emerging using similar technology. Version Controlled databases offer immutability along with a host of other features. Version controlled databases are decentralized – many copies of the database exist and can be written to independently. Version controlled databases offer branches, merges, and diffs to organize these writes.
For the rest of this article, we ignore any Blockchain solutions like Hyperledger Fabric or Ethereum. (EDIT: If you're interested in this category, we wrote a useful survey of the decentralized database category a couple months after this article was published.) The public blockchain space gets enough analysis and publicity, even Gartner is weighing in. We don't have much to add and we'd rather focus on Ledgers and Version Controlled Databases that don't get much written about them.
What are immutable databases used for?
Immutable database are useful for data that you want to secure and audit.
Amazon QLDB does a good job laying out the high level use cases for ledgers. Note the focus on security and auditability.
Version controlled databases can be used for all the things ledgers can be used for. Here's an article on how to use Dolt, a version controlled database, as a ledger.
Version controlled databases massively expand the use cases for immutable databases because of the expanded feature set. Version controlled databases can also be used for data sharing, data ingestion, machine learning, or other places where you want to treat data like source code. The additional branch and merge functionality is required to serve these these cases.
We will examine the following solutions:
This list is by no means complete. If we missed your non-blockchain immutable database, please reach out to me at firstname.lastname@example.org.
- Maintain an immutable, cryptographically verifiable log of data changes
- Initial Release
- September 2019
- Not Open Source
Amazon QLDB is the only hosted, non-open source product we'll discuss. QLDB supports a "document-oriented data model". The query language is PartiQL, a new open source document query engine. Accessing history is done through the built in history function, a PartiQL extension. To get an audit log of your database, you need to export the journal to an S3 bucket in JSON, text, or Amazon ION format.
AWS is a known quantity at this point. It's the cloud infrastructure leader. If you want a hosted ledger and are already in the AWS ecosystem, Amazon QLDB is for you. The user experience around auditability is a little lacking but the product is new so I'm sure it will get better over time.
- Open Source Immutable Database
- Initial Release
- May 2020
ImmuDB is an open source immutable database written in Golang. It can be run client-server or embedded. ImmuDB uses a subset of the PostgreSQL SQL dialect.
Accessing history is done through a set of history functions. In SQL, there is some custom syntax to access history.
Additionally, ImmuDB has a separate auditor to ensure your database has not been tampered with. This added layer of security would be useful in some use cases.
ImmuDB is a great, open source alternative to Amazon QLDB. If you are looking for a ledger you can run yourself or a ledger to embed in your application, check it out.
Version Controlled Databases
- Making Data Collaboration Easy
- Initial Release
- October 2019
TerminusDB is an immutable graph or document database with full schema and data versioning capability. The graph database interface is a custom query language called Web Object Query Language (WOQL). WOQL is schema optional. TerminusDB also has the option to query JSON directly, similar to MongoDB, giving users a more document database style interface.
The versioning syntax is exposed via TerminusDB Console or a command line interface. The versioning metaphors are similar to Git. You branch, push, and pull. See their how to documentation for more information.
TerminusDB is new but we like what we see. The company is very responsive, has an active Discord, and is well funded. If you think your immutable database needs database version control in graph or document form, check them out.
- Git for Data
- Initial Release
- August 2019
Dolt is an immutable database that leans heavily on Git version control metaphors and syntax. Dolt implements the Git command line and associated operations on table rows instead of files. Data and schema are modified in the working set using SQL. When you want to permanently store a version of the working set, you make a commit. In SQL, Dolt implements Git read operations (ie. diff, log) as system tables and write operations (ie. commit, merge) as functions. Dolt supports AS OF syntax using commits, making history queries user-friendly. Dolt produces cell-wise diffs and merges, making data debugging between versions tractable. That makes Dolt the only SQL database on the market that has branches and merges. You can run Dolt offline, treating data and schema like source code. Or you can run Dolt online, like you would PostgreSQL or MySQL.
We are biased but we think if you want an immutable SQL database with version control, there is only one product that fits that label and that's Dolt.