The database is the heart of most applications. It's where the data that drives your web applications lives. It's where
your users' data is stored. The data in your database may be used in countless ways such as analytics, machine learning,
and reporting. It's important that your database is reliable, secure, and performant. Whether you choose a NoSQL store
or a traditional relational database, many of the components you need to go to market with a production database are the
same. In this post we'll discuss some of the components you need to run a production database and how Hosted Dolt
can help you get to market faster.
If you decide to run this on your own hardware, there will be additional steps to procure and set up the hardware, but we'll
be skipping over that and assuming a cloud deployment, though much of the information here will also apply to on-premises deployments.
The Components of a Production Database
The first component of a production database is the database itself. There are many different types of databases, but regardless
of which database software you choose you will need to set up hosts, install the software, configure it, and manage it. There
are a handful of cloud platforms that make it easy to get instances stood up where you want them. You could also use something
like Kubernetes to manage your database instances.
With instances stood up, you'll need to install and configure the database software. This step will vary depending on which
database software you choose. Some databases can be installed using a package manager like apt, brew, etc. Others will require
you to download and build the software from source. See the documentation for your database to learn how to install it.
Now that you have instances running your database software, they need to be reachable from your application and from whatever other
processes need to access the database. This means you need to configure networking. The steps vary here between cloud
providers, but in general you'll need a way to get a static address for your database, and a way to get connections routed
to that address. In AWS you will need to manage security groups and network access control lists. In GCP you will need
to manage firewall rules. In Azure you will need to manage network security groups. In Kubernetes you will need to manage
ingress rules. In all cases you will need to manage DNS records as well.
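Once the networking is in place, it helps to verify from the application's side that the database address actually accepts connections. The following is a minimal sketch using only Python's standard library; the function name is our own for illustration, and the host and port you pass in depend entirely on your deployment.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves DNS and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A check like this only proves that DNS and the firewall rules let a TCP connection through. It says nothing about whether the database will accept your credentials.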
Once you have your database instances stood up and reachable, you need to secure them. This means you need to configure
authentication and authorization. Additionally, you need to configure encryption for data in transit and at rest.
To set up encryption for data in transit you will need to configure TLS certificates, and you'll need to manage the rotation
of those certificates.
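Certificate rotation usually starts with knowing how long the current certificate has left. Here is a small sketch of that check, assuming you already have the certificate's "notAfter" value in the format Python's ssl module reports it. The function names and the 30 day threshold are illustrative choices, not a recommendation from any particular database.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days remaining before a certificate expires.

    not_after uses the format returned by SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires - current) / 86400


def needs_rotation(not_after: str, threshold_days: float = 30.0,
                   now: Optional[float] = None) -> bool:
    """True when the certificate expires within threshold_days (assumed policy)."""
    return days_until_expiry(not_after, now) <= threshold_days
```

Running a check like this on a schedule, and alerting on the result, gives you time to rotate before clients start failing handshakes.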
Now that you have your database stood up, reachable, and secured, you need to monitor it. You will want dashboards with
metrics that allow you to see what is going on with your database, and you'll want to be alerted when things go wrong.
Usually this means sending metrics to something like Prometheus or a cloud service like AWS's
CloudWatch. For alerting, you'll need a service like PagerDuty or OpsGenie
integrated with your metrics.
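To make the alerting idea concrete, here is a rough sketch of the kind of rule an alerting system evaluates: fire only when a metric stays above a threshold for several consecutive samples, so a single spike doesn't page anyone. The AlertRule type, its fields, and the metric name in the usage example are hypothetical. In practice you would express this in your monitoring tool's own rule language rather than in application code.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AlertRule:
    name: str
    threshold: float
    for_samples: int  # consecutive samples above threshold before firing

    def __post_init__(self) -> None:
        if self.for_samples < 1:
            raise ValueError("for_samples must be at least 1")


def should_fire(rule: AlertRule, samples: Sequence[float]) -> bool:
    """True if the most recent for_samples values all exceed the threshold."""
    if len(samples) < rule.for_samples:
        return False
    recent = samples[-rule.for_samples:]
    return all(value > rule.threshold for value in recent)
```

For example, a rule such as AlertRule("high_connections", 90.0, 3) would fire for the samples [50, 95, 96, 97] but not for [95, 96, 50, 97], because the last three samples are not all above the threshold.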
Whether you are debugging issues, or just want to see what's going on with your instance, you'll want access to your
database's logs. Often this means configuring your database instances to send logs to a central location such as the ELK stack
or a cloud service like CloudWatch. Additionally, you'll need to configure log
rotation so the log files on your instances don't grow indefinitely.
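Database servers typically rotate logs through their own settings or an operating system tool, but the idea is the same everywhere: cap the size of each file and keep a fixed number of old ones. As an illustration only, here is how size-based rotation looks with Python's standard logging module. The function name, default sizes, and log format are assumptions made for the example.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_rotating_logger(path: str, max_bytes: int = 10 * 1024 * 1024,
                          backups: int = 5) -> logging.Logger:
    """Return a logger that writes to path and rotates it by size.

    When the file would exceed max_bytes it is renamed to path.1,
    older files shift to path.2 and so on, and at most `backups`
    old files are kept.
    """
    logger = logging.getLogger(f"rotating:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes,
                                      backupCount=backups)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

With backups set to 5, disk usage for this log is bounded at roughly six times max_bytes no matter how long the process runs.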
Now that you have your database stood up, reachable, secured, monitored, and pushing logs to a central location, you need to handle irrecoverable
failures. This means you need to set up backups. You'll need to configure backups to run on a schedule, and you'll need
to configure retention policies for those backups. You'll also need to configure a way to restore from backups.
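A retention policy boils down to a rule for deciding which backups are safe to delete. The sketch below assumes a simple policy of our own choosing: delete anything older than a retention window, but never delete the most recent backup, even if it is old. Real policies are often more layered (daily, weekly, monthly tiers), and the function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def backups_to_delete(backup_times: Iterable[datetime], now: datetime,
                      retention: timedelta = timedelta(days=30)) -> List[datetime]:
    """Return backup timestamps older than the retention window.

    The newest backup is always kept so there is something to restore from.
    """
    times = sorted(backup_times)
    if not times:
        return []
    newest = times[-1]
    cutoff = now - retention
    return [t for t in times if t < cutoff and t != newest]
```

Whatever policy you choose, test the restore path regularly. A backup you have never restored from is only a hope.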
So far we've stood up a database and configured it to be reachable, secured, monitored, logged, backed up for disaster
recovery... but what about high availability? What happens if your database instance goes down? You don't want to have to
recover from a backup every time your cloud instance goes down. You need to set up replication. This means you need to do
everything we've already talked about above for each replica you want to have. You'll need to configure replication so that
your instances can speak with each other. It's possible you want to increase throughput by having read replicas, or you
may want to be able to promote one of your replicas to be the primary instance in the event of a failure. You'll need to
configure replication to handle these scenarios based on your needs.
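If you plan to promote a replica when the primary fails, you need a rule for choosing which one. One common approach, sketched here under our own assumptions, is to pick the healthy replica with the least replication lag and refuse to promote anything that is too far behind. The Replica type, its fields, and the 30 second limit are hypothetical and stand in for whatever your database reports.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Replica:
    name: str
    healthy: bool
    lag_seconds: float  # how far this replica trails the primary


def choose_promotion_candidate(replicas: Iterable[Replica],
                               max_lag_seconds: float = 30.0) -> Optional[Replica]:
    """Pick the healthy replica with the least lag, or None if none qualify."""
    eligible = [r for r in replicas
                if r.healthy and r.lag_seconds <= max_lag_seconds]
    return min(eligible, key=lambda r: r.lag_seconds, default=None)
```

Returning None when no replica qualifies is deliberate: promoting a badly lagging replica can silently lose recent writes, so that case is better handled by a human.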
You now have a database cluster set up. You've installed and configured numerous pieces of software, but how are you going
to upgrade each piece of software? Whose responsibility is it to make sure that the software is up-to-date, and that you
are not running vulnerable versions of software? You'll need to set up a manual or automated process to handle this.
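An automated process can start as simply as comparing what is installed against the minimum version you consider patched. This sketch assumes purely numeric dotted version strings such as "1.2.3". The component names in the usage example are made up, and where the two dictionaries come from (an inventory, a security advisory feed) is left to you.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "1.2.3" into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def outdated_components(installed: Dict[str, str],
                        minimum: Dict[str, str]) -> List[str]:
    """Names of installed components that are below their minimum version."""
    return sorted(
        name for name, version in installed.items()
        if name in minimum
        and parse_version(version) < parse_version(minimum[name])
    )
```

For instance, with installed versions {"database": "1.9.0", "log-agent": "2.4.1"} and minimums {"database": "1.10.0", "log-agent": "2.4.0"}, only "database" is reported, since 1.9.0 compares below 1.10.0 numerically even though it sorts above it as a string.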
Your database is finally ready to go. You've done everything you can to make sure it's reliable, but things will still go
wrong. You need to have a process in place to handle incidents, and you need people that can debug and resolve issues
when they arise. This means you'll need an on-call rotation and an escalation policy.
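An escalation policy is essentially a list of who gets paged and how long an alert can go unacknowledged before the next person is pulled in. A paging service manages this for you, but a minimal model of the logic, with hypothetical names and timings, looks something like this:

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class EscalationStep:
    after_minutes: int  # minutes unacknowledged before this contact is paged
    contact: str


def contacts_to_page(policy: Sequence[EscalationStep],
                     minutes_unacknowledged: int) -> List[str]:
    """Everyone who should have been paged by now, in escalation order."""
    ordered = sorted(policy, key=lambda step: step.after_minutes)
    return [step.contact for step in ordered
            if step.after_minutes <= minutes_unacknowledged]
```

With steps at 0, 15, and 30 minutes for a primary on-call, a secondary on-call, and an engineering manager, an alert that has gone 20 minutes without acknowledgement would have paged the first two.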
As you can see, there are many steps involved in setting up a production database. It is a costly and time-consuming
endeavor. It's also a distraction from your core business. You could spend months setting up your database, or you could
use a managed database service like Hosted Dolt, Amazon RDS, or
Cloud Bigtable and get to market faster.
Hosted Dolt handles all of the above for you. It's a relational database that you can
use like any other relational database, but it's also a versioned database that you can use like a Git repository. It's
a database that you can use to collaborate with your team, and track changes to your data over time.
Hosted Dolt is a product we are committed to, and it has a long roadmap of features and improvements.
In addition to all the requirements of a production database, we offer features that make it easy to collaborate with your
team, such as the SQL workbench, the ability to push and pull your database from DoltHub, and
pull requests. Hosted Dolt starts at $50 a month and provides a tremendous amount of value
for the price. If you are interested in learning more, come talk to us on Discord, and
if you would like to try it visit https://hosted.doltdb.com to get started.