Relational Database Design: A Step-By-Step Guide

SQL
8 min read

Introduction

As an engineer working on Dolt, I spend most of my time working on the internals of a database system. Recently, I worked on using Dolt as a product to create a dataset that stored the entire Bitcoin blockchain's history. As I ambitiously ventured to create a 1Tb database, I realized I had made some foundational schema design flaws and had to start all over again. I came to the conclusion that although I work on a database product, I never closely studied schema design practices. This led me to Rod Stephen's excellent book on the foundations of relational database design. In this blog post, I'll share my learnings, detail some fundamental steps in relational database design, and walk through modeling an example problem. We'll also see how Dolt is an ideal tool to design schemas and track their changes with its versioning functionality.

Steps in Database Designs

Let's start by introducing a summary of the key concepts you need to keep in mind when designing your first relational database schema from scratch.

  1. Define the functionality of the database
  2. Determine the data needs
  3. Create the data models
  4. Establish model relationships
  5. Determine business rules

Our goal is to apply these steps to build a hotel room reservation system. Our hotel has a limited number of rooms and floors. Guest reserve rooms on specific floors. There are different types of rooms each with different prices.

Step 1: Define the functionality of the database

We've been given a fairly generic problem statement. We need to start asking some questions to hone in on the end-to-end functionality that is required. Here are a couple of examples:

  1. What is the user flow for a guest making a reservation?

A set of guests apply online to make a reservation. If the reservation is successful, one guest inputs her payment information and becomes a paying customer. When the user comes into the hotel, she checks in and confirms the reservation. Based on the customer's reservation, the customer is then allocated a room of a particular room-type.

  1. Do we need to track all reservations or only the approved reservations?

We want to track all reservations, even unsuccessful ones, as there are plenty of potential customers who never show up and check-in.

  1. What payment information do we need to collect for a paying customer?

For simplicity's sake, we will take a customer's credit card number and license number.

  1. Do we need to make it easy to query the names of the guests in a current room?

Yes, we should be able to quickly query all guests who are staying in a given room number.

  1. How do we know a room is currently occupied on a given day d?

We will define a room as occupied if there is a confirmed reservation where d sits in between the arrival and departure dates.

Step 2: Determine the data needs

  1. What are all the pieces of data that we need to collect to make this user flow happen?

For a guest we need to track her personal information including first name, last name, address, and phone number.

For a customer we need to track her payment information along with her relationship to the booking party.

For a reservation we need to track the paying customer, the type and location of the room, the booking period, and its confirmation status.

For a room we want to track its floor, number, type and, price.

We also want to know all types of a room (i.e one-bed, two-bed, loft).

  1. Where does the data come from?

We'll assume that data is coming from two applications. The first is an online reservation system for potential customers to use. The second is an internal application that hotel staff uses to check customers in and collect payment info.

Step 3. Creating a Data Model

We want to create a consistent data model that represents each object we described above. In the case of a relational database, each object will be represented by a table schema.

part 1

Let's go ahead and initialize a Dolt database and create the relevant tables. Each table needs to have a primary key, which uniquely identifies each record in the table. For example, each Guest has a unique id called GuestId that can be queried easily. We'll also need to assign the correct MySQL type to each parameter of the model that we want to store.

dolt init
dolt sql << SQL
CREATE TABLE RoomTypes(
	RoomType int PRIMARY KEY AUTO_INCREMENT,
	Description varchar(100)
);

CREATE TABLE Rooms(
	Floor int,
	RoomNumber int,
	RoomType int NOT NULL,
	Price decimal(10, 2),
	PRIMARY KEY(Floor, RoomNumber)
);

CREATE TABLE Guests(
	GuestId int PRIMARY KEY AUTO_INCREMENT,
	FirstName varchar(50) NOT NULL,
	LastName varchar(50) NOT NULL,
	Address varchar(100) NOT NULL,
	PhoneNumber varchar(20)
);

CREATE TABLE Customers(
	CustomerId int PRIMARY KEY AUTO_INCREMENT,
	GuestId int NOT NULL,
	CreditCardNumber varchar(20),
	LicenseNumber varchar(20)
);

CREATE TABLE Reservations(
	ReservationId int PRIMARY KEY AUTO_INCREMENT,
	CustomerId int NOT NULL,
	RoomType int NOT NULL,
	Floor int NOT NULL,
	RoomNumber int NOT NULL,
	ArrivalDate datetime NOT NULL,
	DepartureDate datetime NOT NULL,
	CheckInDate datetime,
	IsConfirmed tinyint
);
SQL

Let's create a commit to mark our progress

dolt add .
dolt commit -m "Added the first iteration of our data model"

Step 4. Establish Model Relationships

We must determine all possible relationships between the data models we have created. We are looking for two things:

  1. Foreign key relationships that protect the integrity of the system.
  2. Relationships that can be modeled as join tables.

These relationships depend on whether two data models have a 1:1, 1:many, or many:many relationship.

  • 1:1 - For each record A there can be at most one record B. For example, for each Guest there can be at most one Customer
  • 1:many - For each record A there can be many record Bs. For example, for each Customer there can be many Reservations.
  • many:many - Multiple records A can be related to multiple B records. There are no many:many relationships in the given example.

Understanding the above, let's nail down the relationships we want to model in our table

  1. Each Reservation can be associated with multiple guests. For example, an entire family will sign up with a reservation

For that relationship we will use a join table that relates a reservation to a set of guests:

CREATE TABLE GuestReservations(
    ReservationId int,
    GuestId int,
	PRIMARY KEY(ReservationId, GuestId)
);

We also want to ensure consistency with a foreign key relationship

ALTER TABLE GuestReservations ADD FOREIGN KEY (GuestId) REFERENCES Guests (GuestId);
ALTER TABLE GuestReservations ADD FOREIGN KEY (ReservationId) REFERENCES Reservations (ReservationId);
  1. Each Customer can reference at most one Guest.

We can model that with the following ALTER TABLE statement.

ALTER TABLE Customers ADD FOREIGN KEY (GuestId) REFERENCES Guests(GuestId);
  1. Each Reservation must be associated with a valid RoomType, Floor, and RoomNumber.
ALTER TABLE Reservations ADD FOREIGN KEY (CustomerId) REFERENCES Customers(CustomerId);
ALTER TABLE Reservations ADD FOREIGN KEY (Floor, RoomNumber) REFERENCES Rooms(Floor, RoomNumber);
ALTER TABLE Reservations ADD FOREIGN KEY (RoomType) REFERENCES RoomTypes(RoomType);
  1. Each Room can refer to only one Room Type.
ALTER TABLE Rooms ADD FOREIGN KEY (RoomType) REFERENCES RoomTypes(RoomType);

If we now run dolt diff we can get a clear summary of how the database's schema has evolved

diff --dolt a/Customers b/Customers
--- a/Customers @ 5mqhsekicqlokl5nccg6ka5e31dqnefr
+++ b/Customers @ 837ljfufbj23m9clqlue5goi9ql321td
 CREATE TABLE `Customers` (
   `CustomerId` int NOT NULL AUTO_INCREMENT,
   `GuestId` int NOT NULL,
   `CreditCardNumber` varchar(20),
   `LicenseNumber` varchar(20),
-  PRIMARY KEY (`CustomerId`)
+  PRIMARY KEY (`CustomerId`),
+  UNIQUE KEY `CustomerId` (`CustomerId`),
+  KEY `GuestId` (`GuestId`),
+  CONSTRAINT `stj5c9cp` FOREIGN KEY (`GuestId`) REFERENCES `Guests` (`GuestId`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;
diff --dolt a/GuestReservations b/GuestReservations
added table
+CREATE TABLE `GuestReservations` (
+  `ReservationId` int NOT NULL,
+  `GuestId` int NOT NULL,
+  PRIMARY KEY (`ReservationId`,`GuestId`),
+  KEY `GuestId` (`GuestId`),
+  KEY `ReservationId` (`ReservationId`),
+  CONSTRAINT `gaep9bgm` FOREIGN KEY (`GuestId`) REFERENCES `Guests` (`GuestId`),
+  CONSTRAINT `hl54r6fq` FOREIGN KEY (`ReservationId`) REFERENCES `Reservations` (`ReservationId`)
+) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;
diff --dolt a/Guests b/Guests
--- a/Guests @ l1ompvp8lm5t5gel8un16qi0jc3lf4kh
+++ b/Guests @ 6suupre2p74p76p20p7k9fpa7d6gj3rt
 CREATE TABLE `Guests` (
   `GuestId` int NOT NULL AUTO_INCREMENT,
   `FirstName` varchar(50) NOT NULL,
   `LastName` varchar(50) NOT NULL,
   `Address` varchar(100) NOT NULL,
   `PhoneNumber` varchar(20),
-  PRIMARY KEY (`GuestId`)
+  PRIMARY KEY (`GuestId`),
+  UNIQUE KEY `GuestId` (`GuestId`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;
diff --dolt a/Reservations b/Reservations
--- a/Reservations @ 28ttobdgvef9o6179vd3mf5pbn3jitdl
+++ b/Reservations @ laddu17pt3ct2to748bn8m88a2uhsah1
 CREATE TABLE `Reservations` (
   `ReservationId` int NOT NULL AUTO_INCREMENT,
   `CustomerId` int,
   `RoomType` int NOT NULL,
   `Floor` int NOT NULL,
   `RoomNumber` int NOT NULL,
   `ArrivalDate` datetime,
   `DepartureDate` datetime,
   `CheckInDate` datetime,
   `IsConfirmed` tinyint,
-  PRIMARY KEY (`ReservationId`)
+  PRIMARY KEY (`ReservationId`),
+  KEY `CustomerId` (`CustomerId`),
+  KEY `FloorRoomNumber` (`Floor`,`RoomNumber`),
+  UNIQUE KEY `ReservationId` (`ReservationId`),
+  KEY `RoomType` (`RoomType`),
+  CONSTRAINT `9028rqts` FOREIGN KEY (`CustomerId`) REFERENCES `Customers` (`CustomerId`),
+  CONSTRAINT `s0sder9q` FOREIGN KEY (`Floor`,`RoomNumber`) REFERENCES `Rooms` (`Floor`,`RoomNumber`),
+  CONSTRAINT `unt7dnpj` FOREIGN KEY (`RoomType`) REFERENCES `RoomTypes` (`RoomType`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;
diff --dolt a/RoomTypes b/RoomTypes
--- a/RoomTypes @ 15pn0vldri64dedtjfn5ogdtt3up58ra
+++ b/RoomTypes @ 2sudcu5uhmqtk4cvcdq6r1or7mnf4rj6
 CREATE TABLE `RoomTypes` (
   `RoomType` int NOT NULL AUTO_INCREMENT,
   `Description` varchar(100),
-  PRIMARY KEY (`RoomType`)
+  PRIMARY KEY (`RoomType`),
+  UNIQUE KEY `RoomType` (`RoomType`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;
diff --dolt a/Rooms b/Rooms
--- a/Rooms @ 8pbbcjupduukcbb5mgr5dubpu93c7pno
+++ b/Rooms @ dadqn80bjri8mjufsj4atn7425jgt0k1
 CREATE TABLE `Rooms` (
   `Floor` int NOT NULL,
   `RoomNumber` int NOT NULL,
   `RoomType` int,
   `Price` decimal(10,2),
-  PRIMARY KEY (`Floor`,`RoomNumber`)
+  PRIMARY KEY (`Floor`,`RoomNumber`),
+  UNIQUE KEY `FloorRoomNumber` (`Floor`,`RoomNumber`),
+  KEY `RoomType` (`RoomType`),
+  CONSTRAINT `7k777s4a` FOREIGN KEY (`RoomType`) REFERENCES `RoomTypes` (`RoomType`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_bin;

Let's commit again to mark our progress.

dolt add .
dolt commit -m "Add Foreign Key Relationships"

part 2

Step 5: Determine Business Rules

Business rules represent essential assumptions or conditions a business operates with. For example, a hotel business would never set the price of a room to be less than 0. We can encode different types of business rules with SQL unique, check, not-null, and foreign key constraints. Given our above ER diagram, what are some business rules that we would want to enforce in our database?

  • There are max 20 floors in the hotel and 30 rooms per floor.
ALTER TABLE Reservations ADD CONSTRAINT CHECK (Floor >= 1 AND Floor <= 20);
ALTER TABLE Reservations ADD CONSTRAINT CHECK (RoomNumber >= 1 AND RoomNumber <= 30);
  • The price of a room must be greater than 0.
ALTER TABLE Rooms ADD CONSTRAINT CHECK (Price >= 0);
  • Nobody wants Room 13 on Floor 13
ALTER TABLE Reservations ADD CONSTRAINT CHECK (Floor != 13 AND RoomNumber != 13);

Finally, we can commit and save our newly added check constraints.

dolt add .
dolt commit -m "Added business rules with check constraints"

Conclusion

Hopefully, this lesson helped you with thinking through how to design your first relational database. As you become more of a database expert you'll start to learn about advanced concepts such as schema normalization, and secondary indexes. As you build out and iterate on your schema, you'll find that Dolt's versioning and diff properties make it easy to track your schema changes over time. If you ever make any mistakes you can always rollback your changes accordingly. If you want to learn more about databases, we highly recommend participating in our data bounties and joining our Discord.

Note: We want to give special thanks to Rod Stephens and his book Database Design Solutions for the example and some steps above. I highly recommend anyone working closely with databases give this book a read.

SHARE

JOIN THE DATA EVOLUTION

Get started with Dolt

Or join our mailing list to get product updates.