Transferring Data In and Out of Air-Gapped Networks

A high-security network can have what is called an "air gap". An air-gapped network is a network with no physical connections to other networks. To get data in and out of an air-gapped network, someone must physically carry the data across the air gap, connect to the network, and perform a data transfer or synchronization operation.

Anyone who has worked with an air-gapped network knows that, while secure, it is a pain to maintain and operate. Any time you want to get software or data in or out, someone must physically cross the gap. If multiple people are updating things in the network, it's hard to know whether their updates conflict. Moreover, the network is air-gapped because the software and data inside are sensitive, so you don't want to maintain much information about the network outside the air gap.

Pictured below is a highly skilled expert crossing an air gap. Seems difficult.

Mt. Baker Road Gap

Fortunately, we have new decentralized version control tools like Git for files and Dolt for databases that can help get data in and out of air-gapped networks. This blog will explain what an air-gapped network is, where it is generally deployed, and some tools to make managing the software and data inside it easier.

What is an Air-Gapped Network?

An air-gapped computer or network is one that has no network interfaces, either wired or wireless, connected to outside networks. Pictured below, you have a standard internet-accessible network on the left. On the right, you have an air-gapped network. Note that it has no internet connectivity and relies on enhanced physical security to control data moving in or out.

Air Gapped Network Topology

Obviously, in an air-gapped network you must remove Wi-Fi and any other wireless connectivity from every device on the network. Most hacks of air-gapped environments involve getting data out via some sort of over-the-air means.

Remove Wifi

To move data between the outside world and the air-gapped system, it is necessary to write data to a physical medium such as a laptop or removable drive, and physically move the data in and out of the network. The less sophisticated the device used to transfer the data, the higher the security. In some high-security air-gapped settings, data transfer via any electronic medium is prohibited. Only manual data entry is permitted in the air-gapped network. This is usually complemented by enhanced physical security that prevents unwanted electronics from crossing the air gap.
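
When files do travel on removable media, one common precaution is to carry a checksum manifest with them and verify it on arrival. The following is a minimal sketch, not a vetted procedure. It assumes GNU coreutils (`sha256sum`, `sort -z`), and the `outgoing` and `media` directories are hypothetical stand-ins for a staging folder and a mounted drive.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical paths: "outgoing" stands in for the staging directory on the
# connected side, and "media" stands in for the mounted removable drive.
workdir=$(mktemp -d)
mkdir -p "$workdir/outgoing" "$workdir/media"
echo "example payload" > "$workdir/outgoing/update.cfg"

# Outside the air gap: record a SHA-256 checksum for every file being sent.
(cd "$workdir/outgoing" && find . -type f -print0 | sort -z | xargs -0 sha256sum) \
    > "$workdir/MANIFEST.sha256"

# Copy the files and the manifest onto the removable media.
cp -R "$workdir/outgoing/." "$workdir/media/"
cp "$workdir/MANIFEST.sha256" "$workdir/media/MANIFEST.sha256"

# Inside the air gap: recompute the checksums and compare them to the manifest.
if (cd "$workdir/media" && sha256sum --check --quiet MANIFEST.sha256); then
    verify_status="verified"
else
    verify_status="mismatch"
fi
echo "Transfer $verify_status"
```

On systems without `sha256sum`, `shasum -a 256` is a likely substitute. A manifest only detects accidental corruption or tampering after it was written, so it would need to be protected as carefully as the data itself.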

Physical Security

At the network layer, a unidirectional gateway or data diode can be used to allow one-way data transfer, either in or out. A network designed this way has a one-way air gap.

Benefits

The main benefit of an air-gapped network is security. The threat profile of an air-gapped network is much smaller and simpler than that of a network connected to the internet. The network is more observable because all traffic originates inside the network itself. An air-gapped network is also far simpler. Fewer hardware and software components are required to operate it.

Disadvantages

The main disadvantages of an air-gapped network are reduced operability and convenience. Software updates in particular can be very difficult. Some security professionals have argued that the security gained from an air gap is not worth the inability to keep software up to date. As mentioned, getting software or data in and out of the network is intentionally difficult. Coordinating multiple updates and figuring out who has changed what can be a challenge.

Where are Air-Gapped Networks Used?

Military/governmental systems

In my professional experience, government and military are the largest users of air-gapped networks. The government runs many systems of national security interest. It is common for these systems to be physically inaccessible from other networks.

Military Network

Financial systems

Financial systems are often on air-gapped networks. If the financial system is not highly transactional and contains sensitive information, the enhanced security provided by an air-gapped network can be useful. It is also common to have an air-gapped backup of important financial system data for disaster recovery.

Financial Systems

Industrial Control systems

Industrial supervisory control and data acquisition (SCADA) systems are often air-gapped. These systems control large, potentially dangerous manufacturing equipment. For instance, the computer systems that control an oil and gas refinery would be air-gapped for safety. Industrial systems generally do not require internet access because the industrial setting is a closed environment.

Industrial Controls

Lottery machines

National and state lottery machines or random number generators are often required to be completely isolated from networks to prevent lottery fraud.

Lottery Systems

Life-critical systems

Systems where malfunction can cause loss of life are often air-gapped. Such systems include controls of nuclear power plants, air traffic control systems, and computerized medical equipment.

Nuclear Plant

Very simple systems

Finally, very simple systems like a home thermostat or washing machine are often air-gapped. This is not done for security. It is a byproduct of design simplicity and cost.

Getting Data In and Out

Getting data in and out of air-gapped networks is challenging by design. For this reason, there are a number of tools and technologies to help get data in and out of air-gapped networks securely. Some examples are data diodes like Owl Defense, thin clients like Forcepoint, and secure file transfer like 4sft. These technologies are military focused, very specific, and closed source.

We have a better way!

We here at DoltHub created the world's first version controlled database, called Dolt. Think Git and MySQL had a baby. A number of potential customers have inquired about how to use Dolt to move data in or out of air-gapped networks. After discussing this use case with these potential customers, we think the Git model can be especially useful for transferring data in and out of air-gapped networks.

Git

Git can quickly compute the differences (i.e., the diff) between two versions of a set of files. This diff functionality is incredibly useful when transferring data into or out of an air-gapped network. Outside the air-gapped network, you prepare a view of what the file structure should look like. Once you get inside the air-gapped network, you compare what you brought to what is already there. If you are satisfied with the result, you transfer the files. If not, you back out the change. This is especially useful for software configuration files.

Git has other useful functionality that can be leveraged in air-gapped network data transfer. Git is a single open source program, so it should be easy to get security approval for it on an air-gapped system. Git identifies every commit by an immutable hash of its contents, so a hash can be used as a way to summarize the contents of a directory. Git has branches and merges, so multiple versions of the system can be maintained in parallel.
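
As a rough illustration of using a hash as a directory summary, the sketch below commits identical files in two separate repositories and compares their tree hashes with `git rev-parse 'HEAD^{tree}'`. The `outside` and `inside` directories and the sample config file are hypothetical. Note that the commit hash also covers the message, author, and timestamp, so the tree hash is the one to compare when you only care about file contents.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical directories standing in for the copies outside and inside
# the air gap. Both receive identical file contents.
workdir=$(mktemp -d)
for side in outside inside; do
    mkdir -p "$workdir/$side/config"
    printf 'port=8080\n' > "$workdir/$side/config/app.conf"
    git -C "$workdir/$side" init --quiet
    git -C "$workdir/$side" add .
    # Identity is passed inline so the sketch does not depend on global config.
    git -C "$workdir/$side" -c user.name="Example" -c user.email="example@example.com" \
        commit --quiet -m "Snapshot taken $side the air gap"
done

# The tree hash depends only on file names, modes, and contents, so it
# matches on both sides even though the commits themselves differ.
outside_tree=$(git -C "$workdir/outside" rev-parse 'HEAD^{tree}')
inside_tree=$(git -C "$workdir/inside" rev-parse 'HEAD^{tree}')

if [ "$outside_tree" = "$inside_tree" ]; then
    echo "Contents match: $outside_tree"
else
    echo "Contents differ"
fi
```

With this approach, only the short hash string needs to be compared across the gap to confirm that both sides hold the same files.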

Let's look at an example. First, I'll make the directory I want to version in Git and initialize a Git repository.

$ mkdir airgap
$ cd airgap 
$ git init
Initialized empty Git repository in /Users/timsehn/dolthub/git/airgap/.git/

Then, I'll make a simple bash script to generate 100 random files.

$ cat generate.sh  
#!/bin/bash

set -e

echo "Generating test folders"

mkdir -p ./parent_{0..9}/child_{0..9}
for file in ./parent_{0..9}/child_{0..9}/test.txt; do
    head -c 100 /dev/urandom > "$file"
done

I run the script and commit all the output to Git.

$ bash ./generate.sh
Generating test folders
$ git add .
$ git commit -am "Added a bunch of files"
[main (root-commit) 2697508] Added a bunch of files
 101 files changed, 95 insertions(+)
 create mode 100644 generate.sh
 create mode 100644 parent_0/child_0/test.txt
 create mode 100644 parent_0/child_1/test.txt
 create mode 100644 parent_0/child_2/test.txt
 ...
 ...
 create mode 100644 parent_9/child_7/test.txt
 create mode 100644 parent_9/child_8/test.txt
 create mode 100644 parent_9/child_9/test.txt

Finally, I pick a random file and modify it so Git can tell me what has changed. This simulates a new file being transferred across the air gap.

$ echo "Look what I did" > parent_8/child_4/test.txt
$ git diff
diff --git a/parent_8/child_4/test.txt b/parent_8/child_4/test.txt
index ad698ad..54c28e0 100644
--- a/parent_8/child_4/test.txt
+++ b/parent_8/child_4/test.txt
@@ -1,3 +1 @@
-^Gޙ
-<D5>^Ma<D3>~<87>^\u<98>H<84>~<F1><DE>^^<85>^O<FD><9B>
-<8A>Ǐ^Y<F0>R<88><82><D1>9^L^F<D0>+a<C2><F2>^T<80><BE>~H+׏U<A8>גʃ%^V^P6<80><E1><<F3>ɰ}^_$<F0><A7>^T<FF><D2><C8>A]<D3>^C<E4><87><F9><F9><80>SjTk^^WW<CC>Y<AD>^\׎^^B<A1><ED>
\ No newline at end of file
+Look what I did

See how easy it is to find what's changed in a bunch of files. Imagine the same process when bringing in new configuration for software in an air-gapped network.
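
One way to physically carry a repository across the gap is `git bundle`, which packs history into a single file that Git can fetch from like a remote. The sketch below is one possible workflow under stated assumptions, not the only way to do it. The `outside`, `inside`, and `media` directories, the `git_as` helper, and the `incoming/main` review ref are all hypothetical names chosen for illustration.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical layout: "outside" is the connected repository, "inside" is the
# air-gapped copy, and "media" stands in for the removable drive.
workdir=$(mktemp -d)
mkdir -p "$workdir/media"
# Helper that supplies a commit identity without touching global config.
git_as() { git -c user.name="Example" -c user.email="example@example.com" "$@"; }

# Shared starting point: the air-gapped copy begins as a clone of the outside one.
git init --quiet "$workdir/outside"
git -C "$workdir/outside" symbolic-ref HEAD refs/heads/main
echo "timeout=30" > "$workdir/outside/app.conf"
git -C "$workdir/outside" add app.conf
git_as -C "$workdir/outside" commit --quiet -m "Initial configuration"
git clone --quiet "$workdir/outside" "$workdir/inside"
git -C "$workdir/inside" remote remove origin

# Outside the air gap: make a change and pack the history of main into one file.
echo "timeout=60" > "$workdir/outside/app.conf"
git_as -C "$workdir/outside" commit --quiet -am "Raise timeout"
git -C "$workdir/outside" bundle create "$workdir/media/update.bundle" main

# Inside the air gap: check the bundle, fetch it into a review ref, and diff.
git -C "$workdir/inside" bundle verify "$workdir/media/update.bundle"
git -C "$workdir/inside" fetch --quiet "$workdir/media/update.bundle" \
    main:refs/remotes/incoming/main
git -C "$workdir/inside" diff --stat main incoming/main

# Accept the change only as a fast-forward. To reject it, delete the review ref:
#   git -C "$workdir/inside" update-ref -d refs/remotes/incoming/main
git -C "$workdir/inside" merge --quiet --ff-only incoming/main
result=$(cat "$workdir/inside/app.conf")
echo "Inside now has: $result"
```

Fetching into a separate ref keeps the incoming change out of the working files until someone has reviewed the diff, which matches the compare-then-accept-or-back-out flow described above.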

Dolt

Dolt brings Git functionality to SQL database tables instead of files. Dolt allows you to find differences in large sets of structured data instead of files. Traditionally, structured data has been difficult to compare across an air gap but Dolt fixes this issue.

Let's look at an example. First, I'll make the directory I want to store my Dolt database in and initialize it.

$ mkdir airgap
$ cd airgap 
$ dolt init  
Successfully initialized dolt data repository.

Then, I'll create a table and seed it with 10,000 rows.

$ dolt sql -q "create table airgap (
    id int primary key auto_increment, 
    random_text varchar(100))"

To make a random string, I use this funky SQL.

$ dolt sql -q "insert into airgap(random_text) select left(md5(rand()), 30)" 
Query OK, 1 row affected (0.00 sec)
$ dolt sql -q "select * from airgap"
+----+--------------------------------+
| id | random_text                    |
+----+--------------------------------+
| 1  | 224cca58ce43d912bcad83c865436f |
+----+--------------------------------+

And I repeat it 10,000 times.

$ for i in {1..10000}
for> do
for> dolt sql -q "insert into airgap(random_text) select left(md5(rand()), 30)"
for> done
Query OK, 1 row affected (0.00 sec)
Query OK, 1 row affected (0.00 sec)
Query OK, 1 row affected (0.00 sec)
...
...
...
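
Starting a new `dolt` process for every row works but is slow. A possible alternative, sketched here, writes all the statements to a file and feeds them to a single `dolt sql` invocation over standard input. The temporary file name is arbitrary, and the guard simply skips the load when the script is not run inside a Dolt database directory.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical scratch file that holds the generated statements.
seed_file=$(mktemp)

# Write all 10,000 INSERT statements to one file instead of starting
# a new dolt process for each row.
for _ in {1..10000}; do
    echo "insert into airgap(random_text) select left(md5(rand()), 30);"
done > "$seed_file"

# Run the whole batch in one invocation; dolt sql reads statements from stdin.
# The guard lets the script run on a machine without Dolt or outside a database.
if command -v dolt >/dev/null 2>&1 && [ -d .dolt ]; then
    dolt sql < "$seed_file"
else
    echo "Skipping load: run this inside the Dolt database directory"
fi

statement_count=$(wc -l < "$seed_file" | tr -d ' ')
echo "Generated $statement_count statements"
```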

Now, I make a Dolt commit so I can refer back to this point later.

$ dolt add .
$ dolt commit -am "Created table and seeded it with random data"

I randomly change one of the strings, simulating new data coming across the air gap. This could be generated from a CSV load, a script, or any other database update method.

$ dolt sql -q 'update airgap set random_text="Look what I did" where id=ceiling(rand()*10000)'
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
$ dolt diff
diff --dolt a/airgap b/airgap
--- a/airgap @ q19i1spcs4d60de2rept4vu5rm9p4mkk
+++ b/airgap @ s0qb4ktjlctppapedls61svl8ejgah0e
+---+------+--------------------------------+
|   | id   | random_text                    |
+---+------+--------------------------------+
| < | 7292 | 3b62156b16389339344a7882fee318 |
| > | 7292 | Look what I did                |
+---+------+--------------------------------+

Dolt quickly and easily tells me what row changed. Dolt has a custom storage engine so fast diff scales to hundreds of millions of rows.
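
The same accept-or-back-out decision described for Git can be applied to a table. Below is a minimal sketch that wraps it in a shell function. `review_table_change` is a hypothetical helper written for this post, not part of Dolt. It relies only on `dolt diff`, `dolt add`, `dolt commit`, and `dolt checkout <table>`, and it assumes the changes being reviewed are uncommitted.

```shell
#!/bin/bash
set -euo pipefail

# review_table_change is a hypothetical helper, not part of Dolt.
# Usage: review_table_change <table> accept|reject
review_table_change() {
    local table="$1" decision="$2"

    # Validate the decision before touching the database.
    case "$decision" in
        accept|reject) ;;
        *)
            echo "decision must be accept or reject" >&2
            return 1
            ;;
    esac

    # Show the uncommitted row-level changes for the table.
    dolt diff "$table"

    if [ "$decision" = "accept" ]; then
        # Record the reviewed change as a new commit.
        dolt add "$table"
        dolt commit -m "Accept reviewed changes to $table"
    else
        # Restore the table to its last committed state.
        dolt checkout "$table"
    fi
}

# Guarded example call: only runs inside a Dolt database directory.
if command -v dolt >/dev/null 2>&1 && [ -d .dolt ]; then
    review_table_change airgap reject
fi
```

In practice the decision would come from a person reading the diff, so the function takes it as an argument rather than trying to decide automatically.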

Conclusion

Air-gapped networks are secure but hard to maintain. As you can see, Git and Dolt provide a great way to compare data in files or tables when you bring it into an air-gapped network. Inspired? Come by our Discord for help getting started.
