1. REFERENCE
    11 min read

    So you want Database Versioning?

    Tim Sehn

    Here at DoltHub, we've had a lot of success with our "So you want..." series of blog posts helping people find Dolt when they are looking for it. Dolt is a lot of things. Dolt is a version controlled database, a Git database, Git for data, data versi...

    Read More
  1. USE CASE
    6 min read

    Dolt Use Cases

    Dolt is Git for data. Instead of versioning files, Dolt versions tables. DoltHub is a place on the internet to share Dolt repositories. As far as we can tell, Dolt is the only database with branches. How would you use such a thing? One of the hard t...

    Read More
  2. DATASET
    15 min read

    Who's at Risk of COVID-19 in the US Congress?

    Overview: In this blog post, we discuss an approach for simulating an outbreak of COVID-19 in the US Congress. This is a long technical article about data sets, epidemiology, and simulation. Feel free to jump straight to the results of the simulatio...

    Read More
  3. REFERENCE, WEB
    9 min read

    How We Built DoltHub: Front-End Architecture

    In the previous article in this series, we took a deep look at the overall system architecture of DoltHub, the online data community powered by the Dolt version-controlled database. In this article, we'll zoom in on the front end and see how the code...

    Read More
  4. 5 min read

    Testing Dolt using Bats

    We adopted Bash Automated Testing System (Bats) to test the Dolt command-line. As of March 10, 2020 we are up to 473 tests, though 55 are skipped because they currently fail. The tests define desired behavior so we're constantly working to get skippe...

    Read More
  5. FEATURE RELEASE, SQL
    6 min read

    Querying Historical Data with AS OF Queries

    Dolt is Git for data. It's a SQL database that lets you branch, merge, and fork your data just like you would a Git repository. In previous blog posts we announced how you can use special system tables to query the history of your database. Today, we...

    Read More
  6. DATASET
    5 min read

    Novel Coronavirus Dataset in Dolt: A Case for Branches

    Here at DoltHub, we've been working on COVID-19 data since February 5, 2020. First, we started importing Johns Hopkins data and then we worked on assembling the largest open, regularly-updated set of case details from Singapore, Hong Kong and South Ko...

    Read More
  7. DATASET
    4 min read

    Scraping a JavaScript-enabled Website in 2020

    As part of our effort to track data related to the Novel Coronavirus (COVID-19), we wanted to scrape a JavaScript-enabled website on Coronavirus from Hong Kong. Moreover, you'll notice that the website from Hong Kong uses lazy loading based on scroll...

    Read More
  8. DATASET
    6 min read

    Novel Coronavirus Dataset in Dolt: Case Details

    On Saturday, February 29, this transpired in our company chat room: [Image: Tim/Brian Google Chat snippet] A project was born. We had time series data for confirmed cases, deaths, and recoveries segmented by location sourced from Johns Hopkins but we did no...

    Read More
  9. REFERENCE, WEB
    6 min read

    How We Built DoltHub: Stack and Architecture

    In our introductory article for this series, we took a high-level look at the technology stack and architecture behind DoltHub, the online home for Dolt data repositories. In this article, we'll delve a little deeper and discuss how the pieces of the...

    Read More
  10. REFERENCE
    6 min read

    Optimizing Sorted Map Iteration

    In this blog post I want to give an introduction to some core concepts used to implement fast querying of databases. These techniques were implemented in Dolt and produced significant performance improvements. Database internals: The B-Tree is a cor...

    Read More
  11. REFERENCE
    10 min read

    So You Want Git for Data?

    Dolt is Git for Data. Learn about the options for versioning data catalogs, data pipeline version tools, and version controlled databases. The Dolt database versions data and schema with full audit history, diffs, and rollbacks.

    Read More
  12. DATASET
    7 min read

    Visualizing Temperature Changes Over Time

    In the first part of this two-part blog I covered NOAA's "Global Hourly Surface Data" dataset and how it is modeled in Dolt. Dolt is Git for data, and for this dataset we model a day of observations as a single commit in the commit graph. In this se...

    Read More
  13. DATASET
    6 min read

    NOAA Global Hourly Surface Data

    The National Oceanic and Atmospheric Administration, NOAA, publishes weather measurements taken from stations around the world. It started in 1901 with a handful of stations, and there are more than 35,000 stations today. Most of these stations provi...

    Read More
  14. FEATURE RELEASE
    7 min read

    Announcing Saved Queries

    Dolt is Git for data. We built Dolt to help teams collaborate on data sets using the forking, branching, and merging workflows that Git popularized. These workflows are what enable software engineers to collaborate on source code, and they're what en...

    Read More
  15. 4 min read

    Copyrightable Material

    In our previous blog post we examined some freely available licensing tools for open data from Creative Commons. To briefly recap, a license specifies the terms under which copyrightable material is made available for public access, sharply distinct f...

    Read More
  16. 3 min read

    Data Licensing

    Introduction: Dolt is a data format. DoltHub is a collaboration platform for data stored in the Dolt format. When sharing copyrighted content, the terms of that sharing are governed by a license. In this post we highlight some common licenses attached...

    Read More
  17. DATASET
    8 min read

    Novel Coronavirus Dataset in Dolt

    Johns Hopkins University Center for Systems Science and Engineering began collecting, tabulating, and publishing Novel Coronavirus (COVID-19) data on January 31, 2020. We started importing this dataset into Dolt on February 5, 2020. This blog will exp...

    Read More
  18. REFERENCE, WEB
    5 min read

    How We Built DoltHub: Introduction

    Towards the end of last month, we launched a totally reworked and redesigned version of DoltHub, our web application for hosting and collaborating on Dolt repositories. Now that we've had a little while to iron the kinks out, it seems like a good tim...

    Read More
  19. 3 min read

    Dolt and DoltHub Documentation

    Background: We are excited to announce the launch of our documentation site. The goal of Dolt and DoltHub is to enable developers and the data community with radically better data infrastructure. High quality documentation should empower users by all...

    Read More
  20. SQL
    10 min read

    Implementing indexed joins

    Happy Valentine's Day from all of us at DoltHub! You are the reason we do what we do! It you. In honor of the holiday, we want to talk about how much we love making queries faster. We're going to examine how our SQL engine makes a query plan and exp...

    Read More