Pushing down filters to make queries faster

Dolt is Git for Data, a SQL database you can branch, merge, clone, fork, sync, push and pull. Today we're excited to announce the release of a new optimization in the query planner: pushing down filters!

What's a pushdown?

Pushdown is a query optimization that moves predicates in the WHERE clause closer to the tables they refer to ("pushes them down to the tables") in order to reduce the number of rows that need to be examined. This is easiest to demonstrate with an example.

Let's say you have two tables that you want to join, with some filter conditions on each. We'll use states and cities for these examples.

pushdown> describe states;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| id         | int         | NO   | PRI |         |       |
| name       | varchar(80) | YES  |     |         |       |
| code       | varchar(2)  | YES  |     |         |       |
| population | int         | YES  |     |         |       |
+------------+-------------+------+-----+---------+-------+
pushdown> describe cities;
+------------+--------------+------+-----+---------+-------+
| Field      | Type         | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| id         | int          | NO   | PRI |         |       |
| name       | varchar(100) | YES  |     |         |       |
| state_id   | int          | YES  | MUL |         |       |
| population | int          | YES  |     |         |       |
+------------+--------------+------+-----+---------+-------+

For this example, we'll fill them with data from just a few states and cities:

pushdown> SELECT * FROM states;
+----+------------+------+------------+
| id | name       | code | population |
+----+------------+------+------------+
| 1  | Alabama    | AL   | 4903000    |
| 2  | Alaska     | AK   | 731545     |
| 3  | Arizona    | AZ   | 7279000    |
| 4  | California | CA   | 39510000   |
| 5  | Colorado   | CO   | 5759000    |
+----+------------+------+------------+
pushdown> SELECT * FROM cities;
+----+-----------------+----------+------------+
| id | name            | state_id | population |
+----+-----------------+----------+------------+
| 1  | Alabaster       | 1        | 30352      |
| 2  | Auburn          | 1        | 53380      |
| 3  | Birmingham      | 1        | 212237     |
| 4  | Anchorage       | 2        | 291826     |
| 5  | Akutan          | 2        | 1027       |
| 6  | Bethel          | 2        | 6080       |
| 7  | Avondale        | 3        | 76238      |
| 8  | Apache Junction | 3        | 35840      |
| 9  | Buckeye         | 3        | 50876      |
| 10 | Alturas         | 4        | 2827       |
| 11 | Anaheim         | 4        | 336265     |
| 12 | Bakersfield     | 4        | 347483     |
| 13 | Aurora          | 5        | 353108     |
| 14 | Aspen           | 5        | 6805       |
| 15 | Boulder         | 5        | 105112     |
+----+-----------------+----------+------------+

Now let's say you want to join these two tables to find all the cities starting with 'A' that are in states starting with 'A'. You would write a query like this:

SELECT c.name AS city_name, s.name AS state_name
    FROM cities c
    JOIN states s ON c.state_id = s.id
    WHERE c.name LIKE 'A%' AND s.name LIKE 'A%';

And you would get output like this:

+-----------------+------------+
| city_name       | state_name |
+-----------------+------------+
| Alabaster       | Alabama    |
| Auburn          | Alabama    |
| Anchorage       | Alaska     |
| Akutan          | Alaska     |
| Avondale        | Arizona    |
| Apache Junction | Arizona    |
+-----------------+------------+

So far so good! But let's consider the number of rows that our query has to examine to produce the answer. We can inspect the query plan the engine chose by prefixing the query with the EXPLAIN keyword. A naive query planner will produce something like this:

pushdown> EXPLAIN SELECT c.name AS city_name, s.name AS state_name
    FROM cities c
    JOIN states s ON c.state_id = s.id
    WHERE c.name LIKE 'A%' AND s.name LIKE 'A%';
+----------------------------------------------------+
| plan                                               |
+----------------------------------------------------+
| Project(c.name as city_name, s.name as state_name) |
|  └─ Filter(c.name LIKE "A%" AND s.name LIKE "A%")  |
|      └─ IndexedJoin(c.state_id = s.id)             |
|          ├─ TableAlias(c)                          |
|          │   └─ Table(cities)                      |
|          └─ TableAlias(s)                          |
|              └─ Table(states)                      |
+----------------------------------------------------+

So for this plan, we're examining every city in the cities table, looking up its state (via an index, great!). Then we take this list, the entire list of all the cities in the database joined to their states, and filter it down to just those that start with the letter A. There are about 20,000 cities and towns in the US, so this means we'll be reading 20K rows and then doing another 20K index lookups into the states table. That's not terrible, but we can do better.

Pushdown makes the filter happen before the join for each table. Here's the same query with pushdown analysis enabled:

+----------------------------------------------------+
| plan                                               |
+----------------------------------------------------+
| Project(c.name as city_name, s.name as state_name) |
|  └─ IndexedJoin(c.state_id = s.id)                 |
|      ├─ Filter(c.name LIKE "A%")                   |
|      │   └─ TableAlias(c)                          |
|      │       └─ Table(cities)                      |
|      └─ Filter(s.name LIKE "A%")                   |
|          └─ TableAlias(s)                          |
|              └─ Table(states)                      |
+----------------------------------------------------+

As you can see, the predicates in the filter node were pushed down below the join node, to right above their respective tables. Since all predicates were successfully pushed down, the entire wrapping filter can be safely discarded.
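
In relational-algebra terms, this rewrite is just the identity

    σ[pA ∧ pB](A ⋈ B) = σ[pA](A) ⋈ σ[pB](B)

where σ is a filter, ⋈ is a join, and pA and pB are predicates that mention only columns of A and B respectively. It holds for any inner join; as the caveats section below shows, it does not hold unconditionally for outer joins.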

This optimization has a number of desirable properties:

  1. We do fewer lookups into the secondary table in the join. Instead of doing a lookup for every city in the table, we only do a lookup for the ones that start with 'A'.
  2. We don't assemble a join row result if the predicate on the secondary table doesn't match. Without pushdown, we always assemble a join row, comprising the concatenation of the rows from each of the tables, for every row in the primary table. For the example above, this means we never assemble the thousands of join rows for cities that start with 'A' in states that start with another letter. These join rows, which would never appear in the final result set, are all memory that must be allocated and later garbage collected, which slows down the server.
  3. We can more efficiently execute in parallel on the two sides of the join. It isn't shown in the above plan, but Dolt is capable of running independent parts of the query in parallel to get faster results. The further down the plan we push this parallelization, the faster it makes the engine.

So originally, we were examining about 40K rows (20K for the cities table, then 20K index lookups into the states table). After pushdown, we've reduced this by a lot, specifically by the number of cities that don't begin with A. Let's make up a number: say that 10% of cities start with A. In that case, we've reduced the number of index lookups by 90%, but still have to scan all 20K cities rows. So 20K + 2K = 22K rows, a savings of 45%.

If we didn't have indexes to use for the join, we would get even bigger savings. Without an index we're reduced to a cross join, which means we're examining 20K × 50 rows, 1M rows total. Pushdown reduces this to (20K + 50) + (2K × 4), so 28,050 rows total: a savings of 97.2%. (To truly do only 4 lookups on the secondary table we need to load it entirely into memory during join analysis, which the engine will do for a table this small).

That's a great start. Can we do better?

Pushing down indexes

If we want this query to be faster, we'll need an index it can use. So let's create a new column for the first letter of the name in both cities and states, and populate them:

pushdown> alter table cities add column first_letter char(1);
pushdown> alter table states add column first_letter char(1);
pushdown> update cities set first_letter = substr(name, 1, 1);
Query OK, 15 rows affected
Rows matched: 15  Changed: 15  Warnings: 0
pushdown> select * from cities;
+----+-----------------+----------+------------+--------------+
| id | name            | state_id | population | first_letter |
+----+-----------------+----------+------------+--------------+
| 1  | Alabaster       | 1        | 30352      | A            |
| 2  | Auburn          | 1        | 53380      | A            |
| 3  | Birmingham      | 1        | 212237     | B            |
| 4  | Anchorage       | 2        | 291826     | A            |
| 5  | Akutan          | 2        | 1027       | A            |
| 6  | Bethel          | 2        | 6080       | B            |
| 7  | Avondale        | 3        | 76238      | A            |
| 8  | Apache Junction | 3        | 35840      | A            |
| 9  | Buckeye         | 3        | 50876      | B            |
| 10 | Alturas         | 4        | 2827       | A            |
| 11 | Anaheim         | 4        | 336265     | A            |
| 12 | Bakersfield     | 4        | 347483     | B            |
| 13 | Aurora          | 5        | 353108     | A            |
| 14 | Aspen           | 5        | 6805       | A            |
| 15 | Boulder         | 5        | 105112     | B            |
+----+-----------------+----------+------------+--------------+
pushdown> update states set first_letter = substr(name, 1, 1);
Query OK, 5 rows affected
Rows matched: 5  Changed: 5  Warnings: 0
pushdown> select * from states;
+----+------------+------+------------+--------------+
| id | name       | code | population | first_letter |
+----+------------+------+------------+--------------+
| 1  | Alabama    | AL   | 4903000    | A            |
| 2  | Alaska     | AK   | 731545     | A            |
| 3  | Arizona    | AZ   | 7279000    | A            |
| 4  | California | CA   | 39510000   | C            |
| 5  | Colorado   | CO   | 5759000    | C            |
+----+------------+------+------------+--------------+
pushdown> create index cfl on cities (first_letter);
pushdown> create index sfl on states (first_letter);

Now that we've indexed the main expression we care about, let's EXPLAIN the query plan again.

pushdown> EXPLAIN SELECT c.name AS city_name, s.name AS state_name
    FROM cities c
    JOIN states s ON c.state_id = s.id
    WHERE c.first_letter = 'A' AND s.first_letter = 'A';
+---------------------------------------------------------------------+
| plan                                                                |
+---------------------------------------------------------------------+
| Project(c.name AS city_name, s.name AS state_name)                  |
|  └─ IndexedJoin(c.state_id = s.id)                                  |
|      ├─ Filter(c.first_letter = "A")                                |
|      │   └─ TableAlias(c)                                           |
|      │       └─ Indexed table access on index [cities.first_letter] |
|      │           └─ Table(cities)                                   |
|      └─ Filter(s.first_letter = "A")                                |
|          └─ TableAlias(s)                                           |
|              └─ Table(states)                                       |
+---------------------------------------------------------------------+

This is even better! Now we're accessing the cities table using our index, which means we don't need to do a full table scan on it. That means that we now only need to examine 2K rows in cities, plus another 2K lookups into states. That's 4K rows examined total, down from 40K originally, or a savings of 90%.

Caveats: you can't always push down predicates

Careful readers might have noticed that only the filter on cities was converted into an indexed table access, not the one on states. This is true! The reason we can't push down an index onto the states table is that we're already using an index to access it as part of the join, and Dolt can't currently combine multiple index lookups on the same table. And even if we could, it would be unlikely to help in this case, since the two indexes don't share any columns. You could imagine creating a two-column index on (id, first_letter) and using that... but that's an optimization for another day.
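
If you want to try that idea out, the hypothetical index would look something like this (the index name is made up, and as noted above, today's planner won't take advantage of it):

pushdown> create index sifl on states (id, first_letter);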

And there are other caveats as well: we can't push down predicates onto the right-hand table in a LEFT JOIN or the left-hand table in a RIGHT JOIN. In those kinds of joins, every row from the LEFT or RIGHT table appears in the result, even if there is no matching row in the other table; when that happens, the secondary table has NULL for all of its columns in the join result. To get correct results, predicates on the secondary table can only be evaluated after these semantics have been applied, not before. Right now we are conservative and avoid pushing down any predicates to the secondary table in one of these joins.
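
To make this concrete, consider a query like this one against our sample data:

SELECT s.name AS state_name, c.name AS city_name
    FROM states s
    LEFT JOIN cities c ON c.state_id = s.id
    WHERE c.name LIKE 'Bo%';

Evaluated correctly, the WHERE clause runs after the outer join, so only the single matching row survives:

+------------+-----------+
| state_name | city_name |
+------------+-----------+
| Colorado   | Boulder   |
+------------+-----------+

If the engine instead pushed c.name LIKE 'Bo%' down below the join and discarded the outer filter, the four states with no city starting with 'Bo' would each pair with NULL instead of being filtered out, and we'd wrongly get five rows instead of one.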

There are lots of tricky subtleties to this operation. We wrote a lot of tests for it, and still found a large batch of bugs after the first iteration. You can't ever really have enough tests when you're writing a SQL query engine.

Implementing the analysis

Dolt's SQL engine is go-mysql-server, an open-source SQL engine we adopted earlier this year. Most of the interesting parts of it take place in a chunk of code called the Analyzer, which is responsible for producing correct and fast query execution plans from a basic parse tree. You can read more about how the Analyzer works in our blog post announcing subquery support.

Pushdown is implemented as a series of Analyzer functions. We do three passes over the tree, looking for Filter nodes that can be pushed down to their tables:

  1. Apply indexes to any tables that support them. This is obviously the fastest and most direct way to access the subset of data in the table needed by a query.
  2. Use the sql.FilteredTable interface to let the table filter itself, through whatever mechanism it chooses. Dolt tables don't support this, except for a couple of system tables where it's really important.
  3. Move the Filter node down to right above the table. This is almost always possible when the first two passes don't apply, and can often lead to big query savings.

After pushing down all possible Filter predicates, we remove all the ones we can from the top-level Filter node. If it's empty after this, we discard it entirely.

The actual details of these transformations are difficult and subtle, and you can read them yourself if you're interested. For this blog, I only want to call your attention to a new Transform method that I had to write to succinctly represent pushdown semantics. Here's the method that attempts to push down indexes onto tables in an execution tree:

// convertFiltersToIndexedAccess attempts to replace filter predicates with indexed accesses where possible
func convertFiltersToIndexedAccess(a *Analyzer, n sql.Node, scope *Scope, indexes indexLookupsByTable) (sql.Node, error) {
    childSelector := func(parent sql.Node, child sql.Node, childNum int) bool {
        switch parent.(type) {
        // For IndexedJoins, we already are using indexed access during query execution for the secondary table, so
        // replacing the secondary table with an indexed lookup will have no effect on the result of the join, but *will*
        // inappropriately remove the filter from the predicate.
        // TODO: the analyzer should combine these indexed lookups better
        case *plan.IndexedJoin:
            return childNum == 0
        // Left and right joins can push down indexes for the primary table, but not the secondary. See comment
        // on transformPushdownFilters
        case *plan.LeftJoin:
            return childNum == 0
        case *plan.RightJoin:
            return childNum == 1
        }
        return true
    }

    node, err := plan.TransformUpWithSelector(n, childSelector, func(node sql.Node) (sql.Node, error) {
        switch node := node.(type) {
        // TODO: some indexes, once pushed down, can be safely removed from the filter. But not all of them, as currently
        //  implemented -- some indexes return more values than strictly match.
        case *plan.TableAlias:
            table, err := pushdownIndexesToTable(a, node, indexes)
            if err != nil {
                return nil, err
            }
            return FixFieldIndexesForExpressions(table, scope)
        case *plan.ResolvedTable:
            table, err := pushdownIndexesToTable(a, node, indexes)
            if err != nil {
                return nil, err
            }
            return FixFieldIndexesForExpressions(table, scope)
        default:
            return FixFieldIndexesForExpressions(node, scope)
        }
    })

    if err != nil {
        return nil, err
    }

    return node, nil
}

Unlike other transformation methods, this new variant allows the caller to pass a child selector function that lets them choose which parts of the query plan to analyze. We use this functionality to avoid pushing indexes down to parts of the query plan where they might yield incorrect results, while still pushing them down elsewhere. This is how we apply indexed access to the left side of a LEFT JOIN but not the right, for example.

As part of this work, I realized that we really needed a better way to visualize index usage when examining query plans. To this end, I created a new kind of Node just to display this kind of performance note, a DecoratedNode. Here we apply one to a table during pushdown of an index so that it shows up in the EXPLAIN output:

var newTableNode sql.Node = tableNode

replacedTable := false
// If the table supports indexed access and analysis found a usable index
// lookup for it, swap in an indexed version of the table.
if it, ok := table.(sql.IndexAddressableTable); ok {
    indexLookup, ok := indexes[tableNode.Name()]
    if ok {
        table = it.WithIndexLookup(indexLookup.lookup)
        indexStrs := formatIndexDecoratorString(indexLookup.indexes...)

        indexNoun := "index"
        if len(indexStrs) > 1 {
            indexNoun = "indexes"
        }
        // Wrap the node in a DecoratedNode so the index usage is visible in EXPLAIN
        newTableNode = plan.NewDecoratedNode(
            fmt.Sprintf("Indexed table access on %s %s", indexNoun, strings.Join(indexStrs, ", ")),
            newTableNode)
        a.Log("table %q transformed with pushdown of index", tableNode.Name())

        replacedTable = true
    }
}

Conclusion

To run the example queries in this blog article yourself, you can clone the sample dataset from DoltHub here. You can also run the queries in this article right in your browser. Give it a try!

Dolt's query optimizer is still in its infancy. There is a long road ahead of us to become as fast as our customers need us to be, but we are making steady progress. The pushdown optimization will make a lot of queries much faster, and we're excited to get one step closer in performance to other databases.

If you haven't tried Dolt yet, give it a shot by installing Dolt or chatting with us on Discord. We're always happy to hear from new customers!
