A plugin architecture using Golang interface extension


We're building Dolt, the world's first version-controlled SQL database. Dolt's SQL engine is go-mysql-server, which started its life as an independent open source project. Although our primary development effort on go-mysql-server is to support Dolt, it remains a generic SQL execution engine that you can use to run SQL queries on any data store by implementing a handful of golang interfaces.

In this blog post, we'll discuss a design pattern that makes it possible for integrators with go-mysql-server to implement a small subset of the interfaces and still get a lot of value out of the engine.

Lots of interfaces

To start with, why do we care how many interfaces integrators need to deal with? There are two answers to this question.

First, by making go-mysql-server easy to use and useful to as many different projects as possible, we hope to attract better bug reports and more open source contributions. This makes the engine better, which makes Dolt better.

Second, there are a lot of interfaces defined by the engine.

% grep interface core.go | wc -l
103

To be fair, some of these are used internally only, but most of them are ones we expect engine integrators to implement. To start, everyone integrating with the engine needs to define a DatabaseProvider which provides Databases, a Database which provides Tables, and a Table which provides RowIters. These interfaces look like this:

type DatabaseProvider interface {
	Database(ctx *Context, name string) (Database, error)
	HasDatabase(ctx *Context, name string) bool
	AllDatabases(ctx *Context) []Database
}

type Database interface {
	Nameable
	GetTableInsensitive(ctx *Context, tblName string) (Table, bool, error)
	GetTableNames(ctx *Context) ([]string, error)
}

type Table interface {
	Nameable
	String() string
	Schema() Schema
	Partitions(*Context) (PartitionIter, error)
	PartitionRows(*Context, Partition) (RowIter, error)
}

Implement these 3 interfaces with 11 total methods for your data source, and you can start querying it with SQL, including through a MySQL wire-compatible server. Not a bad deal at all.

So where do the other 100 interfaces come into play?

A plug-in architecture using Golang interface extension

It's easy to get started writing a backend to the engine with the above interfaces, but it will have very limited capabilities. Among the things it won't be able to do that a full-featured backend can:

  • CREATE TABLE, ALTER TABLE, and other DDL statements
  • INSERT, UPDATE, DELETE
  • AUTO_INCREMENT columns
  • Indexed lookups or joins
  • FOREIGN KEY or CHECK constraints
  • Triggers
  • Views
  • Custom functions
  • Stored procedures

But don't worry! You don't have to bite off all this work in one go. All the functionality is unlocked in a modular fashion, in pretty small bites.

For example, if you want your tables to be able to accept INSERT statements, you can implement these 3 interfaces:

type InsertableTable interface {
	Table
	Inserter(*Context) RowInserter
}

type RowInserter interface {
	TableEditor
	Insert(*Context, Row) error
	Closer
}

type TableEditor interface {
	StatementBegin(ctx *Context)
	DiscardChanges(ctx *Context, errorEncountered error) error
	StatementComplete(ctx *Context) error
}

Note that this design is coherent: it's not possible to implement InsertableTable without also supplying an implementation for a RowInserter, which needs the methods from TableEditor.

Similarly, if you want to add index lookup capabilities to your tables, you implement this interface:

type IndexedTable interface {
	IndexAddressableTable
	GetIndexes(ctx *Context) ([]Index, error)
}

type IndexAddressableTable interface {
	Table
	IndexAddressable
}

type IndexAddressable interface {
	WithIndexLookup(IndexLookup) Table
}

Note that here you have two choices: you can declare a table that supports native indexes with IndexedTable, or you can defer to some external index provider and only implement IndexAddressableTable.

That's the pattern. If you want to unlock more functionality in the engine, simply implement additional interfaces. Each one is a small, manageable chunk of work so you can develop your backend iteratively.

Interface extension in Golang

How does this all work on the engine side?

Like many strongly typed languages, Go allows you to assert that an object has a different runtime interface than its static declaration. Go actually provides two (or three) different ways to do this. For this example, we'll be checking if a sql.Table supports sql.InsertableTable.

You can make a naked type assertion, like this:

func getInsertableTable(t sql.Table) (sql.InsertableTable, error) {
	it := t.(sql.InsertableTable)
	return it, nil
}

This will panic if the assertion fails, which makes it unsuitable for this pattern. We want to allow graceful recovery in the case that the integrator hasn't implemented the extension. So instead, we can use the second form of type assertion:

func getInsertableTable(t sql.Table) (sql.InsertableTable, error) {
	it, ok := t.(sql.InsertableTable)
	if !ok {
		return nil, ErrInsertIntoNotSupported.New()
	}
	return it, nil
}

Finally, for our use case there are many possible sub-interfaces we need to handle, so we often write this logic as a type switch, like so:

func getInsertableTable(t sql.Table) (sql.InsertableTable, error) {
	switch t := t.(type) {
	case sql.InsertableTable:
		return t, nil
	case sql.TableWrapper:
		return getInsertableTable(t.Underlying())
	default:
		return nil, ErrInsertIntoNotSupported.New()
	}
}

As you can see above, when you try to run an INSERT query on a table implementation that doesn't support it, your query will fail with the message "table doesn't support INSERT INTO".

We use this pattern extensively throughout the SQL engine, which lets integrators provide as much functionality as they need to and no more.

Advantages of interface extension in a plugin architecture

Astute readers are probably asking: why not just combine all the interfaces together and then let integrators throw errors for the unimplemented methods? Something like this:

type Table interface {
	Nameable
	String() string
	Schema() Schema
	Partitions(*Context) (PartitionIter, error)
	PartitionRows(*Context, Partition) (RowIter, error)
	// InsertableTable methods
	Inserter(*Context) RowInserter // just return nil if you don't support this
	// DeletableTable, UpdateableTable, etc. methods below...
}

The most concrete reason not to do things this way is that as you develop the framework over time, you'll constantly break any existing integrators, since they will no longer satisfy an interface when you add methods to it. It would also require us to provide some sort of "not implemented" semantics on all these methods (like a return parameter or a special error type), rather than letting the language's type system do this for us.

As an open source project, we can't control who takes a dependency on us or enforce that they keep it up to date as we change it. We have made breaking changes in the past, but we try to do so very sparingly, definitely not every time we add a new feature to the engine. For that use case, we almost always define a new interface.

The other arguments for interface extension mostly boil down to aesthetics: it's gross to have to provide error implementations for so many methods. This isn't a major consideration for us (all of our code is gross), but people do care about this. A lot. It bothered Java developers enough that they got Oracle to introduce default method implementations for interfaces.

Interface extension in other languages

Golang isn't the only language where this pattern is possible -- far from it. Here's how it looks in Java:

public interface Table {
	public Schema schema();
	...
}

public interface InsertableTable extends Table {
	public RowInserter inserter(Context ctx);
}

...

try {
	InsertableTable it = (InsertableTable) table;
} catch (ClassCastException e) {
	throw new UnsupportedException("InsertableTable");
}

In addition to statically typed languages like Java and Go, this design pattern is even easier in dynamically typed languages like JavaScript and Python -- you just don't get any help from the compiler to enforce correctness.

Performance implications

Interface assertions in golang are relatively cheap, but certainly not free. We were experimenting with a variation on this pattern when testing out a new row storage format to see how it would impact performance, and were surprised to see runtime.assertI2I taking up a full 13% of our CPU time:

[CPU profile flame graph, dominated by runtime.assertI2I]

What's going on here? Basically, we were doing something like this:

type RowIter interface {
	Next(ctx *Context) (Row, error)
	Closer
}

type RowIter2 interface {
	RowIter
	Next2(ctx *Context, frame *RowFrame) error
}

...

func (i *TableRowIter) Next2(ctx *sql.Context, frame *sql.RowFrame) error {
	i2 := i.childIter.(sql.RowIter2)
	return i2.Next2(ctx, frame)
}

It's not easy to tell without understanding the structure of the overall row iteration code, but the call to assert sql.RowIter2 effectively takes place inside a tight loop. For every row fetched, we were performing this interface conversion, which, incredibly, is more expensive than going to disk and deserializing a row (at least once the OS caches the page in memory).

The reason this call is expensive gets into the internals of the golang runtime. If you're interested, PlanetScale did a great write-up of how the same performance penalty can apply to golang generics.

The bottom line: if you use this pattern, don't do it in a tight loop where performance is critical.

Conclusion

The interface extension design pattern is well suited for plugin architectures like the one used by go-mysql-server. It's a great way to make it easier for people to integrate with your project, and is flexible and easy to implement.

Questions? Comments? Thoughts about SQL engines or databases? Come talk to us on Discord.
