Understanding Tags in Go

In Go, tags allow developers to attach metadata to struct fields. These tags can drive features and behaviors in various libraries and tools which access the tags via reflection. This article provides an overview of tags in Go, including their syntax, common use cases, and best practices. Then we will dive into how Hosted Dolt uses them to drive serialization of server configuration that is appropriate for the version of Dolt being run.

Anatomy of Tags

Go tags are attached to struct fields. A tag is a string literal, and the common convention is to structure it as a space-separated list of key:"value" pairs, so a single field can carry several keys, each with a string value. Tags are usually enclosed in backticks and are placed immediately after the field's type.

type Example struct {
	Field1 int    `tag1:"value1" tag2:"value2"`
	Field2 string `tag1:"value3,value4"`
}

Here, Field1 has two tags, tag1 and tag2, with values "value1" and "value2" respectively. Field2 has a single tag, tag1, with a value of "value3,value4".
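As a minimal sketch of how a library might read these tags, the program below uses the standard reflect package to look up keys on the Example struct. The tagValue helper is a hypothetical name introduced for illustration; StructTag.Lookup is the standard call that returns the value and whether the key was present.

```go
package main

import (
	"fmt"
	"reflect"
)

// Example mirrors the struct shown above.
type Example struct {
	Field1 int    `tag1:"value1" tag2:"value2"`
	Field2 string `tag1:"value3,value4"`
}

// tagValue is a hypothetical helper. It returns the value stored under key in
// the tag of the named field, and whether that key was present at all.
// v must be a struct or a pointer to a struct.
func tagValue(v any, fieldName, key string) (string, bool) {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	field, ok := t.FieldByName(fieldName)
	if !ok {
		return "", false
	}
	return field.Tag.Lookup(key)
}

func main() {
	t := reflect.TypeOf(Example{})
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		tag1, _ := f.Tag.Lookup("tag1")
		tag2, hasTag2 := f.Tag.Lookup("tag2")
		fmt.Printf("%s: tag1=%q tag2=%q (tag2 present: %t)\n", f.Name, tag1, tag2, hasTag2)
	}
}
```

Note that the comma in "value3,value4" has no meaning to the reflect package. Splitting the value into a list is left to whichever library consumes the tag.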

Common Use Cases

Typically, a tag's name will correspond with a single feature or behavior and often each field in the struct will have a tag with the same name. As an example, json, yaml, and db are common tag names used to specify how a field should be handled by a JSON encoder, YAML encoder, or database ORM respectively.

Serialization and Deserialization

One of the most common use cases for tags in Go is in serialization and deserialization tasks. Packages like encoding/json, encoding/xml, and gopkg.in/yaml.v2 use tags to map struct fields to data fields during encoding and decoding operations.

For example, consider a struct representing a person:

type Person struct {
    FirstName    string     `json:"first_name"`
    LastName     string     `json:"last_name"`
    DOB          *time.Time `json:"dob,omitempty"`
    Address      *string    `json:",omitempty"`
    Weight       int        `json:"-"`
    HeightInches int
}

In this example, the json tags specify how the exported struct fields should be serialized and deserialized. The encoding/json package defines how the values of json tags are interpreted (See here for details). In this case the value of the tag is a comma-separated list. The first list value is the name of the field when serialized as a JSON object. A name of - tells the serializer not to include this field in the serialized output. The omitempty option tells the JSON encoder to omit the field if it holds an empty value: false, 0, a nil pointer or interface, or an empty array, slice, map, or string. If a field does not specify a name, the struct field name is used as the default. Additional formatting options can be specified in the tag value.

Here is an example serializing an instance of the Person struct and outputting its value.

func main() {
	p := Person{
		FirstName:    "Peter",
		LastName:     "Griffin",
		Weight:       298,
		HeightInches: 67,
	}
	data, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}

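Which will output the following (pretty-printed here for readability; json.Marshal itself emits compact JSON on a single line):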

{
  "first_name": "Peter",
  "last_name": "Griffin",
  "HeightInches": 67
}

You can play with this more here. The details covered here are specifically for json serialization. Please see the documentation of the specific package you are using for details of supported tag features. (As an example, here is the documentation of the yaml tag format used by gopkg.in/yaml.v2).
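The same tags also drive decoding. The sketch below reuses the Person struct from above and feeds it a JSON document with json.Unmarshal. The decodePerson helper and the input values are assumptions made for illustration. Because Weight is tagged with "-", a "Weight" key in the input is ignored and the field keeps its zero value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Person is the struct from the serialization example above.
type Person struct {
	FirstName    string     `json:"first_name"`
	LastName     string     `json:"last_name"`
	DOB          *time.Time `json:"dob,omitempty"`
	Address      *string    `json:",omitempty"`
	Weight       int        `json:"-"`
	HeightInches int
}

// decodePerson is a hypothetical helper that parses a JSON document into a
// Person, mapping JSON keys to fields according to the json tags.
func decodePerson(data string) (Person, error) {
	var p Person
	err := json.Unmarshal([]byte(data), &p)
	return p, err
}

func main() {
	input := `{"first_name":"Lois","last_name":"Griffin","Weight":130,"HeightInches":65}`
	p, err := decodePerson(input)
	if err != nil {
		panic(err)
	}
	// Weight stays 0 because the "-" tag hides the field from the decoder.
	fmt.Println(p.FirstName, p.LastName, p.Weight, p.HeightInches)
}
```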

Database Mapping

ORMs (Object-Relational Mappers) and database libraries like GORM, dbr, and sqlx use tags to generate SQL queries and map database rows to struct fields.

type Product struct {
    ID       uint   `gorm:"primary_key"`
    Name     string `gorm:"size:255"`
    Price    float64
    Quantity int
}

In this example, the gorm tags specify that the ID field is the primary key, the Name field should be mapped to a column with a size of 255 characters, and the other fields should be mapped based on their names and types.
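To make the idea concrete without pulling in a real ORM, here is a rough sketch of how a database library could turn tags into a query. It uses a hypothetical db tag and the hypothetical helpers columns and selectQuery. This is not how GORM is implemented, only an illustration of the general technique.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Product uses a hypothetical `db` tag to name its columns.
type Product struct {
	ID       uint    `db:"id"`
	Name     string  `db:"product_name"`
	Price    float64 // no tag: falls back to the lowercased field name
	Quantity int     `db:"-"` // excluded from generated queries
}

// columns returns the column name for each field of v, which must be a
// struct or a pointer to a struct. Fields tagged "-" are skipped.
func columns(v any) []string {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	var cols []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name, ok := f.Tag.Lookup("db")
		switch {
		case name == "-":
			continue // continues the enclosing for loop
		case !ok || name == "":
			name = strings.ToLower(f.Name)
		}
		cols = append(cols, name)
	}
	return cols
}

// selectQuery builds a SELECT statement for the given table from the tags on v.
func selectQuery(table string, v any) string {
	return "SELECT " + strings.Join(columns(v), ", ") + " FROM " + table
}

func main() {
	fmt.Println(selectQuery("products", Product{}))
	// prints: SELECT id, product_name, price FROM products
}
```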

Validation

Tags can also be used for data validation. Validation libraries like validator can interpret tags to enforce validation rules on struct fields. For instance, you can specify constraints such as minimum and maximum values, required fields, and regular expressions using tags.

type User struct {
    Username string `validate:"required,min=5,max=20"`
    Email    string `validate:"required,email"`
    Age      int    `validate:"gte=18"`
}

Here, the validate tag defines rules such as the Username being required and having a length between 5 and 20 characters, the Email being required and in a valid email format, and the Age being greater than or equal to 18. We can then create a new instance of the validator and call Struct to validate the struct.

func main() {
	user := User{
		Username: "peter",
		Email:    "peter@fake.horse",
		Age:      17,
	}

	err := validator.New().Struct(user)
	if err != nil {
		fmt.Println("validation failed:", err)
	} else {
		fmt.Println("validation passed")
	}
}

Which will output:

validation failed: Key: 'User.Age' Error:Field validation for 'Age' failed on the 'gte' tag

You can play with this more here.

A Problem Solved with Custom Tags

Dolt uses a YAML configuration file to configure the dolt sql-server command. The parser for this file is strict: every field in the file must be known to Dolt, or the server fails to start. This ensures that every item an end user puts in the configuration file is actually used, so a user never silently gets different behavior than their configuration describes. Without strict parsing, that can happen when the version of Dolt being run is older than the documentation used to configure it. However, strict parsing caused issues for Hosted Dolt.

Hosted Dolt is a service that runs Dolt for users. Each Dolt cluster is deployed and updated independently. This means that across the Hosted Dolt fleet there can be many different versions of Dolt running, and the features and configuration options available can vary between them. When a user makes a configuration change on the Hosted Dolt website, the configuration file needs to be updated using the serialization format that is appropriate for the version of Dolt that is running.
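To see why a field written for a newer version is a problem under strict parsing, consider the sketch below. Dolt's parser works on YAML, but to keep the example to the standard library it uses the analogous strict mode of encoding/json, Decoder.DisallowUnknownFields. The oldConfig struct and the parseStrict helper are assumptions for illustration: oldConfig stands in for an older server that knows only one field.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// oldConfig is a hypothetical stand-in for the configuration understood by an
// older server version. It does not know about "system_variables".
type oldConfig struct {
	LogLevel string `json:"log_level"`
}

// parseStrict decodes input and rejects any key that oldConfig does not declare.
func parseStrict(input string) (oldConfig, error) {
	var cfg oldConfig
	dec := json.NewDecoder(strings.NewReader(input))
	dec.DisallowUnknownFields()
	err := dec.Decode(&cfg)
	return cfg, err
}

func main() {
	// A file written for a newer version includes a field the old server lacks.
	_, err := parseStrict(`{"log_level":"debug","system_variables":{}}`)
	fmt.Println("newer config on older server:", err)

	// A file limited to known fields parses cleanly.
	cfg, err := parseStrict(`{"log_level":"debug"}`)
	fmt.Println("matching config:", cfg.LogLevel, err)
}
```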

The application that writes out the configuration file imports the github.com/dolthub/dolt/go/cmd/dolt/commands/sqlserver package and uses the SerializeConfigForVersion function to serialize the configuration. This function takes a sqlserver.YAMLConfig and a version number and returns a string that is the serialized configuration.

Implementing SerializeConfigForVersion

SerializeConfigForVersion uses the package gopkg.in/yaml.v2 to serialize the configuration as YAML. As we saw previously, the mapping between Go structs and the serialized format is driven by tags. Here the sqlserver.YAMLConfig struct is annotated with yaml tags like so:

// YAMLConfig is a ServerConfig implementation which is read from a yaml file
type YAMLConfig struct {
	LogLevelStr       *string            `yaml:"log_level,omitempty"`
	MaxQueryLenInLogs *int               `yaml:"max_logged_query_len,omitempty"`
	EncodeLoggedQuery *bool              `yaml:"encode_logged_query,omitempty"`
	BehaviorConfig    BehaviorYAMLConfig `yaml:"behavior"`
	...

	SystemVars_ *engine.SystemVariables `yaml:"system_variables,omitempty" minver:"1.11.1"`

In order to support serialization for specific versions we add a minver tag to new struct fields. This tag specifies the minimum version of Dolt that supports the field. This allows us to add new fields to the configuration struct and have them serialized only when the version of Dolt being run is new enough to support them. During development new fields are added with a minver of TBD and then updated to the correct version when the feature is released. If a field does not have a minver tag, then it is assumed to be supported in all versions.
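The rules in this paragraph can be sketched as a small decision function. Everything named here is an assumption for illustration: Config is a made-up struct, supported is a hypothetical helper, and encodeVersion is a stand-in for Dolt's own version encoding, whose exact format is not shown in this article.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// Config is a hypothetical configuration struct using the minver convention.
type Config struct {
	LogLevel   *string           `yaml:"log_level,omitempty"`
	SystemVars map[string]string `yaml:"system_variables,omitempty" minver:"1.11.1"`
	NewFeature *bool             `yaml:"new_feature,omitempty" minver:"TBD"`
}

// encodeVersion is a hypothetical stand-in for Dolt's version encoding. It
// packs "major.minor.patch" into a uint32, one byte per component, so each
// component is limited to 0-255.
func encodeVersion(s string) (uint32, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 3 {
		return 0, fmt.Errorf("version %q is not in major.minor.patch form", s)
	}
	var enc uint32
	for _, p := range parts {
		n, err := strconv.ParseUint(p, 10, 8)
		if err != nil {
			return 0, fmt.Errorf("invalid version component %q: %w", p, err)
		}
		enc = enc<<8 | uint32(n)
	}
	return enc, nil
}

// supported reports whether the named field of cfg (a struct value) should be
// serialized for the given version.
func supported(cfg any, fieldName, ver string) (bool, error) {
	field, ok := reflect.TypeOf(cfg).FieldByName(fieldName)
	if !ok {
		return false, fmt.Errorf("no field %q", fieldName)
	}
	switch tag := field.Tag.Get("minver"); tag {
	case "":
		return true, nil // no minver tag: supported in all versions
	case "TBD":
		return false, nil // not released yet: never serialized
	default:
		minver, err := encodeVersion(tag)
		if err != nil {
			return false, err
		}
		cur, err := encodeVersion(ver)
		if err != nil {
			return false, err
		}
		return cur >= minver, nil
	}
}

func main() {
	for _, ver := range []string{"1.10.0", "1.11.1"} {
		for _, name := range []string{"LogLevel", "SystemVars", "NewFeature"} {
			ok, err := supported(Config{}, name, ver)
			if err != nil {
				panic(err)
			}
			fmt.Printf("Dolt %s: %s supported=%t\n", ver, name, ok)
		}
	}
}
```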

To serialize a sqlserver.YAMLConfig instance for a specific version, we need to null out any fields that are not supported and then serialize the struct. This requires that each such field also has a valid yaml tag with "omitempty" in its value.

func nullUnsupported(verNum uint32, st any) error {
	const tagName = "minver"

	// use reflection to loop over all fields in the struct st
	// for each field check the tag "minver" and if the current version is less than that, set the field to nil
	t := reflect.TypeOf(st)

	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}

	// Iterate over all available fields and read the tag value
	for i := 0; i < t.NumField(); i++ {
		// Get the field, returns https://golang.org/pkg/reflect/#StructField
		field := t.Field(i)

		// Get the field tag value
		tag := field.Tag.Get(tagName)

Here, our nullUnsupported function uses reflection to loop over all fields in the struct and get the minver tag value for each struct field.

		if tag != "" {
			// if it's nullable check to see if it should be set to nil
			if field.Type.Kind() == reflect.Ptr || field.Type.Kind() == reflect.Slice || field.Type.Kind() == reflect.Map {
				var setToNull bool

				if tag == "TBD" {
					setToNull = true
				} else {
					minver, err := version.Encode(tag)
					if err != nil {
						return fmt.Errorf("invalid version tag '%s' on field '%s': %w", tag, field.Name, err)
					}

					setToNull = verNum < minver
				}

				if setToNull {
					// Get the field value
					v := reflect.ValueOf(st).Elem().Field(i)
					v.Set(reflect.Zero(v.Type()))
				}
			} else {
				return fmt.Errorf("non-nullable field '%s' has a version tag '%s'", field.Name, tag)
			}

			var hasOmitEmpty bool
			yamlTag := field.Tag.Get("yaml")
			if yamlTag != "" {
				vals := strings.Split(yamlTag, ",")
				for _, val := range vals {
					if val == "omitempty" {
						hasOmitEmpty = true
						break
					}
				}
			}

			if !hasOmitEmpty {
				return fmt.Errorf("field '%s' has a version tag '%s' but no yaml tag with omitempty", field.Name, tag)
			}
		}

If the value of our minver tag is not empty, we check to see if the field is nullable. If it is, we check whether the field should be set to nil, either because the version of Dolt being run is older than the minver or because the minver is TBD. If so, we use reflection to set the field to its zero value. The rest of the code here validates that the tags are applied to the fields properly, and returns an error if they are not.

After checking if a field should be set to nil, we check to see if we need to recurse into the field and perform the same operation. This is done for fields that are pointers to structs, structs, and slices of structs.

		v := reflect.ValueOf(st).Elem().Field(i)

		vIsNullable := v.Type().Kind() == reflect.Ptr || v.Type().Kind() == reflect.Slice || v.Type().Kind() == reflect.Map

		if !vIsNullable || !v.IsNil() {
			// if the field is a pointer to a struct, or a struct, or a slice recurse
			if field.Type.Kind() == reflect.Ptr && field.Type.Elem().Kind() == reflect.Struct {
				err := nullUnsupported(verNum, v.Interface())
				if err != nil {
					return err
				}
			} else if field.Type.Kind() == reflect.Struct {
				err := nullUnsupported(verNum, v.Addr().Interface())
				if err != nil {
					return err
				}
			} else if field.Type.Kind() == reflect.Slice {
				if field.Type.Elem().Kind() == reflect.Ptr && field.Type.Elem().Elem().Kind() == reflect.Struct {
					for i := 0; i < v.Len(); i++ {
						err := nullUnsupported(verNum, v.Index(i).Interface())
						if err != nil {
							return err
						}
					}
				} else if field.Type.Elem().Kind() == reflect.Struct {
					for i := 0; i < v.Len(); i++ {
						err := nullUnsupported(verNum, v.Index(i).Addr().Interface())
						if err != nil {
							return err
						}
					}
				}
			}
		}

The full code can be found here.
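For readers who want something small to run, here is a simplified end-to-end sketch of the same idea. It is not Dolt's implementation: it handles only top-level pointer fields, does no recursion or tag validation, and serializes with encoding/json instead of gopkg.in/yaml.v2 so that it needs only the standard library. ServerConfig, parseVersion, nilUnsupported, and serializeFor are all hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// ServerConfig is a hypothetical configuration struct.
type ServerConfig struct {
	LogLevel   *string `json:"log_level,omitempty"`
	MaxConns   *int    `json:"max_connections,omitempty" minver:"1.11.1"`
	NewFeature *bool   `json:"new_feature,omitempty" minver:"TBD"`
}

// parseVersion is a hypothetical stand-in for Dolt's version encoding. It
// assumes each component of major.minor.patch fits in one byte.
func parseVersion(s string) (uint32, error) {
	var major, minor, patch uint32
	if _, err := fmt.Sscanf(s, "%d.%d.%d", &major, &minor, &patch); err != nil {
		return 0, fmt.Errorf("invalid version %q: %w", s, err)
	}
	return major<<16 | minor<<8 | patch, nil
}

// nilUnsupported zeroes every field whose minver tag is "TBD" or newer than
// ver. cfg must be a pointer to a struct.
func nilUnsupported(ver uint32, cfg any) error {
	v := reflect.ValueOf(cfg).Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		tag := t.Field(i).Tag.Get("minver")
		if tag == "" {
			continue // untagged fields are supported everywhere
		}
		drop := tag == "TBD"
		if !drop {
			minver, err := parseVersion(tag)
			if err != nil {
				return err
			}
			drop = ver < minver
		}
		if drop {
			v.Field(i).Set(reflect.Zero(v.Field(i).Type()))
		}
	}
	return nil
}

// serializeFor returns cfg serialized for the given version. cfg is passed by
// value, so the caller's struct is left untouched.
func serializeFor(version string, cfg ServerConfig) (string, error) {
	ver, err := parseVersion(version)
	if err != nil {
		return "", err
	}
	if err := nilUnsupported(ver, &cfg); err != nil {
		return "", err
	}
	data, err := json.Marshal(cfg)
	return string(data), err
}

func main() {
	level, conns, feature := "info", 100, true
	cfg := ServerConfig{LogLevel: &level, MaxConns: &conns, NewFeature: &feature}
	for _, ver := range []string{"1.10.0", "1.11.1"} {
		out, err := serializeFor(ver, cfg)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", ver, out)
	}
}
```

Because the nulled fields carry omitempty, they disappear from the output entirely rather than being written as null, which is what keeps a strict parser on an older version from rejecting the file.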

Conclusion

Go tags are a powerful feature that libraries and tools can use to drive features and behaviors. Embedding metadata in your code and reading it back through reflection is a useful technique for building flexible and extensible systems. In the case of Hosted Dolt, it allowed us to add new fields to our configuration struct and serialize them only when the version of Dolt being run was new enough to support them. This allows us to add new features to Dolt without breaking existing deployments.

At DoltHub Inc our primary development language is Go. Dolt is written in Go, as are many of our tools and backend systems. If you are interested in talking about Go, or contributing to our open source Go projects, please reach out to us on Discord. We'd love to hear from you.
