Go: Using reflection to transmute data

I’ve been working for some time on this API Client library for a CDN at work, and integrating it as I go into a custom internal terraform provider. That’s a whole story for another day, but as part of it I was casting data from JSON to structs/models to terraform schema and back the other way again.

A lot.

I mean really a lot. Do you know how many fields there are in a deeply configurable CDN?

I cast magic missile at the darkness

I started out dealing with this in a rather inefficient, frustrating way.


// OriginHostFromState ...
func (c *Configuration) OriginHostFromState(state map[string]interface{}) {
	c.OriginPullHost = &OriginPullHost{}
	if state["primary"] != nil {
		c.OriginPullHost.Primary = state["primary"].(int)
	}
	if state["secondary"] != nil {
		c.OriginPullHost.Secondary = state["secondary"].(int)
	}
	if state["path"] != nil {
		c.OriginPullHost.Path = state["path"].(string)
	}
}

I’d spent the last few months designing out my models with ETLs like this, extracting fields from the terraform schema by hand and casting them into the model. And the other direction wasn’t any better:


// OriginHostFromModel ...
func (c *Configuration) OriginHostFromModel() []interface{} {
	originHostSliceIface := []interface{}{}
	originHostIface := make(map[string]interface{})

	if c.OriginPullHost != nil {
		originHostIface["primary"] = c.OriginPullHost.Primary
		originHostIface["secondary"] = c.OriginPullHost.Secondary
		originHostIface["path"] = c.OriginPullHost.Path
	} else {
		originHostIface["primary"] = nil
		originHostIface["secondary"] = nil
		originHostIface["path"] = nil
	}
	originHostSliceIface = append(originHostSliceIface, originHostIface)

	return originHostSliceIface
}

Now to be clear: This worked. I could have continued doing this, and functionally it would have been (mostly) fine. I developed up to now doing this, and the biggest bottleneck was the sheer quantity of typing and copy paste needed to deal with all of these keys, across many API fields. The CDN API uses camel case, while terraform schema demands snake case only. I had a dozen compress/uncompress methods already filled with these.

It felt bad.

Reflection to the rescue?

I was already aware that Go’s encoding/json package uses reflection on structs, in combination with the json struct tags, to pack and unpack data. I didn’t (and still don’t) fully understand how it does what it does, handling so many edge cases in such a stable way, but I knew something similar would be huge for this project.
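To make the idea concrete, here is a minimal sketch of reading struct tags through reflection. It is not how encoding/json is implemented, just the same basic mechanism in miniature. The Example type and the tagNames helper are hypothetical names made up for this illustration; reflect.TypeOf, NumField, Field and StructTag.Lookup are the standard library calls doing the work.

```go
package main

import (
	"fmt"
	"reflect"
)

// Example is a hypothetical model used only for this illustration.
type Example struct {
	ExpirePolicy string `json:"expirePolicy" tf:"expire_policy"`
	Internal     string // no tags at all
}

// tagNames returns, keyed by Go field name, the value of the given
// struct tag key for every field that carries that key.
func tagNames(v interface{}, key string) map[string]string {
	out := make(map[string]string)
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if name, ok := f.Tag.Lookup(key); ok {
			out[f.Name] = name
		}
	}
	return out
}

func main() {
	fmt.Println(tagNames(Example{}, "json")) // map[ExpirePolicy:expirePolicy]
	fmt.Println(tagNames(Example{}, "tf"))   // map[ExpirePolicy:expire_policy]
}
```

The same field can carry one name per consumer, which is exactly what lets a single struct speak camel case to the API and snake case to terraform.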

I started out by defining my struct tags, spending some labor up front to reap repeatable rewards.


// OriginPullPolicy encapsulates origin pull cache settings
type OriginPullPolicy struct {
	Enabled                        bool   `json:"enabled" tf:"enabled"`
	ExpirePolicy                   string `json:"expirePolicy" validate:"oneof=CACHE_CONTROL INGEST LAST_MODIFY NEVER_EXPIRE DO_NOT_CACHE" tf:"expire_policy"`
	ExpireSeconds                  *int   `json:"expireSeconds,omitempty" tf:"expire_seconds"`
	ForceBypassCache               bool   `json:"forceBypassCache,omitempty" tf:"force_bypass_cache"`
	HonorMustRevalidate            bool   `json:"honorMustRevalidate,omitempty" tf:"honor_must_revalidate"`
	HonorNoCache                   bool   `json:"honorNoCache,omitempty" tf:"honor_no_cache"`
	HonorNoStore                   bool   `json:"honorNoStore,omitempty" tf:"honor_no_store"`
	HonorPrivate                   bool   `json:"honorPrivate,omitempty" tf:"honor_private"`
	HonorSMaxAge                   bool   `json:"honorSMaxAge,omitempty" tf:"honor_smax_age"`
	HTTPHeaders                    string `json:"httpHeaders,omitempty" tf:"http_headers"` // string list
	MustRevalidateToNoCache        bool   `json:"mustRevalidateToNoCache,omitempty" tf:"must_revalidate_to_no_cache"`
	NoCacheBehavior                string `json:"noCacheBehavior,omitempty" tf:"no_cache_behavior"`
	UpdateHTTPHeadersOn304Response bool   `json:"updateHttpHeadersOn304Response,omitempty" tf:"update_http_headers_on_304_response"`
	DefaultCacheBehavior           string `json:"defaultCacheBehavior,omitempty" tf:"default_cache_behavior"` // Default behaviour when the policy is "Cache Control" and the "Cache-Control" header is missing. ttl & ...?
	MaxAgeZeroToNoCache            bool   `json:"maxAgeZeroToNoCache,omitempty" tf:"max_age_zero_to_no_cache"`
	BypassCacheIdentifier          string `json:"bypassCacheIdentifier,omitempty" tf:"bypass_cache_identifier"` // no-cache only
	ContentTypeFilter              string `json:"contentTypeFilter,omitempty" tf:"content_type_filter"`         // string list
	HeaderFilter                   string `json:"headerFilter,omitempty" tf:"header_filter"`                    // string list
	MethodFilter                   string `json:"methodFilter,omitempty" tf:"method_filter"`                    // string list
	PathFilter                     string `json:"pathFilter,omitempty" tf:"path_filter"`                        // string list
	StatusCodeMatch                string `json:"statusCodeMatch,omitempty" tf:"status_code_match"`             // string list
}

The first thing you may notice here is my int pointer. There is this little quirk in Go where the zero value of an int is 0. The json tag omitempty will leave a 0 out, and Go in general treats it as “unset”. 0, however, is a valid TTL, and we don’t want an unset field being pushed to the CDN API as 0 and overriding default values and all that, so I used a pointer to an int, which makes nil the “unset” value. This becomes important later.
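The difference is easy to see with encoding/json on its own. This is a small sketch, with valueTTL, pointerTTL and encode being hypothetical names invented for the example: a plain int with omitempty can never send an explicit 0, while a pointer can tell “unset” from “0”.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// valueTTL and pointerTTL are hypothetical models for this illustration.
type valueTTL struct {
	ExpireSeconds int `json:"expireSeconds,omitempty"`
}

type pointerTTL struct {
	ExpireSeconds *int `json:"expireSeconds,omitempty"`
}

// encode marshals v and returns the JSON as a string.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	zero := 0
	fmt.Println(encode(valueTTL{ExpireSeconds: 0}))       // {}
	fmt.Println(encode(pointerTTL{}))                     // {}
	fmt.Println(encode(pointerTTL{ExpireSeconds: &zero})) // {"expireSeconds":0}
}
```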

From here I had to write a packer and unpacker. From struct to map[string]interface{} to inject in my terraform schema, and from map[string]interface{} to struct to send back to the API. I also had dozens of structs, so it had to be generic.


import "reflect"

const terraformTag = "tf"

// MapFromStruct extracts a map[string]interface{} from an API model if it uses the tf struct tag
func MapFromStruct(s interface{}) map[string]interface{} {
	// make sure our interface isn't empty
	if s != nil {
		ret := make(map[string]interface{})

		reflection := reflect.ValueOf(s).Elem()

		// iterate the fields in the interface (struct)
		for i := 0; i < reflection.NumField(); i++ {
			thisField := reflection.Field(i)
			thisType := reflection.Type().Field(i)
			tag := thisType.Tag

			// check for our tag on this field
			if val, ok := tag.Lookup(terraformTag); ok {
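				// Dereference pointers within the struct to their types
				var thisFieldDeref reflect.Value
				if thisField.Kind() == reflect.Ptr {
					// a nil pointer means "unset": Elem() would yield a zero Value
					// and Interface() would panic on it, so skip the field
					if thisField.IsNil() {
						continue
					}
					thisFieldDeref = thisField.Elem()
				} else {
					thisFieldDeref = thisField
				}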

				// assign map from this struct field
				ret[val] = thisFieldDeref.Interface()
			}
		}
		return ret
	}
	// We don't want to make empty things where there is nothing
	return nil
}

One of the key points of this development process is to avoid sending unnecessary data to the API, since it essentially PATCHes the data you send. These methods return nil unless they have a viable payload to return.

You may also notice some commenting in there about pointers. Yep, you can't just automatically reflect data in and out of a pointer! Who would have thought. Luckily it was fairly easy to work around with some research. In this instance I am checking to see if that struct field type is a pointer, and if it is, fetching the Elem() within into a new reflect.Value object.


// StructFromMap attempts to return a given struct packed with the given map
// should only be used on structs whose tagged fields are int, bool or string (or pointers to those)
func StructFromMap(model interface{}, m map[string]interface{}) interface{} {
	if m != nil {
		rv := reflect.ValueOf(model).Elem()

		// iterate fields in our interface/struct
		for i := 0; i < rv.NumField(); i++ {
			thisField := rv.Field(i)
			thisType := rv.Type().Field(i)
			tag := thisType.Tag

			// check to make sure it is a tagged field
			if val, ok := tag.Lookup(terraformTag); ok {
				// skip keys the map doesn't carry, so unset pointer fields stay nil
				v, present := m[val]
				if !present || v == nil {
					continue
				}

				// Dereference pointers within the struct to their types
				var thisFieldDeref reflect.Value
				if thisField.Kind() == reflect.Ptr {
					thisField.Set(reflect.New(thisField.Type().Elem()))
					thisFieldDeref = thisField.Elem()
				} else {
					thisFieldDeref = thisField
				}

				// detect the type and cast our map value to that type in that field
				switch thisFieldDeref.Kind() {
				case reflect.Int:
					thisFieldDeref.SetInt(int64(v.(int)))
				case reflect.String:
					thisFieldDeref.SetString(v.(string))
				case reflect.Bool:
					thisFieldDeref.SetBool(v.(bool))
				default:
					debug.Log("Model Generation", "Something went wrong packing: %v\n", m)
					return nil
				}
			}
		}
		return model
	}
	// We don't want to make empty things where there is nothing
	return nil
}

The reverse process is fairly similar. The first major difference is in the pointer dereference, which requires an extra step: allocating a new value of the pointed-to type with reflect.New before fetching the Elem() into a new reflect.Value.

The second major difference is the switch block, which allows us to cast the raw map values into the struct fields. SetInt was weird to me, taking an int64 only, but it does make sense: reflect uses the widest signed integer as the single setter for every signed int kind (int, int8, int16, int32, int64). Note that the switch only matches reflect.Int, so a field declared as int32 or int64 would fall through to the default case. As before, we return nil if we hit a tagged field we can't handle.

Implementation

How am I using this? Well this little blog post is far too short to get into the entirety of my terraform ETLs back and forth, but this little microcosm is being used to reduce THIS:


func expandDeliveryCompression(raw interface{}) *models.Compression {
    if compression, ok := raw.(*schema.Set); ok {
        compressionSet := compression.List()[0]
        if m, ok := compressionSet.(map[string]interface{}); ok {
            c := &models.Compression{}
            if v, ok := m["enabled"]; ok {
                c.Enabled = v.(bool)
            }
            if v, ok := m["gzip"]; ok {
                c.GZIP = v.(string)
            }
            if v, ok := m["level"]; ok {
                c.Level = v.(int)
            }
            if v, ok := m["mime"]; ok {
                c.Mime = v.(string)
            }

            return c
        }
    }
    return nil
}

Into THIS:


func expandDeliveryCompression(raw interface{}) *models.Compression {
    if compression, ok := raw.(*schema.Set); ok {
        compressionSet := compression.List()[0]
        if m, ok := compressionSet.(map[string]interface{}); ok {
            // StructFromMap returns nil on failure, so use the comma-ok assertion
            if c, ok := models.StructFromMap(&models.Compression{}, m).(*models.Compression); ok {
                return c
            }
        }
    }
    return nil
}

It's not going to completely collapse my sprawl yet, but it's a step in the right direction.