# A new Go API for Protocol Buffers
2 Mar 2020
Tags: protobuf, technical
Summary: Announcing a major revision of the Go API for protocol buffers.
OldURL: /a-new-go-api-for-protocol-buffers

Joe Tsai

Damien Neil

Herbie Ong

## Introduction

We are pleased to announce the release of a major revision of the Go API for
[protocol buffers](https://developers.google.com/protocol-buffers),
Google's language-neutral data interchange format.

## Motivations for a new API

The first protocol buffer bindings for Go were
[announced by Rob Pike](https://blog.golang.org/third-party-libraries-goprotobuf-and)
in March of 2010. Go 1 would not be released for another two years.

In the decade since that first release, the package has grown and
developed along with Go. Its users' requirements have grown too.

Many people want to write programs that use reflection to examine protocol
buffer messages. The
[`reflect`](https://pkg.go.dev/reflect)
package provides a view of Go types and
values, but omits information from the protocol buffer type system. For
example, we might want to write a function that traverses a log entry and
clears any field annotated as containing sensitive data. The annotations
are not part of the Go type system.

Another common desire is to use data structures other than the ones
generated by the protocol buffer compiler, such as a dynamic message type
capable of representing messages whose type is not known at compile time.

We also observed that a frequent source of problems was that the
[`proto.Message`](https://pkg.go.dev/github.com/golang/protobuf/proto?tab=doc#Message)
interface, which identifies values of generated message types, does very
little to describe the behavior of those types. When users create types
that implement that interface (often inadvertently by embedding a message
in another struct) and pass values of those types to functions expecting
a generated message value, programs crash or behave unpredictably.
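
To make the embedding hazard concrete, here is a minimal sketch that uses only the standard library. The `Message` interface below copies the three-method shape of the APIv1 interface; `LogEntry`, `AuditedEntry`, and `implementsMessage` are hypothetical stand-ins for illustration, not generated code or library functions.

```go
package main

import "fmt"

// Message mirrors the three-method shape of the APIv1 proto.Message interface.
type Message interface {
	Reset()
	String() string
	ProtoMessage()
}

// LogEntry stands in for a generated message type (hypothetical).
type LogEntry struct {
	Text string
}

func (e *LogEntry) Reset()         { *e = LogEntry{} }
func (e *LogEntry) String() string { return "text:" + e.Text }
func (*LogEntry) ProtoMessage()    {}

// AuditedEntry embeds a message to reuse its fields. The embedded pointer
// promotes all three methods, so AuditedEntry satisfies Message as well,
// even though it was never meant to be a protocol buffer message.
type AuditedEntry struct {
	*LogEntry
	Reviewer string
}

// implementsMessage reports whether v satisfies the Message interface.
func implementsMessage(v interface{}) bool {
	_, ok := v.(Message)
	return ok
}

func main() {
	a := AuditedEntry{LogEntry: &LogEntry{Text: "hello"}, Reviewer: "gopher"}
	fmt.Println(implementsMessage(a))
	fmt.Println(a.String())
}
```

Nothing in such an interface tells a function receiving an `AuditedEntry` that the value carries a `Reviewer` field the embedded message knows nothing about.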

All three of these problems have a common cause, and a common solution:
The `Message` interface should fully specify the behavior of a message,
and functions operating on `Message` values should freely accept any
type that correctly implements the interface.

Since it is not possible to change the existing definition of the
`Message` type while keeping the package API compatible, we decided that
it was time to begin work on a new, incompatible major version of the
protobuf module.

Today, we're pleased to release that new module. We hope you like it.

## Reflection

Reflection is the flagship feature of the new implementation. Similar
to how the `reflect` package provides a view of Go types and values, the
[`google.golang.org/protobuf/reflect/protoreflect`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc)
package provides a view of values according to the protocol buffer
type system.

A complete description of the `protoreflect` package would run too long
for this post, but let's look at how we might write the log-scrubbing
function we mentioned previously.

First, we'll write a `.proto` file defining an extension of the
[`google.protobuf.FieldOptions`](https://github.com/protocolbuffers/protobuf/blob/b96241b1b716781f5bc4dc25e1ebb0003dfaba6a/src/google/protobuf/descriptor.proto#L509)
type so we can annotate fields as containing
sensitive information or not.

    syntax = "proto3";
    import "google/protobuf/descriptor.proto";
    package golang.example.policy;
    extend google.protobuf.FieldOptions {
        bool non_sensitive = 50000;
    }

We can use this option to mark certain fields as non-sensitive.

    message MyMessage {
        string public_name = 1 [(golang.example.policy.non_sensitive) = true];
    }

Next, we will write a Go function which accepts an arbitrary message
value and removes all the sensitive fields.

    // Redact clears every sensitive field in pb.
    func Redact(pb proto.Message) {
        // ...
    }

This function accepts a
[`proto.Message`](https://pkg.go.dev/google.golang.org/protobuf/proto?tab=doc#Message),
an interface type implemented by all generated message types. This type
is an alias for one defined in the `protoreflect` package:

    type ProtoMessage interface{
        ProtoReflect() Message
    }

To avoid filling up the namespace of generated
messages, the interface contains only a single method returning a
[`protoreflect.Message`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#Message),
which provides access to the message contents.

(Why an alias? Because `protoreflect.Message` has a corresponding
method returning the original `proto.Message`, and we need to avoid an
import cycle between the two packages.)

The
[`protoreflect.Message.Range`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#Message.Range)
method calls a function for every populated field in a message.

    m := pb.ProtoReflect()
    m.Range(func(fd protoreflect.FieldDescriptor, v protoreflect.Value) bool {
        // ...
        return true
    })

The range function is called with a
[`protoreflect.FieldDescriptor`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#FieldDescriptor)
describing the protocol buffer type of the field, and a
[`protoreflect.Value`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#Value)
containing the field value.

The
[`protoreflect.FieldDescriptor.Options`](https://pkg.go.dev/google.golang.org/protobuf/reflect/protoreflect?tab=doc#Descriptor.Options)
method returns the field options as a `google.protobuf.FieldOptions`
message.

    opts := fd.Options().(*descriptorpb.FieldOptions)

(Why the type assertion? Since the generated `descriptorpb` package
depends on `protoreflect`, the `protoreflect` package can't return the
concrete options type without causing an import cycle.)

We can then check the options to see the value of our extension boolean:

    if proto.GetExtension(opts, policypb.E_NonSensitive).(bool) {
        return true // don't redact non-sensitive fields
    }

Note that we are looking at the field _descriptor_ here, not the field
_value_. The information we're interested in lies in the protocol
buffer type system, not the Go one.

This is also an example of an area where we
have simplified the `proto` package API. The original
[`proto.GetExtension`](https://pkg.go.dev/github.com/golang/protobuf/proto?tab=doc#GetExtension)
returned both a value and an error. The new
[`proto.GetExtension`](https://pkg.go.dev/google.golang.org/protobuf/proto?tab=doc#GetExtension)
returns just a value, returning the default value for the field if it is
not present. Extension decoding errors are reported at `Unmarshal` time.

Once we have identified a field that needs redaction, clearing it is simple:

    m.Clear(fd)

Putting all the above together, our complete redaction function is:

    // Redact clears every sensitive field in pb.
    func Redact(pb proto.Message) {
        m := pb.ProtoReflect()
        m.Range(func(fd protoreflect.FieldDescriptor, v protoreflect.Value) bool {
            opts := fd.Options().(*descriptorpb.FieldOptions)
            if proto.GetExtension(opts, policypb.E_NonSensitive).(bool) {
                return true
            }
            m.Clear(fd)
            return true
        })
    }

A more complete implementation might recursively descend into
message-valued fields. We hope that this simple example gives a
taste of protocol buffer reflection and its uses.

## Versions

We call the original version of Go protocol buffers APIv1, and the
new one APIv2. Because APIv2 is not backwards compatible with APIv1,
we need to use different module paths for each.

(These API versions are not the same as the versions of the protocol
buffer language: `proto1`, `proto2`, and `proto3`. APIv1 and APIv2
are concrete implementations in Go that both support the `proto2` and
`proto3` language versions.)

The
[`github.com/golang/protobuf`](https://pkg.go.dev/github.com/golang/protobuf?tab=overview)
module is APIv1.

The
[`google.golang.org/protobuf`](https://pkg.go.dev/google.golang.org/protobuf?tab=overview)
module is APIv2. We have taken advantage of the need to change the
import path to switch to one that is not tied to a specific hosting
provider. (We considered `google.golang.org/protobuf/v2`, to make it
clear that this is the second major version of the API, but settled on
the shorter path as being the better choice in the long term.)

We know that not all users will move to a new major version of a package
at the same rate. Some will switch quickly; others may remain on the old
version indefinitely. Even within a single program, some parts may use
one API while others use another. It is essential, therefore, that we
continue to support programs that use APIv1.

- `github.com/golang/protobuf@v1.3.4` is the most recent pre-APIv2 version of APIv1.

- `github.com/golang/protobuf@v1.4.0` is a version of APIv1 implemented in terms of APIv2.
  The API is the same, but the underlying implementation is backed by the new one.
  This version contains functions to convert between the APIv1 and APIv2 `proto.Message`
  interfaces to ease the transition between the two.

- `google.golang.org/protobuf@v1.20.0` is APIv2.
  This module depends upon `github.com/golang/protobuf@v1.4.0`,
  so any program which uses APIv2 will automatically pick a version of APIv1
  which integrates with it.

(Why start at version `v1.20.0`? To provide clarity.
We do not expect APIv1 ever to reach `v1.20.0`,
so the version number alone should be enough to unambiguously differentiate
between APIv1 and APIv2.)

We intend to maintain support for APIv1 indefinitely.

This organization ensures that any given program will use only a single
protocol buffer implementation, regardless of which API version it uses.
It permits programs to adopt the new API gradually, or not at all, while
still gaining the advantages of the new implementation. The principle of
minimal version selection means that programs may remain on the old
implementation until the maintainers choose to update to the new one
(either directly, or by updating a dependency).

## Additional features of note

The
[`google.golang.org/protobuf/encoding/protojson`](https://pkg.go.dev/google.golang.org/protobuf/encoding/protojson)
package converts protocol buffer messages to and from JSON using the
[canonical JSON mapping](https://developers.google.com/protocol-buffers/docs/proto3#json),
and fixes a number of issues with the old `jsonpb` package
that were difficult to change without causing problems for existing users.

The
[`google.golang.org/protobuf/types/dynamicpb`](https://pkg.go.dev/google.golang.org/protobuf/types/dynamicpb)
package provides an implementation of `proto.Message` for messages whose
protocol buffer type is derived at runtime.

The
[`google.golang.org/protobuf/testing/protocmp`](https://pkg.go.dev/google.golang.org/protobuf/testing/protocmp)
package provides functions to compare protocol buffer messages with the
[`github.com/google/go-cmp/cmp`](https://pkg.go.dev/github.com/google/go-cmp/cmp)
package.

The
[`google.golang.org/protobuf/compiler/protogen`](https://pkg.go.dev/google.golang.org/protobuf/compiler/protogen?tab=doc)
package provides support for writing protocol compiler plugins.

## Conclusion

The `google.golang.org/protobuf` module is a major overhaul of
Go's support for protocol buffers, providing first-class support
for reflection, custom message implementations, and a cleaned-up API
surface. We intend to maintain the previous API indefinitely as a wrapper
of the new one, allowing users to adopt the new API incrementally at
their own pace.

Our goal in this update is to improve upon the benefits of the old
API while addressing its shortcomings. As we completed each component of
the new implementation, we put it into use within Google's codebase. This
incremental rollout has given us confidence in both the usability of the new
API and the performance and correctness of the new implementation. We believe
it is production ready.

We are excited about this release and hope that it will serve the Go
ecosystem for the next ten years and beyond!