83 points by dadt 28 days ago
I use grpc-web and protocol buffers.
Renaming fields doesn't matter -- if the client and server disagree about the name of a key, it doesn't matter, because the transport layer uses the tag number, not the name. (Compare this to JSON, where changing the name does break clients.)
Adding fields also doesn't matter, but this is where it starts to get tricky. If a client sends a request without a field that the server expects (i.e., an old version of the client), the message will parse OK, but the server could still say "hey actually that's required, buh bye". If you do this, you lose the backwards compatibility. So don't do that.
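A rough sketch of the "don't reject, default" approach, using a plain dict as a stand-in for a parsed message. The handler name, the `page_size` field and its default are all hypothetical.

```python
DEFAULT_PAGE_SIZE = 50  # hypothetical default for the newly added field

def handle_list_request(request: dict) -> dict:
    # Old clients never send "page_size". In proto3 an unset scalar reads
    # as its zero value, so treat both "missing" and 0 as "use the default"
    # instead of failing the request.
    page_size = request.get("page_size") or DEFAULT_PAGE_SIZE
    return {"query": request["query"], "page_size": page_size}

old_client_result = handle_list_request({"query": "cats"})
new_client_result = handle_list_request({"query": "cats", "page_size": 10})
```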
Removing fields is something you can basically never do if you want compatibility. You can rename them to deprecated_whatever, though, and see what code still uses them by the fact that they no longer compile. (Binaries using that field can still exist and will continue to work, of course. But at least you can have a transition period where people writing new code will think "hmm, this is probably going away" and won't depend on that field.)
(There are also some additional mechanical details that the protocol buffer documentation talks about. Maybe an int32 isn't big enough so you want an int64. Old clients can still talk to a server that has changed the type of an int32 field to an int64, but eventually it's going to lose data because it didn't allocate enough storage to manipulate an actual 64-bit value. But it does give you time when you think "a year from now this will be bigger than 2^32". You can change the definition today and eventually update all the clients.)
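As an illustration of the data loss, this sketch mimics what an old client holding the value in a signed 32-bit variable would see, assuming C-style truncation. Note that a signed int32 already overflows past 2^31 - 1. `as_int32` is a helper written for this example, not a protobuf API.

```python
def as_int32(value: int) -> int:
    """Truncate to a signed 32-bit integer the way a C-style cast would."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value

fits = as_int32(123456)          # small values survive unchanged
wrapped = as_int32(2**31 + 5)    # too big for int32: wraps to a negative number
```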
I think with care and the occasional update of a client, though, you can pretty easily keep things compatible forever. This is wayyyyy easier with SPAs than any other sort of API consumer, because you have control over updating the client.
Often you are adding new features, which is the easiest case. You add a new field or RPC, and just start using it.
You can usually structure a change in semantics as a new feature, which makes the cases for which protobuffers excel even more common in practice. For example, say you have clients that depend on the ordering of results from a Lookup() call. You think that that sort is unnecessary and slow, and you want to change the semantics without breaking clients that depend on the ordering. You can just add a new RPC, FastLookup(), and start using that. Clients that use the new RPC will be faster, but old clients will continue to work using the old method. You can update all those (check your monitoring, you probably have a grpc_server_handled_total metric for every method), and after everything's updated, you can safely remove the code that implements Lookup() (either delete it entirely, or to really do it right, return codes.Unimplemented).
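A toy version of that migration, with a `Counter` standing in for a per-method metric such as grpc_server_handled_total. The function names and the `safe_to_remove` check are illustrative only; a real check would look at a metric over a long enough time window.

```python
from collections import Counter

calls_handled = Counter()  # stand-in for a per-method request metric

def lookup(items):
    """Old RPC: clients depend on the sorted order."""
    calls_handled["Lookup"] += 1
    return sorted(items)

def fast_lookup(items):
    """New RPC: same data, no ordering guarantee, no sort cost."""
    calls_handled["FastLookup"] += 1
    return list(items)

def safe_to_remove(method: str) -> bool:
    """True if the metric shows no traffic for this method."""
    return calls_handled[method] == 0

fast_lookup([3, 1, 2])
before = safe_to_remove("Lookup")   # nobody has called Lookup yet
lookup([3, 1, 2])                   # a straggler on the old client
after = safe_to_remove("Lookup")
```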
I think if you aim for incremental progress, it's pretty easy to achieve with the right tools. It's harder, but possible, even for public APIs where you can't update the client. But where you can update the client and all you have to worry about is browser cache? Easiest possible case ;)
You can also remove a field by reserving the tag number and optionally the name. New binaries that still reference the field will fail to compile, but previously persisted values can still be read back by old binaries. It also ensures no one reuses the same tag number later and reads bad data.
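In a .proto file this is done with `reserved` statements inside the message, and protoc enforces it for you. The sketch below only imitates that check in Python to show the idea; the tag 3 and the name `legacy_token` are invented for the example.

```python
# Hypothetical history: tag 3 / "legacy_token" was removed and reserved.
RESERVED_TAGS = {3}
RESERVED_NAMES = {"legacy_token"}

def check_fields(fields: dict) -> list:
    """Return error strings for fields that reuse a reserved tag or name."""
    errors = []
    for name, tag in fields.items():
        if tag in RESERVED_TAGS:
            errors.append(f"field {name!r} reuses reserved tag {tag}")
        if name in RESERVED_NAMES:
            errors.append(f"field name {name!r} is reserved")
    return errors

ok = check_fields({"id": 1, "email": 2, "session": 4})
bad = check_fields({"id": 1, "session": 3})  # tries to recycle tag 3
```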
Indeed, this 30-plus-year-old problem is easily fixable by following the guidelines for protocol buffers. Or any other similar technology of this sort, like ASN.1 if you like antiques. Protos are just quite popular and therefore a safe commodity bet.
Ideally, version your APIs, though for some internal APIs you definitely want to be able to make changes without bumping versions constantly.
Be backwards and forwards compatible as much as possible. Rolling deployments and other factors virtually guarantee mismatched versions in both directions. Many things, like adding new fields or removing old ones, can be staged in such a way that nothing breaks. Protobuf/gRPC offers some form of backwards and forwards compatibility (at the data model layer, of course) if you adhere to certain basic invariants.
Almost any change can be staged over time; how long you want to keep compatibility between versions is up to you.
Oh, probably most important: keep clear data model separations wherever you can. Between storage and API, and API and in-memory state management. It’s a lot of work, but it pays dividends.
Versioning your APIs gives you a built-in metric for how often you make breaking changes. Not everyone likes seeing this, but if you can enforce it, I think it creates a psychological incentive to exercise discipline about API changes.
Not sure if this is standard, but what I usually do is bump the version only on a breaking change. So if I add a field, nothing new happens. If I remove a field, or fundamentally change the output, then I bump it up.
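One way to mechanize that rule, as a sketch: describe each response as a field-to-type mapping and bump only when an existing field disappears or changes type. The schema format and function names here are assumptions for illustration, and the check cannot see semantic changes to the output.

```python
def is_breaking(old: dict, new: dict) -> bool:
    """Breaking = an existing field was removed or changed type."""
    for name, type_name in old.items():
        if name not in new or new[name] != type_name:
            return True
    return False  # purely additive changes are not breaking

def next_version(version: int, old: dict, new: dict) -> int:
    return version + 1 if is_breaking(old, new) else version

v1_schema = {"id": "int", "name": "str"}
added_field = {"id": "int", "name": "str", "email": "str"}
removed_field = {"id": "int"}

after_add = next_version(1, v1_schema, added_field)
after_remove = next_version(1, v1_schema, removed_field)
```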
Would you care how often you make breaking changes on your API if you control the client and it's your only consumer?
Yes, especially if "you" is a company or engineering department. It can be difficult trying to figure out how to best deprecate a micro-service endpoint if you aren't quite sure what other services are using it and if logging is lacking.
The question is about web apps; Protobuf/gRPC don't apply. Your comment speaks in generalities, but the question is about what you actually do. Of course you want to preserve backward/forward compat as much as possible, all else being equal.
> A number of responses say "have the client detect that it's running an old version and force reload." That works, but it's pretty annoying UX.
Is it though? Of all reasons to force refresh, this sounds reasonable to me as it won't even happen that often.
Heck, I would want to refresh if I knew a new version was available.
Is it though?
And whether your SPA is something like Google Docs where a user could plausibly have the same document open for a week or more.
Is it common to be releasing breaking API schema changes 6 times a day?
Intentionally? Probably not.
No, but it's common to underestimate the cost imposed on clients and how avoidable a breaking change is.
I'm not sure if that's a good example. Google Docs automatically saves the document when there's a change, so you won't lose any work if you refresh.
I may be misremembering but I actually remember GDocs telling me to reload once, so presumably they have some way of forcing reloading if they need it (I think it let me keep working locally until I reloaded, then all my stuff would be saved and synced like normal).
make it silent? like chrome update?
silent download and auto install on next startup
Even Facebook does this when you have left it open in a tab for too long.
: spellcheck failed.
> Even Facebook
Like they're the example we should look to. They do all sorts of hacky stuff.
I lol'd, Facebook has a terrible UI on every platform.
They do lots of stuff, but they don't seem to break their user's usage due to reloading if you leave a window open forever.
The simplest answer is "don't make breaking API schema changes".
For a more complete answer, this is a great blog post from someone at Stripe about API versioning: https://stripe.com/en-ca/blog/api-versioning
When you need to make a backward incompatible change, do it in three steps:
1. Add new stuff to API in a backwards compatible manner (without removing or changing old stuff). When you need to add fields it is usually safe to do it within existing functions. When you need to modify fields, I recommend simply copying and pasting your API function and exposing the new one with a number suffix, e.g. get_messages_2
2. Update the client code to use new versions of functions and to stop using old versions.
3. Once the new client is deployed, wait a while longer and then remove old versions of functions like get_messages_1
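The three steps can be sketched roughly like this. The in-memory `API` table, the `sender` field and the `retire` helper are made up for the example; a real service would route HTTP or RPC calls instead.

```python
# Hypothetical data; "sender" is the field that changes the response shape.
MESSAGES = [{"text": "hi", "sender": "ann"}, {"text": "yo", "sender": "bob"}]

def get_messages_2():
    """Step 1: new function with the new response shape (list of dicts)."""
    return [dict(m) for m in MESSAGES]

def get_messages():
    """Old function, behavior unchanged: still returns plain strings."""
    return [m["text"] for m in get_messages_2()]

API = {"get_messages": get_messages, "get_messages_2": get_messages_2}

def call(name):
    if name not in API:
        raise LookupError(f"{name} is no longer available")
    return API[name]()

old_result = call("get_messages")     # stale clients keep working
new_result = call("get_messages_2")   # step 2: updated clients call this

def retire(name):
    """Step 3: drop the old function once no deployed client calls it."""
    API.pop(name, None)
```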
You can also use a single global version which will allow your client to detect when it has gone stale and reload.
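A minimal sketch of the stale-client check, assuming the server stamps every response with one global version number. The `api_version` key and the `Client` class are hypothetical; in a browser the reload would be something like location.reload() at a moment that doesn't lose user input.

```python
SERVER_API_VERSION = 7  # hypothetical global version, bumped on breaking deploys

def respond(payload):
    """Server side: stamp every response with the current version."""
    return {"api_version": SERVER_API_VERSION, "data": payload}

class Client:
    def __init__(self, built_for: int):
        self.built_for = built_for   # version baked in at build time
        self.needs_reload = False

    def receive(self, response):
        # A mismatch means the server has moved on since this bundle loaded.
        if response["api_version"] != self.built_for:
            self.needs_reload = True
        return response["data"]

fresh = Client(built_for=7)
stale = Client(built_for=6)
fresh.receive(respond("ok"))
stale.receive(respond("ok"))
```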
What does this have to do with SPAs specifically? Any of the usual suspects for versioning APIs (URL paths, headers, etc) should suffice, right? The SPA is just another client.
SPAs as opposed to mostly-static web pages. I think all "rich clients" are implicitly included, although the "refresh" story does change for native ones.
We use API versioning. Works like a charm! If you don't currently have API versioning, just put the breaking changes behind /v2/<endpoint> and start versioning.
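Something like the following, as a framework-free sketch. It assumes existing unversioned paths keep serving the old behavior as v1; the `users` endpoint and both handlers are invented for the example.

```python
def users_v1():
    return {"name": "Ada Lovelace"}

def users_v2():
    return {"first_name": "Ada", "last_name": "Lovelace"}  # breaking change

ROUTES = {("v1", "users"): users_v1, ("v2", "users"): users_v2}

def dispatch(path: str):
    """Return (status, body) for paths like /users, /v1/users, /v2/users."""
    version, _, endpoint = path.strip("/").partition("/")
    if not endpoint:  # unversioned path: keep serving the old behavior
        version, endpoint = "v1", version
    handler = ROUTES.get((version, endpoint))
    if handler is None:
        return 404, None
    return 200, handler()

legacy = dispatch("/users")
current = dispatch("/v2/users")
```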
I guess the hard thing is that you lose versioning granularity at that level. If you made a mistake with your v2 rollout and the fix is breaking, do you bump to v3 or append the correct field? Do you leave the bug in the response still so that users can decide to depend on it? You can’t delete. What would the user think about bumping from v3 to v12 in a really short amount of time as the API is in flux?
Stripe’s date versioning is the best implementation I’ve seen yet. The worst I’ve seen is using custom MIME types. Version number in the URL is naive but intuitive.
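For anyone who hasn't seen date versioning, here is a rough sketch loosely based on the idea in the Stripe post: build the response in its current shape, then undo every change newer than the version the client is pinned to. The dates, field names and functions are all invented for illustration and are not Stripe's actual code.

```python
def undo_name_rename(resp: dict) -> dict:
    """2024-03-01 renamed "full_name" to "name"; restore the old key."""
    out = dict(resp)
    out["full_name"] = out.pop("name")
    return out

def undo_amount_cents(resp: dict) -> dict:
    """2023-06-15 replaced "amount" (dollars) with "amount_cents"."""
    out = dict(resp)
    out["amount"] = out.pop("amount_cents") / 100
    return out

# Newest first, so changes are undone walking backwards in time.
CHANGES = [("2024-03-01", undo_name_rename), ("2023-06-15", undo_amount_cents)]

def render(current: dict, pinned_version: str) -> dict:
    resp = current
    for date, undo in CHANGES:
        if pinned_version < date:  # ISO dates compare correctly as strings
            resp = undo(resp)
    return resp

current = {"name": "Ada Lovelace", "amount_cents": 1250}
newest = render(current, "2024-06-01")
middle = render(current, "2023-12-01")
oldest = render(current, "2023-01-01")
```

The handler code only ever deals with the current shape; older shapes are produced by the chain of undo steps.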
I've never come across your scenario. I would say if you need to fix v2 with a breaking change, then v2 must be broken to begin with. All that being said, going to v3 is not a problem either. I have a few APIs where we went up to v7 after 2 years of operation without issues.
This can be used to trigger a full browser refresh when the application topology has changed dramatically.
Yeah, and if you do dynamic loading of resources (images, templates, etc), I think the window of 404ing is much longer than just the index.html parse time — essentially, the span of a user session...
Edge caching is a solution.
Yeah, I think that's a reasonable solution. It doesn't guarantee 100% accuracy, but it increases the chance that something will load.
This doesn’t magically cure everything, but switching from REST to GraphQL can help make a lot of versioning-type issues disappear.
absolutely true in my experience also