How do most SPAs handle breaking API schema changes? (twitter.com)

83 points by dadt 28 days ago

36 comments

jrockway 26 days ago

I use grpc-web and protocol buffers.

Renaming fields doesn't matter -- if the client and server disagree about the name of a key, it doesn't matter, because the transport layer uses the tag number, not the name. (Compare this to JSON, where changing the name does break clients.)
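
A minimal sketch of the tag-number point, using a toy encoding rather than real protobuf. The schemas, field names, and helper functions here are hypothetical, made up for illustration:

```python
import json

# Hypothetical schemas: the server renamed tag 1 from "user_name" to "display_name".
CLIENT_SCHEMA = {1: "user_name", 2: "age"}
SERVER_SCHEMA = {1: "display_name", 2: "age"}


def encode_by_tag(message: dict, schema: dict) -> dict:
    """Toy stand-in for a tag-based wire format: keys are tag numbers, not names."""
    name_to_tag = {name: tag for tag, name in schema.items()}
    return {name_to_tag[name]: value for name, value in message.items()}


def decode_by_tag(wire: dict, schema: dict) -> dict:
    """Map tag numbers back to whatever names the receiver's schema uses."""
    return {schema[tag]: value for tag, value in wire.items() if tag in schema}


sent = {"user_name": "ada", "age": 36}

# Tag-keyed transport: the rename is invisible on the wire.
received = decode_by_tag(encode_by_tag(sent, CLIENT_SCHEMA), SERVER_SCHEMA)

# Name-keyed transport (JSON): the server looks for a key the client never sent.
json_received = json.loads(json.dumps(sent))
```

Here `received` carries the value under the server's new name, while `json_received` still only has the old key, which is the breakage the JSON comparison refers to.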

Adding fields also doesn't matter, but this is where it starts to get tricky. If a client sends a request without a field that the server expects (i.e., an old version of the client), the message will parse OK, but the server could still say "hey actually that's required, buh bye". If you do this, you lose the backwards compatibility. So don't do that.
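
One way to avoid the "that's required, buh bye" trap, sketched under assumptions: the `page_size` field, the default of 20, and the helper names are all hypothetical. The idea is only that a missing field falls back to a default instead of being rejected:

```python
from dataclasses import dataclass

DEFAULT_PAGE_SIZE = 20  # assumed server-side default, not from any real API


@dataclass
class SearchRequest:
    query: str
    page_size: int = 0  # hypothetical field added later; 0 means "not set"


def parse_request(payload: dict) -> SearchRequest:
    """Old clients omit page_size; parsing still succeeds with the zero value."""
    return SearchRequest(query=payload["query"], page_size=payload.get("page_size", 0))


def effective_page_size(req: SearchRequest) -> int:
    """Fall back to a default instead of rejecting requests from old clients."""
    return req.page_size if req.page_size > 0 else DEFAULT_PAGE_SIZE


old_client = parse_request({"query": "cats"})
new_client = parse_request({"query": "cats", "page_size": 50})
```

Both requests are served: the old client gets the default page size and the new client gets what it asked for.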

Removing fields is something you can basically never do if you want compatibility. You can rename them to deprecated_whatever, though, and see what code still uses them by the fact that they no longer compile. (Binaries using that field can still exist and will continue to work, of course. But at least you can have a transition period where people writing new code will think "hmm, this is probably going away" and won't depend on that field.)

(There are also some additional mechanical details that the protocol buffer documentation talks about. Maybe an int32 isn't big enough so you want an int64. Old clients can still talk to a server that has changed the type of an int32 field to an int64, but eventually it's going to lose data because it didn't allocate enough storage to manipulate an actual 64-bit value. But it does give you time when you think "a year from now this will be bigger than 2^32". You can change the definition today and eventually update all the clients.)

I think with care and the occasional update of a client, though, you can pretty easily keep things compatible forever. This is wayyyyy easier with SPAs than any other sort of API consumer, because you have control over updating the client.

Often you are adding new features, which is the easiest case. You add a new field or RPC, and just start using it.

You can usually structure a change in semantics as a new feature, which makes the cases for which protobuffers excel even more common in practice. For example, say you have clients that depend on the ordering of results from a Lookup() call. You think that that sort is unnecessary and slow, and you want to change the semantics without breaking clients that depend on the ordering. You can just add a new RPC, FastLookup(), and start using that. Clients that use the new RPC will be faster, but old clients will continue to work using the old method. You can update all those (check your monitoring, you probably have a grpc_server_handled_total metric for every method), and after everything's updated, you can safely remove the code that implements Lookup() (either delete it entirely, or to really do it right, return codes.Unimplemented).
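
The Lookup()/FastLookup() migration might look roughly like this. It is a toy dispatcher, not gRPC: the `Counter` stands in for a per-method metric, and `Unimplemented`, `RETIRED`, and `safe_to_retire` are hypothetical names for illustration:

```python
from collections import Counter

calls_by_method = Counter()  # stand-in for a per-method request metric
RETIRED = set()              # methods whose implementation has been removed


class Unimplemented(Exception):
    """Toy equivalent of an 'unimplemented' status returned to the caller."""


def lookup(items):
    return sorted(items)  # old semantics: results are ordered


def fast_lookup(items):
    return list(items)    # new semantics: no ordering guarantee


HANDLERS = {"Lookup": lookup, "FastLookup": fast_lookup}


def handle(method, items):
    """Count every call, then either serve it or report the method as retired."""
    calls_by_method[method] += 1
    if method in RETIRED:
        raise Unimplemented(method)
    return HANDLERS[method](items)


def safe_to_retire(method, recent_calls):
    """Only retire a method once monitoring shows nobody calls it anymore."""
    return recent_calls[method] == 0
```

Old clients keep calling `handle("Lookup", ...)` and get sorted results. Once `safe_to_retire("Lookup", ...)` holds for a recent window of traffic, adding `"Lookup"` to `RETIRED` makes any straggler fail loudly instead of silently.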

I think if you aim for incremental progress, it's pretty easy to achieve with the right tools. It's harder, but possible, even for public APIs where you can't update the client. But where you can update the client and all you have to worry about is browser cache? Easiest possible case ;)

    tdhoot 26 days ago

    You can also remove a field by reserving the tag number and optionally the name. New binaries will fail to compile but previously persisted values can still be read back by old binaries. Also ensures no one uses the same tag number later and reads bad data.

    H8crilA 26 days ago

    Indeed, this 30-plus-year-old problem is easily fixable by following the guidelines for protocol buffers. Or any other similar technology of this sort, like ASN.1 if you like antiques. Protos are just quite popular and therefore a safe commodity bet.

jchw 28 days ago

Ideally, version your APIs, though for some internal APIs you definitely want to be able to make changes without bumping versions constantly.

Be backwards and forwards compatible as much as possible. Rolling deployments and other factors virtually guarantee mismatched versions in both directions. Many things, like adding new fields or removing old ones, can be staged in such a way that nothing breaks. Protobuf/gRPC offers some form of backwards and forwards compatibility (at the data model layer, of course) if you adhere to certain basic invariants.

Almost any change can be staged over time; how long you want to keep compatibility between versions is up to you.

Oh, probably most important: keep clear data model separations wherever you can. Between storage and API, and API and in-memory state management. It’s a lot of work, but it pays dividends.

    dkarl 26 days ago

    Versioning your APIs gives you a built-in metric for how often you make breaking changes. Not everyone likes seeing this, but if you can enforce it, I think it creates a psychological incentive to exercise discipline about API changes.

    swalsh 26 days ago

    Not sure if this is standard, but what I usually do is bump the version only on a breaking change. So if I add a field, nothing new happens. If I remove a field or fundamentally change the output, then I bump it up.

      ellyagg 26 days ago

      Would you care how often you make breaking changes on your API if you control the client and it's your only consumer?

        chris11 26 days ago

        Yes, especially if "you" is a company or engineering department. It can be difficult trying to figure out how to best deprecate a micro-service endpoint if you aren't quite sure what other services are using it and if logging is lacking.

    draw_down 26 days ago

    The question is about web apps; Protobuf/gRPC don't apply. Your comment is speaking in generalities, but the question is about what you actually do. Of course you want to preserve backward/forward compat as much as possible, all else being equal.

Kuraj 27 days ago

> A number of responses say "have the client detect that it's running an old version and force reload." That works, but it's pretty annoying UX.

Is it though? Of all reasons to force refresh, this sounds reasonable to me as it won't even happen that often.

Heck, I would want to refresh if I knew a new version was available.

    michaelt 26 days ago

    > Is it though?

    Depends if you're triggering it six times a year or six times a day. Some companies pride themselves on how often they deploy code to production [1].

    And whether your SPA is something like Google Docs where a user could plausibly have the same document open for a week or more.

    [1] https://blog.newrelic.com/technology/data-culture-survey-res...

      ollyculverhouse 26 days ago

      Is it common to be releasing breaking API schema changes 6 times a day?

        Andrex 26 days ago

        Intentionally? Probably not.

        SilasX 26 days ago

        No, but it's common to underestimate the cost imposed on clients and how avoidable a breaking change is.

      ihuman 26 days ago

      I'm not sure if that's a good example. Google Docs automatically saves the document when there's a change, so you won't lose any work if you refresh.

        chipperyman573 26 days ago

        I may be misremembering but I actually remember GDocs telling me to reload once, so presumably they have some way of forcing reloading if they need it (I think it let me keep working locally until I reloaded, then all my stuff would be saved and synced like normal).

      tuananh 26 days ago

      make it silent? like chrome update?

      silent download and auto install on next startup

    chrisabrams 27 days ago

    Even Facebook does this when you have left it open in a tab for too long.

    [edit]: spellcheck failed.

      itslennysfault 26 days ago

      > Even Facebook

      Like they're the example we should look to. They do all sorts of hacky stuff.

        duxup 26 days ago

        They do lots of stuff, but they don't seem to break their users' sessions due to reloading if you leave a window open forever.

boronine 27 days ago

When you need to make a backward incompatible change, do it in three steps:

1. Add new stuff to API in a backwards compatible manner (without removing or changing old stuff). When you need to add fields it is usually safe to do it within existing functions. When you need to modify fields, I recommend simply copying and pasting your API function and exposing the new one with a number suffix, e.g. get_messages_2

2. Update the client code to use new versions of functions and to stop using old versions.

3. Once the new client is deployed, wait a while longer and then remove old versions of functions like get_messages_1

You can also use a single global version which will allow your client to detect when it has gone stale and reload.
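
A rough sketch of the steps above plus the global-version check. The response shapes, `API_VERSION`, and the `respond` helper are assumptions for illustration, not a specific framework's API:

```python
API_VERSION = 7  # hypothetical global version, bumped on every incompatible release


def get_messages():
    """Original shape: a bare list of strings."""
    return ["hi", "bye"]


def get_messages_2():
    """New shape, exposed next to the old one until old clients are gone."""
    return {"messages": [{"text": "hi"}, {"text": "bye"}], "next_cursor": None}


def respond(handler, client_version):
    """Attach the server version so a client can tell when it has gone stale."""
    return {
        "data": handler(),
        "api_version": API_VERSION,
        "client_is_stale": client_version < API_VERSION,
    }


old = respond(get_messages, client_version=6)
new = respond(get_messages_2, client_version=7)
```

The old client still gets a response it understands, and the `client_is_stale` flag gives it a cue to reload at a convenient moment rather than breaking mid-session.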

bradstewart 26 days ago

What does this have to do with SPAs specifically? Any of the usual suspects for versioning APIs (URL paths, headers, etc) should suffice, right? The SPA is just another client.

    brundolf 26 days ago

    SPAs as opposed to mostly-static web pages. I think all "rich clients" are implicitly included, although the "refresh" story does change for native ones.

beiller 26 days ago

We use API versioning. Works like a charm! If you don't currently have API versioning, just put the breaking changes behind /v2/<endpoint> and start versioning.
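
For anyone who hasn't done this, a minimal sketch of the idea. The `user` endpoint, the handlers, and the rule that unprefixed paths mean v1 are all assumptions for illustration:

```python
def get_user_v1(user_id):
    return {"id": user_id, "name": "Ada Lovelace"}


def get_user_v2(user_id):
    # Breaking change: "name" is split into two fields.
    return {"id": user_id, "first_name": "Ada", "last_name": "Lovelace"}


ROUTES = {
    ("v1", "user"): get_user_v1,
    ("v2", "user"): get_user_v2,
}


def dispatch(path, user_id):
    """Route '/v2/user' style paths; unprefixed paths are treated as v1."""
    parts = path.strip("/").split("/")
    if len(parts) == 1:
        parts = ["v1"] + parts
    version, endpoint = parts
    return ROUTES[(version, endpoint)](user_id)
```

Existing clients that call `/user` keep getting the old shape, while updated clients opt in by calling `/v2/user`.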

    ljm 26 days ago

    I guess the hard thing is that you lose versioning granularity at that level. If you made a mistake with your v2 rollout and the fix is breaking, do you bump to v3 or append the correct field? Do you leave the bug in the response so that users can decide to depend on it? You can’t delete. What would the user think about bumping from v3 to v12 in a really short amount of time while the API is in flux?

    Stripe’s date versioning is the best implementation I’ve seen yet. The worst I’ve seen is using custom MIME types. Version number in the URL is naive but intuitive.

      beiller 18 days ago

      I've never come across your scenario. I would say if you need to fix v2 with a breaking change, then v2 must be broken to begin with. All that being said, going to v3 is not a problem either. I have a few APIs where we went up to v7 after 2 years of operation without issues.

033803throwaway 26 days ago

I don't know if most SPA frameworks offer this sort of functionality, but with intercooler.js you can send response headers to trigger client-side events (or, if you are feeling hacky, to evaluate raw javascript.)

This can be used to trigger a full browser refresh when the application topology has changed dramatically.

aaronharnly 26 days ago

How do you even handle SPA updates themselves? Apps may reference static resources (JavaScript, CSS, images) which get updated with a new version. Those resources could be clobbered by a deploy, or be uglified, or be versioned. Do you keep the old static resources in place forever, or for a period of time? Or force a reload? Or do you just let clients break and make users reload?

    jrockway 26 days ago

    Yeah, the tooling around this is pretty bad. For example, you might release a container that has index.html and main.abc1234.js. index.html has to be parsed before the JS bundle is requested. If you do a release in that intermediate time, the javascript will 404 and your site won't load because in the new container, the only bundle that exists is main.def2345.js.

    I think people additionally assume this never happens, because their error reporting code lives in the javascript bundle and they never get error reports ;)

    The correct solution is probably to keep a few old versions of the Javascript bundle around, so that in-flight requests succeed even as you update the container hosting the app. I do not know of a tool that does this, but the edge case I describe above worries me, so I might write one someday.

      aaronharnly 26 days ago

      Yeah, and if you do dynamic loading of resources (images, templates, etc), I think the window of 404ing is much longer than just the index.html parse time — essentially, the span of a user session...

      moltar 26 days ago

      Edge caching is a solution.

        jrockway 25 days ago

        Yeah, I think that's a reasonable solution. It doesn't guarantee 100% accuracy, but it increases the chance that something will load.

projectileboy 26 days ago

This doesn’t magically cure everything, but switching from REST to GraphQL can help make a lot of versioning-type issues disappear.

    undoware 26 days ago

    absolutely true in my experience also