Story #9502

Updated by Brett Smith over 4 years ago

Today the API server holds its expanded permissions graph in the Rails cache. Any request that needs the permissions graph when it isn't in the cache will generate it. Writes that affect the permissions graph invalidate the cached copy.

Move to a model where the server generates the graph at startup, and regenerates it after each write that affects the graph, before returning the result of that write request. Functional requirements:

* The API server always has an expanded permissions graph cached. It generates one at startup. When it handles a write request that changes the permissions graph, before it returns the result of the request, it generates a new permissions graph that atomically replaces the old one.
* When the async_permissions_update setting is true, incoming requests use whatever copy of the permissions graph is currently complete, without waiting for updates. When the setting is false, if the request needs to use the permissions graph while an update is being prepared, the request waits for that update to finish, then uses the new graph. This wait happens at most once per request.
* Only one update should run at a time, and each update should unblock any write requests that made their underlying database update before the graph rebuild started. For illustration, the implementation should allow this timeline of events:
*# API server receives write request A.
*# Write request A updates the database.
*# API server begins updating the permissions graph.
*# Write request B comes in and updates the database.
*# Write request C comes in and updates the database.
*# Permissions graph update finishes.
*# Send the result for write request A.
*# API server begins updating the permissions graph.
*# Permissions graph update finishes.
*# Send the result for write requests B and C.
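
The requirements and timeline above can be sketched roughly as follows. This is a minimal illustration in Python, not the API server's Rails code, and it assumes a single process serving requests on threads. @PermissionsGraphCache@, @graph_for_request@, @write@ and the @build_graph@ callable are all hypothetical names. A deployment with several server processes would need some cross-process coordination that this sketch does not attempt.

```python
import threading


class PermissionsGraphCache:
    """Hypothetical holder for the expanded permissions graph (illustration only)."""

    def __init__(self, build_graph, async_permissions_update=False):
        self._build_graph = build_graph   # assumed: callable that reads the database
        self._async = async_permissions_update
        self._cond = threading.Condition()
        self._write_seq = 0       # writes whose database update has completed
        self._built_seq = 0       # highest write covered by the current graph
        self._rebuilds_ended = 0  # rebuild attempts finished (success or failure)
        self._rebuilding = False
        self._graph = build_graph()       # generate one graph at startup

    def graph_for_request(self):
        """Return the graph a request should use."""
        with self._cond:
            if not self._async and self._rebuilding:
                # Synchronous mode: wait for the in-progress update, once.
                seen = self._rebuilds_ended
                self._cond.wait_for(lambda: self._rebuilds_ended > seen)
            return self._graph

    def write(self, apply_db_update):
        """Apply a graph-affecting write; return only once a graph reflects it."""
        apply_db_update()
        with self._cond:
            # Numbered only after the database update is done, so any rebuild
            # that starts later is guaranteed to see this write.
            self._write_seq += 1
            my_seq = self._write_seq
        while True:
            with self._cond:
                if self._built_seq >= my_seq:
                    return                 # some rebuild already covered us
                if self._rebuilding:
                    self._cond.wait()      # only one update runs at a time
                    continue
                self._rebuilding = True
                target = self._write_seq   # every write this rebuild will cover
            try:
                graph = self._build_graph()   # slow part, done outside the lock
            except BaseException:
                with self._cond:
                    self._rebuilding = False
                    self._rebuilds_ended += 1
                    self._cond.notify_all()
                raise
            with self._cond:
                self._graph = graph        # atomic replacement of the old graph
                self._built_seq = target
                self._rebuilding = False
                self._rebuilds_ended += 1
                self._cond.notify_all()    # unblock covered writers and waiting readers
```

In this sketch, writes B and C arrive while the rebuild for A is running, so they get sequence numbers above that rebuild's target and stay blocked. When it finishes, one of them starts a single follow-up rebuild whose target covers both, which matches the timeline above. The "at most once per request" rule is approximated per call to @graph_for_request@. A real implementation would need to track it per request.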

The primary motivation for this branch is to make #9186 more practical. Right now we know it will break clients that make a permissions change, then make an API request that relies on that change having taken effect. The idea here is that blocking writes on permission updates should make it possible for those clients to continue working unmodified, without blocking large numbers of readers on the graph update.

We believe this change will also provide performance improvements when async permissions updates are not enabled, just by avoiding redundant graph rebuilds, but that's less of a priority.