Avoid configuration skew between different services and hosts
Background: With multiple back-end service components running on multiple hosts, it is possible to have services running with different configurations. In many cases this happens by accident, and it causes problems for users/clients that are hard to diagnose.
Examples:
- If RailsAPI is not restarted after changing /etc/arvados/config.yml, it will continue using the old config -- except that when Passenger starts new worker threads, they use the new config, so old and new configs end up in use at the same time.
- If the instance types are updated, and controller is restarted but arvados-dispatch-cloud is not restarted, clients will see that the updated types are available, but scheduling decisions will be made based on the old types.
- If a Keep volume changes from read-only to read-write, and controller/RailsAPI are restarted but the relevant keepstore processes are not restarted, clients will waste time trying to write to the volume (which keepstore will refuse to do) before falling back to different volumes/servers.
Goals:
- Automatically detect when a version mismatch exists, and report it to the operator (via logs, health checks, metrics).
- Provide an easy mechanism for updating the configuration cluster-wide and signalling all services to restart or reload their config as needed. This eliminates the most common causes of version mismatches: the operator fails to update the config on all nodes, or incorrectly identifies which services need to be restarted.
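
One possible way to detect a mismatch is to have every service report a fingerprint of the config it actually loaded, and have a health check compare the fingerprints. The following is a minimal sketch under that assumption, not a description of how Arvados implements this. The function names (config_fingerprint, find_config_skew), the "service@host" keys, and the sample config contents are all hypothetical and used only for illustration.

```python
import hashlib
import json
from collections import defaultdict
from typing import Dict, List


def config_fingerprint(config: dict) -> str:
    """Return a SHA-256 hex digest of a config, independent of key order."""
    # Serialize with sorted keys so equal configs always hash the same.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_config_skew(reports: Dict[str, str]) -> Dict[str, List[str]]:
    """Group services by the fingerprint they reported.

    reports maps a service identifier to the fingerprint of the config
    that service is currently running with. Returns an empty dict when
    all services agree, otherwise a map of fingerprint -> sorted services.
    """
    groups = defaultdict(list)
    for service, fingerprint in reports.items():
        groups[fingerprint].append(service)
    if len(groups) <= 1:
        return {}
    return {fp: sorted(services) for fp, services in groups.items()}


# Hypothetical scenario: controller was restarted after the instance
# types changed, but arvados-dispatch-cloud was not.
old_config = {"InstanceTypes": {"small": {"VCPUs": 1}}}
new_config = {"InstanceTypes": {"small": {"VCPUs": 1}, "large": {"VCPUs": 8}}}

reports = {
    "controller@host1": config_fingerprint(new_config),
    "arvados-dispatch-cloud@host2": config_fingerprint(old_config),
}

# Each printed line shows a truncated fingerprint and the services using it.
for fingerprint, services in find_config_skew(reports).items():
    print(fingerprint[:12], services)
```

A non-empty result could be surfaced through logs, a failing health check, or a metric. Hashing the loaded config rather than the file on disk matters here: in the examples above the file is already up to date, and the skew exists only in what each running process has in memory.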