Feature #18768

Updated by Peter Amstutz about 2 years ago

Create an implementation plan to automatically detect when active services are running with mismatched versions of the config, and to report this to the operator (via logs, health checks, and metrics).

 * During operation, each service checks for config changes. If the config file changes, the service loads it and validates it, then adds a health check warning that the config file on disk does not match the config in memory. If the config file fails validation (which means the service would fail if restarted), the service should report that as well.
 * A Prometheus metric reports 0 or 1 to indicate whether the config on disk matches the config in memory.
 * The health check reports the md5sum and timestamp of the config file on disk.
 ** The health check aggregator can check whether the sums reported by different services match.
 * Add a command line tool to arvados-client that uses the same logic as the health check aggregator to report the health check results of all the services.
 * The public config published by controller should include a timestamp for the config's last modified time.
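
The checks listed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the Arvados implementation: @ConfigStatus@, @checkConfig@, @metricValue@, and @groupBySum@ are hypothetical names, the validator is passed in as a plain function, and the metric is returned as an integer rather than registered with a Prometheus client.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"os"
	"time"
)

// ConfigStatus is a hypothetical health check payload for one service.
type ConfigStatus struct {
	Service       string
	MD5           string    // md5sum of the config file on disk
	ModTime       time.Time // last modified time of the config file on disk
	MatchesMemory bool      // on-disk checksum equals the checksum of the loaded config
	ValidOnDisk   bool      // on-disk config passed validation
}

// sumBytes returns the hex-encoded md5sum of data.
func sumBytes(data []byte) string {
	s := md5.Sum(data)
	return hex.EncodeToString(s[:])
}

// checkConfig reads the config file at path, compares its checksum with
// loadedSum (the checksum of the config held in memory), and runs validate
// on the file contents.
func checkConfig(service, path, loadedSum string, validate func([]byte) error) (ConfigStatus, error) {
	info, err := os.Stat(path)
	if err != nil {
		return ConfigStatus{}, err
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return ConfigStatus{}, err
	}
	sum := sumBytes(data)
	return ConfigStatus{
		Service:       service,
		MD5:           sum,
		ModTime:       info.ModTime(),
		MatchesMemory: sum == loadedSum,
		ValidOnDisk:   validate(data) == nil,
	}, nil
}

// metricValue maps a status to the 0-or-1 value a metric would report.
func metricValue(st ConfigStatus) int {
	if st.MatchesMemory {
		return 1
	}
	return 0
}

// groupBySum groups service names by reported checksum. More than one
// group means the services do not all see the same config file.
func groupBySum(statuses []ConfigStatus) map[string][]string {
	groups := make(map[string][]string)
	for _, st := range statuses {
		groups[st.MD5] = append(groups[st.MD5], st.Service)
	}
	return groups
}

func main() {
	content := []byte("Clusters: {}\n") // placeholder config content
	f, err := os.CreateTemp("", "config-*.yml")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	if _, err := f.Write(content); err != nil {
		panic(err)
	}
	f.Close()

	acceptAll := func([]byte) error { return nil } // stand-in validator
	st, err := checkConfig("controller", f.Name(), sumBytes(content), acceptAll)
	if err != nil {
		panic(err)
	}
	fmt.Println("metric:", metricValue(st), "distinct sums:", len(groupBySum([]ConfigStatus{st})))
}
```

In this sketch an aggregator, or a command line tool sharing its logic, would collect one @ConfigStatus@ per service and warn whenever @groupBySum@ returns more than one group.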

 This phase of implementation is for reporting/detecting config changes only, not responding to them. 