Feature #20426
Updated by Peter Amstutz over 1 year ago
The idea is:
* have monitoring running on a node that is unlikely to be affected by Arvados issues
* run the health check aggregator
* have Prometheus poll the health checks periodically
* configure Alertmanager to send out an email if the health check fails
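
As a rough illustration of the check the monitoring node would perform, here is a minimal Python sketch. It is not the Prometheus/Alertmanager setup itself, only the decision logic behind "alert if the health check fails". The response shape (a top-level `health` field plus a `checks` map whose entries each carry their own `health` field) is an assumption about the aggregator's JSON output. The function names, the URL passed to `fetch_health`, and the token are hypothetical.

```python
import json
import urllib.request


def fetch_health(url, token, timeout=10.0):
    """Fetch and parse the aggregated health report.

    Assumption: the aggregator is reachable at `url` and accepts a
    bearer token; both values are supplied by the caller.
    """
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def failing_checks(report):
    """Return the sorted names of individual checks that are not "OK".

    Assumption: `report["checks"]` maps a check name to a dict with a
    "health" field.
    """
    checks = report.get("checks", {})
    return sorted(
        name for name, check in checks.items()
        if check.get("health") != "OK"
    )


def should_alert(report):
    """Decide whether an alert email is warranted.

    A missing or non-"OK" top-level status counts as a failure, so a
    malformed or empty report also triggers an alert.
    """
    return report.get("health") != "OK" or bool(failing_checks(report))
```

Running this from a node outside the cluster matches the first bullet: if the cluster is down, `fetch_health` raises an exception, which the caller should also treat as a failure worth alerting on.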