Feature #20426

Updated by Peter Amstutz 12 months ago

The idea is: 

 * have monitoring running on a node that is unlikely to be affected by Arvados issues 
 * run the health check aggregator 
 * have Prometheus check the health checks periodically 
 * configure Alertmanager to send out an email if the health check fails 
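
As a rough illustration of the "Prometheus checks the health checks" step, here is a minimal Python sketch of a probe that turns an aggregated health response into a gauge Prometheus could scrape. Everything specific in it is an assumption rather than something stated in this ticket: the URL passed to probe(), the bearer-token authentication, the JSON shape with a top-level "health" field equal to "OK", and the metric name arvados_health_ok are all hypothetical and would need to be checked against the actual health check aggregator.

```python
import json
import urllib.request


def health_to_gauge(body: str) -> int:
    """Return 1 if the response body reports overall health OK, else 0.

    Assumes (hypothetically) a JSON object with a top-level "health" field.
    Anything unparseable or unexpected counts as unhealthy.
    """
    try:
        doc = json.loads(body)
    except ValueError:
        return 0
    return 1 if isinstance(doc, dict) and doc.get("health") == "OK" else 0


def render_metric(value: int, name: str = "arvados_health_ok") -> str:
    """Render one gauge in the Prometheus text exposition format."""
    return (
        f"# HELP {name} 1 if the aggregated health check reports OK, else 0\n"
        f"# TYPE {name} gauge\n"
        f"{name} {value}\n"
    )


def probe(url: str, token: str, timeout: float = 10.0) -> int:
    """Fetch the aggregated health URL and reduce it to 0 or 1.

    The URL and the bearer-token header are assumptions for this sketch.
    Network and HTTP errors are treated as a failed health check.
    """
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return health_to_gauge(resp.read().decode("utf-8"))
    except OSError:
        return 0
```

Serving render_metric(probe(url, token)) over HTTP from the monitoring node would give Prometheus a target to scrape periodically, and an alerting rule on the gauge being 0 for some duration could then be routed to email by Alertmanager. Treating fetch errors as 0 is a deliberate choice here, so that an unreachable cluster also raises the alert.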
