


Feature #17185

Updated by Ward Vandewege over 3 years ago

 Add a broken-node metric 

 (counter) VMs that are determined to be "broken nodes" 

 Add a label to separate VMs marked as broken before the first container is started on them (likely a boot problem) from VMs marked as broken afterwards (likely a container-related problem). 

 Note that we already have a boot outcome metric. Make sure that we increment the broken node counter (with the "before first container" label) when we have a boot outcome == failed, but not in the timeout case.
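
 A minimal sketch of the intended counting rules, in Python for illustration only. This is not the dispatcher's actual code: the class, method, and label names are hypothetical, and the boot outcome strings "failed" and "timeout" are assumed spellings of the outcomes mentioned above.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical label values; the issue does not specify their spelling.
BEFORE_FIRST_CONTAINER = "before_first_container"
AFTER_FIRST_CONTAINER = "after_first_container"


@dataclass
class VM:
    """Hypothetical stand-in for a VM tracked by the dispatcher."""
    name: str
    containers_started: int = 0


class BrokenNodeMetric:
    """Counter of VMs determined to be broken nodes, split by one label."""

    def __init__(self):
        self.counts = Counter()

    def mark_broken(self, vm):
        # Pick the label from whether any container ever started on the VM.
        if vm.containers_started > 0:
            label = AFTER_FIRST_CONTAINER
        else:
            label = BEFORE_FIRST_CONTAINER
        self.counts[label] += 1

    def record_boot_outcome(self, outcome):
        # A failed boot counts as a broken node before the first container.
        # A boot timeout (and any other outcome) does not touch the counter.
        if outcome == "failed":
            self.counts[BEFORE_FIRST_CONTAINER] += 1


metric = BrokenNodeMetric()
metric.record_boot_outcome("failed")    # counted
metric.record_boot_outcome("timeout")   # not counted
metric.mark_broken(VM("vm-1", containers_started=2))
```

 In a real implementation this would presumably be a labeled counter registered alongside the existing boot outcome metric, so that both are updated from the same code path.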
