Feature #4598

Updated by Tom Clegg over 9 years ago

For a given time interval:
 * Get a list of jobs created in this interval 
 * Report #succeeded, #failed, #unfinished 
 * For each failed job, examine the log file to classify failure 
 ** Find first permanent task failure 
 ** Find last few log messages from that task 
 ** Match those messages against a list of telltale regexps like @/Cannot destroy container/@ and assign a failure code like @"sys/docker"@ (the list can be very short at first; we'll refine it over time) 
 * Report number of jobs for each failure code 

 We'll use (and refine) this by picking the largest class(es) of failure codes (sometimes including "unknown"), modifying the regexp list to get more helpful/specific error codes, fixing bugs, improving docs, etc. 
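The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the log line shape (@<task id> <message>@), the @permanent failure@ marker, the job state names, and the function names @classify_failure@ and @summarize@ are all hypothetical. Only the @/Cannot destroy container/@ pattern and the @"sys/docker"@ code come from the description.

```python
import re
from collections import Counter

# Telltale patterns mapped to failure codes. Only this first pair comes from
# the description; the list is expected to grow as failures are investigated.
FAILURE_PATTERNS = [
    (re.compile(r"Cannot destroy container"), "sys/docker"),
]

# ASSUMPTION: each log line looks like "<task id> <message>", and a permanent
# task failure is announced by a message containing this hypothetical marker.
PERMANENT_FAILURE = re.compile(r"permanent failure")


def classify_failure(log_lines, tail=5):
    """Return a failure code for one failed job's log, or "unknown"."""
    # Split each line into (task id, message); skip lines without a space.
    parsed = [line.split(" ", 1) for line in log_lines if " " in line]

    # Find the first task that failed permanently.
    failed_task = None
    for task, message in parsed:
        if PERMANENT_FAILURE.search(message):
            failed_task = task
            break
    if failed_task is None:
        return "unknown"

    # Keep the last few log messages from that task.
    messages = [m for task, m in parsed if task == failed_task][-tail:]

    # The first matching pattern decides the failure code.
    for pattern, code in FAILURE_PATTERNS:
        if any(pattern.search(m) for m in messages):
            return code
    return "unknown"


def summarize(jobs):
    """Count jobs per state and failed jobs per failure code.

    `jobs` is an iterable of (state, log_lines) pairs, where state is one of
    "succeeded", "failed", or "unfinished" (names assumed for this sketch).
    """
    states = Counter()
    codes = Counter()
    for state, log_lines in jobs:
        states[state] += 1
        if state == "failed":
            codes[classify_failure(log_lines)] += 1
    return states, codes
```

Keeping the patterns in an ordered list means more specific regexps can be placed ahead of general ones as the list is refined, and anything that matches nothing falls into "unknown", which is itself a class worth tracking.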
