Feature #8018

Updated by Peter Amstutz about 8 years ago

I think we want at least three failure modes: 

 *Error* (there was an infrastructure error; the job should always be retried) 

 *Invalid* (something in the container record is invalid; most of the time API server validation should prevent this, but if not, some other component can mark the container as impossible to fulfill) 

 *Lost* (we've lost track of the container; it may still be running somewhere and will complete, but we should go ahead and retry it just in case) (this probably requires a heartbeat from containers) 
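
To make the three modes concrete, here is a rough sketch in Python, not a proposal for the actual implementation. The names `FailureMode`, `should_retry`, `is_lost`, and the 60-second timeout are all hypothetical. Treating *Invalid* as "never retry" is my reading of "impossible to fulfill", and detecting *Lost* by a stale heartbeat timestamp is only one way the heartbeat idea could work.

```python
from enum import Enum


class FailureMode(Enum):
    """The three failure modes proposed above (names are illustrative)."""
    ERROR = "Error"      # infrastructure error: always retry
    INVALID = "Invalid"  # container record cannot be fulfilled
    LOST = "Lost"        # lost track of the container


# Hypothetical threshold; no value is specified in the discussion.
HEARTBEAT_TIMEOUT_SECONDS = 60.0


def should_retry(mode: FailureMode) -> bool:
    """Retry on Error and Lost; assume Invalid can never succeed."""
    return mode in (FailureMode.ERROR, FailureMode.LOST)


def is_lost(last_heartbeat: float, now: float,
            timeout: float = HEARTBEAT_TIMEOUT_SECONDS) -> bool:
    """Consider a container lost once its last heartbeat is older than timeout.

    Both timestamps are seconds on the same clock (e.g. time.time()).
    """
    return now - last_heartbeat > timeout
```

With this shape, a dispatcher would check `is_lost(...)` periodically for running containers, mark stale ones as `FailureMode.LOST`, and then use `should_retry(...)` to decide whether to requeue.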
