Bug #15164

Updated by Peter Amstutz almost 5 years ago

CWL tests sometimes get stuck and then time out. 

It does not happen consistently. Experimentally, running tests with -j5 seems to increase the odds of hitting it. I've specifically noticed it with the arvbox-based tests running locally and on ci.commonwl.org. I'm not sure whether I have observed it on the dev clusters.

What seems to happen is that there is a container request whose underlying container is completed, but the container request is not finalized. The container request remains in the "Committed" state, and its output and logs are not set. As a result, the workflow runner cannot make progress.
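
For anyone trying to catch this state while it is happening, here is a minimal sketch of the check described above. It works on plain dictionaries rather than calling the API, and it assumes the records carry `uuid`, `state`, and `container_uuid` fields and that a finished container reports the state "Complete". The function name `find_stuck_requests` and the sample UUIDs are made up for illustration.

```python
def find_stuck_requests(container_requests, containers_by_uuid):
    """Return UUIDs of requests still "Committed" although their container is "Complete".

    Both arguments are plain dicts/lists of dicts; the field names are assumptions
    about what the records look like, not a confirmed schema.
    """
    stuck = []
    for request in container_requests:
        container = containers_by_uuid.get(request.get("container_uuid"))
        if container is None:
            # No container assigned yet (or not fetched), so nothing to compare.
            continue
        if request["state"] == "Committed" and container["state"] == "Complete":
            stuck.append(request["uuid"])
    return stuck


# Hypothetical sample records, for illustration only.
sample_containers = {
    "ctr-1": {"uuid": "ctr-1", "state": "Complete"},
    "ctr-2": {"uuid": "ctr-2", "state": "Running"},
}
sample_requests = [
    {"uuid": "req-a", "state": "Committed", "container_uuid": "ctr-1"},  # stuck
    {"uuid": "req-b", "state": "Committed", "container_uuid": "ctr-2"},  # still running
    {"uuid": "req-c", "state": "Final", "container_uuid": "ctr-1"},      # finalized normally
]
print(find_stuck_requests(sample_requests, sample_containers))
```

A request can legitimately sit in this state for a short moment between the container finishing and the request being finalized, so a real check would probably want to flag only requests that stay this way for longer than some grace period.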
