Bug #4044

Updated by Ward Vandewege almost 6 years ago

The first task in job 9tee4-8i9sb-jykwrdj3quy00t5 creates a large number of new tasks. It then exits successfully (no Docker container is left running, and the task is marked as successful in the API server), but somehow the job does not notice and its record is not updated. The job then sits around waiting for the first task to finish.

The job record is stuck with 1 running, 0 todo.

All the job tasks are created properly though. The first one, as an example, is 9tee4-ot0gb-vmlrb41ai6iswlr.

I verified that while the job is stuck, it is not trying to read data from Keep.

Note: this particular job has a malformed input collection (the files are in a flat hierarchy, rather than split up per directory). It will never complete successfully. So the bug is that Crunch gets into this weird state with a stuck job; it should fail explicitly if it can't run the tasks.
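
To make the stuck state concrete, here is a minimal sketch of how the mismatch could be detected by recomputing the job's counters from its task records. This is not Crunch's actual code and does not use the Arvados API: `TaskRecord`, `recompute_counts`, `counts_are_stale`, and the `task-N` identifiers are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Hypothetical stand-in for a job task record (not the real schema)."""
    uuid: str
    finished: bool
    success: bool


def recompute_counts(tasks, started):
    """Derive job counters from the task records themselves.

    `started` is the set of task uuids the dispatcher has launched.
    """
    counts = {"running": 0, "todo": 0, "done": 0, "failed": 0}
    for task in tasks:
        if task.finished:
            counts["done" if task.success else "failed"] += 1
        elif task.uuid in started:
            counts["running"] += 1
        else:
            counts["todo"] += 1
    return counts


def counts_are_stale(job_counts, tasks, started):
    """True when the job record disagrees with its task records."""
    return job_counts != recompute_counts(tasks, started)


# Scenario from this report: the first task finished successfully and
# queued new tasks, but the job record still says "1 running, 0 todo".
tasks = [TaskRecord("task-0", finished=True, success=True)]
tasks += [TaskRecord(f"task-{i}", finished=False, success=False) for i in range(1, 4)]
job_counts = {"running": 1, "todo": 0, "done": 0, "failed": 0}

fresh = recompute_counts(tasks, started={"task-0"})
stale = counts_are_stale(job_counts, tasks, started={"task-0"})
```

In this scenario the recomputed counters show no running task and several queued ones, so the job record is stale. A check along these lines would give the dispatcher a point at which to either schedule the queued tasks or fail the job explicitly instead of waiting indefinitely.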

This happened again on job 9tee4-8i9sb-50r7da1j0ks4db1