Bug #9924

Updated by Peter Amstutz over 6 years ago

If Slurm decides that a node has failed, it may revoke crunch-job's allocation. crunch-job may detect this as a "tempfail", but crunch-job itself cannot recover from it. In that case, crunch-dispatch should restart the job with a new allocation.
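
The requested behavior could be sketched roughly as follows. This is a minimal illustration, not the actual crunch-dispatch code: the function name `dispatch_with_retry`, the `run_in_new_allocation` callable, the retry limit, and the use of exit code 75 (the conventional sysexits.h EX_TEMPFAIL value) to signal a tempfail are all assumptions made for the example.

```python
from typing import Callable

# Assumption: the job runner reports a temporary failure by exiting with
# the conventional sysexits.h EX_TEMPFAIL code. The real crunch-job
# signalling may differ.
EX_TEMPFAIL = 75


def dispatch_with_retry(run_in_new_allocation: Callable[[], int],
                        max_attempts: int = 3) -> int:
    """Run a job, requesting a fresh allocation after each tempfail.

    `run_in_new_allocation` is a hypothetical callable that obtains a new
    allocation, runs the job inside it, and returns the job's exit code.
    Returns the last exit code observed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    exit_code = EX_TEMPFAIL
    for _ in range(max_attempts):
        exit_code = run_in_new_allocation()
        if exit_code != EX_TEMPFAIL:
            # Success or a permanent failure: do not retry.
            return exit_code
        # Tempfail: the old allocation is assumed to be unusable, so the
        # next loop iteration starts over with a new one.
    return exit_code
```

The retry decision lives in the dispatcher rather than in the job runner, since a process whose allocation has been revoked cannot obtain a new one for itself. In practice the callable might wrap a subprocess call that requests the allocation and starts the job, and the attempt limit keeps a persistently failing node from causing an endless restart loop.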
