Project

General

Profile

Actions

Bug #19429

open

Somehow detect that scratch space failed to mount

Added by Peter Amstutz almost 2 years ago. Updated 3 months ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
Crunch
Target version:
Story points:
-
Release:
Release relationship:
Auto

Description

A failure mode with cloud compute nodes is that the scratch space can fail to mount due to misconfiguration. When this happens, jobs may mysteriously run out of disk space and it can be very difficult to debug.

Maybe "arvados-server cloudtest" could include a check that an expected amount of disk space is available, or even check on AWS that we have the right IAM permissions.

Actions

Also available in: Atom PDF