Feature #4579

open

[Documentation] Run-command docs should remind user how & why to exit non-zero on failure.

Added by Bryan Cosca over 9 years ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assigned To: -
Category: Documentation
Target version:
Story points: 0.5
Release:
Release relationship: Auto

Description

Some jobs encounter errors that should be fatal, yet still report success.

For example, qr1hi-8i9sb-mtxaffgfw6athnp:

grep: //c09a19ea17f72c8da97f8cb64a9b333b+743: No such file or directory

or qr1hi-8i9sb-gn0jmhwp88j3a8z:

ls: cannot access /keep//keep/c09a19ea17f72c8da97f8cb64a9b333b+743/*.vcf: No such file or directory

These jobs should report failure.


Related issues

Related to Arvados - Idea #3044: [Documentation] Improve documentation for authoring crunch scripts (Closed)
#1

Updated by Tim Pierce over 9 years ago

  • Subject changed from Crunch is able to detect unique errors within scripts? to [Crunch] failed jobs are incorrectly reported as succeeding
  • Description updated (diff)
  • Category set to Crunch
#2

Updated by Tim Pierce over 9 years ago

  • Target version set to Bug Triage
#3

Updated by Tom Clegg over 9 years ago

  • Tracker changed from Feature to Bug

If you use run-command, the only way to indicate success/failure is exit status. In both of these cases it looks like the script exits 0, run-command sets success=true on the task, and Crunch sets state=Complete. Crunch's part of this looks correct.

The script itself, however, incorrectly exits 0 after encountering errors. Fixing this could be as simple (or not simple) as using "set -e" and "set -o pipefail" in all the right places.
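
As a rough illustration of the difference those two options make, here is a minimal sketch, assuming bash. It is not taken from the jobs above: /no/such/dir is a hypothetical missing path standing in for the bad Keep paths in the logs, and the pipeline is only a stand-in for whatever the real script does.

```shell
#!/bin/bash
# Illustrative only: /no/such/dir is a hypothetical missing path.

# Default shell behavior: `ls` fails, but the pipeline's status is that of
# `wc` (the last command), so the script runs to the end and exits 0.
lenient_status=0
bash -c '
  ls /no/such/dir/*.vcf 2>/dev/null | wc -l >/dev/null
  echo "finished" >/dev/null
' || lenient_status=$?

# With errexit and pipefail: the failing `ls` makes the whole pipeline fail,
# and `set -e` stops the script there with a non-zero exit status.
strict_status=0
bash -c '
  set -e
  set -o pipefail
  ls /no/such/dir/*.vcf 2>/dev/null | wc -l >/dev/null
  echo "finished" >/dev/null
' || strict_status=$?

echo "without set -e / pipefail: exit status $lenient_status"
echo "with set -e / pipefail: exit status $strict_status"
```

The first status should come out as 0 and the second as non-zero (the exact value depends on the local `ls`). Only the second variant gives run-command an exit status it can treat as a failure.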

Aside 1: The run-command documentation could certainly be more forthcoming with advice about how to write scripts for it to use. (Currently exit codes are only mentioned in the context of the "ignore exit code" feature, which incidentally should probably be adjusted to explain what a terrible, terrible idea it is to use that feature.)

Aside 2: When you're at the point of giving run-command a shell script which in turn builds and runs another shell script, you're doing it wrong. At some point our docs failed you, by steering you toward using run-command for this instead of writing a Python program that calls one_task_per_input_file...

#4

Updated by Tom Clegg over 9 years ago

  • Status changed from New to Feedback
#5

Updated by Tom Clegg over 9 years ago

  • Subject changed from [Crunch] failed jobs are incorrectly reported as succeeding to [Documentation] Run-command docs should remind user how & why to exit non-zero on failure.
  • Category changed from Crunch to Documentation
#6

Updated by Tom Clegg over 9 years ago

  • Tracker changed from Bug to Feature
  • Status changed from Feedback to New
  • Story points set to 0.5
#7

Updated by Tom Clegg over 9 years ago

  • Target version changed from Bug Triage to Arvados Future Sprints
#8

Updated by Ward Vandewege almost 3 years ago

  • Target version deleted (Arvados Future Sprints)
#9

Updated by Peter Amstutz about 1 year ago

  • Release set to 60
#10

Updated by Peter Amstutz about 2 months ago

  • Target version set to Future