Crunch runner » History » Version 5

Tom Clegg, 12/23/2015 09:54 PM

h1. Crunch runner

Note: This is a Crunch1 artifact, not to be confused with [[Crunch2 run]] and the [[Containers API]].

Crunch runner is a Go program designed to be injected into a Docker container, where it bootstraps some other command line program, uploads the results, and reports task success or failure to the API server.  It is similar to the Python crunch script run-command, but because it is a compiled Go binary it has a much lighter footprint than run-command (which requires the Arvados Python SDK and all its dependencies), so it can run in a wider variety of container environments.

Example job invocation:

<pre>
{
  "script": "crunchrunner",
  "script_parameters": {
    "tasks": [
      {
        "command": ["cat", "$(task.keep)/d3b07384d113edec49eaa6238ad5ff00+4/input1.txt", "-", "input3.txt"],
        "task.stdin": "$(task.keep)/d3b07384d113edec49eaa6238ad5ff00+4/input2.txt",
        "task.stdout": "output.txt",
        "task.env": {
          "BARFOO": "foobar"
        },
        "task.vwd": {
          "input3.txt": "$(task.keep)/d3b07384d113edec49eaa6238ad5ff00+4/input99.txt"
        },
        "task.successCodes": [0],
        "task.temporaryFailCodes": [1, 2],
        "task.permanentFailCodes": [3]
      }
    ]
  }
}
</pre>

If "tasks" contains a single task, it runs in task 0.  If there are multiple tasks, they are scheduled as job_tasks.

* "command" is the command line to execute.
* "task.stdin" is the path to a file that will be attached to standard input.
* "task.stdout" is the path to a file that will be attached to standard output.  It must be a relative path within the output directory.  Subdirectories are permitted.
* "task.env" sets environment variables for the command.
* "task.vwd" lists files to be symbolically linked into the output directory.  In the example above, "input3.txt" in the output directory links to input99.txt in Keep.
* "task.successCodes" is a list of exit codes that are considered success.
* "task.temporaryFailCodes" is a list of exit codes that are considered temporary failures (the task can be retried).
* "task.permanentFailCodes" is a list of exit codes that are considered permanent failures (the task cannot be retried).

Everything except "command" is optional.

The initial working directory of the command is the output directory.  The umask is set to 0022.
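
As a rough illustration of how the task fields fit together, the following Python sketch runs a single task. It is not crunchrunner's Go code. The names @run_task@ and @outdir@ are hypothetical. The sketch assumes a POSIX system with Python 3.9 or later, that "task.env" entries are added to the inherited environment rather than replacing it, that missing subdirectories for "task.stdout" are created on demand, and that substitution parameters have already been expanded.

```python
import os
import subprocess


def run_task(task, outdir):
    """Run one task dict in outdir and return the command's exit code."""
    # "task.vwd": symbolically link each named file into the output directory.
    for name, target in task.get("task.vwd", {}).items():
        os.symlink(target, os.path.join(outdir, name))

    # "task.env": assumed here to extend the inherited environment.
    env = dict(os.environ)
    env.update(task.get("task.env", {}))

    stdin = stdout = None
    try:
        if "task.stdin" in task:
            stdin = open(task["task.stdin"], "rb")
        if "task.stdout" in task:
            # Relative to the output directory; subdirectories are permitted.
            # Creating them on demand is an assumption of this sketch.
            path = os.path.join(outdir, task["task.stdout"])
            os.makedirs(os.path.dirname(path), exist_ok=True)
            stdout = open(path, "wb")
        # The command starts in the output directory with umask 0022.
        proc = subprocess.run(task["command"], cwd=outdir, env=env,
                              stdin=stdin, stdout=stdout, umask=0o022)
        return proc.returncode
    finally:
        for f in (stdin, stdout):
            if f is not None:
                f.close()
```

With the example job above, this would run @cat@ on input1.txt, standard input (input2.txt) and the linked input3.txt, and write the concatenation to output.txt in the output directory.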

If "task.successCodes", "task.temporaryFailCodes" and "task.permanentFailCodes" are not specified, or the exit code is not found in any of the lists, default Unix semantics apply (zero is success, nonzero is failure).
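
The exit code rules can be sketched in Python as follows. This illustrates the described behaviour and is not crunchrunner's implementation. The function name @classify_exit@ and the returned labels are hypothetical. The order in which the lists are checked is also an assumption, since the text does not say what happens when a code appears in more than one list.

```python
def classify_exit(code, task):
    """Classify an exit code using the optional task.*Codes lists."""
    # Explicit lists take priority; the checking order is an assumption.
    if code in task.get("task.successCodes", []):
        return "success"
    if code in task.get("task.temporaryFailCodes", []):
        return "temporary_failure"
    if code in task.get("task.permanentFailCodes", []):
        return "permanent_failure"
    # Not listed anywhere: default Unix semantics.
    return "success" if code == 0 else "failure"
```

For the example job, exit codes 1 and 2 would be classified as temporary failures and 3 as a permanent failure, while an unlisted code such as 4 falls through to the default and counts as a failure.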

There are three substitution parameters: $(task.tmpdir), $(task.outdir) and $(task.keep).  These resolve to their respective paths on the file system.  Substitution is applied to command line arguments, task.stdin, task.env values, and task.vwd values.
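
A minimal Python sketch of this substitution step is shown below. It is an illustration only, not crunchrunner's Go code. The names @substitute@ and @substitute_task@ and the layout of the @paths@ dict are hypothetical. Real paths are chosen by the runner at task start.

```python
import re

# Matches exactly the three documented parameters.
_PARAM = re.compile(r"\$\((task\.(?:tmpdir|outdir|keep))\)")


def substitute(value, paths):
    """Replace each $(task.*) parameter in value with its path."""
    return _PARAM.sub(lambda m: paths[m.group(1)], value)


def substitute_task(task, paths):
    """Apply substitution to command, task.stdin, task.env and task.vwd values."""
    out = dict(task)
    out["command"] = [substitute(arg, paths) for arg in task["command"]]
    if "task.stdin" in task:
        out["task.stdin"] = substitute(task["task.stdin"], paths)
    for key in ("task.env", "task.vwd"):
        if key in task:
            # Only the values are substituted, not the keys.
            out[key] = {k: substitute(v, paths) for k, v in task[key].items()}
    return out
```

For example, with a hypothetical @paths = {"task.tmpdir": "/tmp/t", "task.outdir": "/tmp/o", "task.keep": "/keep"}@, the argument @$(task.keep)/d3b07384d113edec49eaa6238ad5ff00+4/input1.txt@ becomes @/keep/d3b07384d113edec49eaa6238ad5ff00+4/input1.txt@. Other fields, such as "task.stdout", are left untouched.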