Containers API » History » Version 14

Tom Clegg, 06/09/2015 08:32 PM

{{>TOC}}

h1. Jobs API (DRAFT)

A Job is a record of a computational process.

* Its goal is to capture, unambiguously, as much information as possible about the environment in which the process was run. For example, git trees, data collections, and docker images are stored as content addresses. This makes it possible to reason about the difference between two processes, and to replay a process at a different time and place.

* Clients can read Job records, but only the system can create or modify them.

A JobRequest is a client's expression of interest in knowing the outcome of a computational process.

* Typically, the client's description of the process is less precise than a Job: a JobRequest describes job _constraints_ which can have different interpretations over time. For example, a JobRequest with a @{"kind":"git_tree","commit_range":"abc123..master",...}@ mount might be satisfiable by any of several different source trees, and this set of satisfying source trees can change when the repository's "master" branch is updated.

* The system is responsible for finding suitable Jobs and assigning them to JobRequests. (Currently this is expected to be done synchronously during the jobRequests.create and jobRequests.update transactions.)

* A JobRequest may indicate that it can _only_ be satisfied by a new Job record (i.e., existing results should not be reused). In this case, creating a JobRequest amounts to a submission to the job queue. This is appropriate when the purpose of the JobRequest is to test whether a process is repeatable.

* A JobRequest may indicate that it _cannot_ be satisfied by a new Job record. This is an appropriate way to test whether a result is already available.

When the system has assigned a Job to a JobRequest, anyone with permission to read the JobRequest also has permission to read the Job.

h2. Use cases

h3. Preview

Tell me how you would satisfy job request X. Which pdh/commits would be used? Is the satisfying job already started? finished?

h3. Submit a previewed existing job

I'm happy with the already-running/finished job you showed me in "preview". Give me access to that job, its logs, and [when it finishes] its output.

h3. Submit a previewed new job

I'm happy with the new job the "preview" response proposed to run. Run that job.

h3. Submit a new job (disable reuse)

I don't want to use an already-running/finished job. Run a new job that satisfies my job request.

h3. Submit a new duplicate job (disable reuse)

I'm happy with the already-running/finished job you showed me in "preview". Run a new job exactly like that one.

h3. Select a job and associate it with my JobRequest

I'm not happy with the job you chose, but I know of another job that satisfies my request. Assuming I'm right about that, attach my JobRequest to the existing job of my choice.

h3. Just do the right thing without a preview

Satisfy job request X one way or another, and tell me the resulting job's UUID.

h2. JobRequest/Job life cycle

Illustrating job re-use and the preview facility:

# Client CA creates a JobRequest JRA with priority=0.
# Server creates job JX and assigns JX to JRA, but does not try to run JX yet because max(priority)=0.
# Client CA presents JX to the user. "We haven't computed this result yet, so we'll have to run a new job. Is this OK?"
# Client CB creates a JobRequest JRB with priority=1.
# Server assigns JX to JRB and puts JX in the execution queue with priority=1.
# Client CA updates JRA with priority=2.
# Server updates JX with priority=2.
# Job JX starts.
# Client CA updates JRA with priority=0. (This is as close as we get to a "cancel" operation.)
# Server updates JX with priority=1. (JRB still wants this job to complete.)
# Job JX finishes.
# Clients CA and CB have permission to read JX (ever since JX was assigned to their respective JobRequests) as well as its progress indicators, output, and log.

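The priority arithmetic in the steps above can be sketched in a few lines (an illustrative assumption about the scheduler, not actual Arvados code): a Job runs at the maximum priority of the JobRequests it satisfies, and drops out of the queue when that maximum is zero.

```python
def job_priority(request_priorities):
    """Effective priority of a Job: the max over its associated JobRequests.

    A result of 0 means no request currently wants the job to run.
    """
    return max(request_priorities, default=0)

# Following the numbered steps above:
assert job_priority([0]) == 0        # JRA only: JX stays out of the queue
assert job_priority([0, 1]) == 1     # JRB arrives: JX queued at priority 1
assert job_priority([2, 1]) == 2     # JRA raised to 2
assert job_priority([0, 1]) == 1     # JRA "cancelled": JRB keeps JX alive
```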
h2. "JobRequest" schema

|Attribute|Type|Description|Discussion|Examples|
|uuid, owner_uuid, modified_by_client_uuid, modified_by_user_uuid|string|Usual Arvados model attributes|||
|
|created_at, modified_at|datetime|Usual Arvados model attributes|||
|
|name|string|Unparsed|||
|
|description|text|Unparsed|||
|
|state|string|Once a request is committed, priority is the only attribute that can be modified.||@"Uncommitted"@
@"Committed"@|
|
|requesting_job_uuid|string|When the referenced job exits, the job request is automatically cancelled.|||
|
|job_uuid|uuid|The job that satisfies this job request.|See "methods" below.||
|
|mounts|hash|Objects to attach to the container's filesystem and stdin/stdout.
Keys starting with a forward slash indicate objects mounted in the container's filesystem.
Other keys are given special meanings here.|
We use "stdin" instead of "/dev/stdin" because literally replacing /dev/stdin with a file would have a confusing effect on many unix programs. The stdin feature only affects the standard input of the first process started in the container; after that, the usual rules apply.|
<pre>{
 "/input/foo":{
  "kind":"collection",
  "portable_data_hash":"d41d8cd98f00b204e9800998ecf8427e+0"
 },
 "stdin":{
  "kind":"collection_file",
  "uuid":"zzzzz-4zz18-yyyyyyyyyyyyyyy",
  "path":"/foo.txt"
 },
 "stdout":{
  "kind":"regular_file",
  "path":"/tmp/a.out"
 }
}</pre>|
|
|runtime_constraints|hash|Restrict the job's access to compute resources and the outside world (in addition to its explicitly stated inputs and output).
-- Each key is the name of a capability, like "internet" or "API" or "clock". The corresponding value is @true@ (the capability must be available in the job's runtime environment), @false@ (it must not be), a single value, or an array of two numbers indicating an inclusive range. If a key is omitted, availability of the corresponding capability is acceptable but not necessary.|This is a generalized version of "enforce purity restrictions": it is not a claim that the job will be pure. Rather, it helps us control and track runtime restrictions, which can be helpful when reasoning about whether a given job was pure.
-- In the most basic implementation, no capabilities are defined, and the only acceptable value of this attribute is the empty hash.
(TC)Should this structure be extensible like mounts?|
<pre>
{
  "ram":12000000000,
  "vcpus":[1,null]
}</pre>|
|
|container_image|string|Docker image repository and tag, docker image hash, collection UUID, or collection PDH.|||
|
|environment|hash|Environment variables and values that should be set in the container environment (@docker run --env@). This augments and (when conflicts exist) overrides environment variables given in the image's Dockerfile.|||
|
|cwd|string|Initial working directory, given as an absolute path (in the container) or a path relative to the WORKDIR given in the image's Dockerfile. The default is @"."@.||<pre>"/tmp"</pre>|
|
|command|array of strings|Command to execute in the container. Default is the CMD given in the image's Dockerfile.|
To use a UNIX pipeline, like "echo foo &#124; tr f b", or to interpolate environment variables, make sure your container image has a shell, and use a command like @["sh","-c","echo $PATH &#124; wc"]@.||
|
|output_path|string|Path to a directory or file inside the container that should be preserved as the job's output when it finishes.|This path _must_ be, or be inside, one of the mount targets.
For best performance, point output_path to a writable collection mount.||
|
|priority|number|Higher number means spend more resources (e.g., go ahead of other queued jobs, bring up more nodes).
-- Zero means the job should not be run on behalf of this request. (Clients are expected to submit JobRequests with zero priority in order to preview the job that will be used to satisfy the request.)
-- Priority is null if and only if @state="Uncommitted"@.||
null
@0@
@1000.5@
@-1@|
|
|expires_at|datetime|After this time, priority is considered to be zero. If the assigned job is running at that time, the job _may_ be cancelled to conserve resources.||
null
@2015-07-01T00:00:01Z@|
|
|filters|array|Additional constraints for satisfying the request, given in the same form as the @filters@ parameter accepted by the @jobs.list@ API.||
@["created_at","<","2015-07-01T00:00:01Z"]@|
|

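The capability-matching rule described in the runtime_constraints row can be sketched as follows (a hypothetical helper, not part of any Arvados SDK): @true@ means required, @false@ means forbidden, a two-element array is an inclusive range with @null@ for an open end, and omitted keys are unconstrained.

```python
def satisfies(constraints, available):
    """Check a runtime environment against a runtime_constraints hash.

    Values: True (must be available), False (must not be), a scalar
    (must match exactly), or [lo, hi] inclusive range (None = unbounded).
    """
    for key, want in constraints.items():
        have = available.get(key)
        if want is True:
            if not have:
                return False
        elif want is False:
            if have:
                return False
        elif isinstance(want, list):
            lo, hi = want
            if have is None:
                return False
            if lo is not None and have < lo:
                return False
            if hi is not None and have > hi:
                return False
        elif have != want:
            return False
    return True

assert satisfies({"vcpus": [1, None]}, {"vcpus": 8})     # open-ended range
assert not satisfies({"internet": False}, {"internet": True})
assert satisfies({}, {"anything": 123})                  # empty hash: no limits
```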
h2. "Job" schema

|Attribute|Type|Description|Discussion|Examples|
|
|uuid, owner_uuid, created_at, modified_at, modified_by_client_uuid, modified_by_user_uuid|string|Usual Arvados model attributes|||
|
|state|string|||
<pre>
"Queued"
"Running"
"Cancelled"
"Failed"
"Complete"
</pre>|
|
|started_at, finished_at, log||Same as current job|||
|
|environment|hash|Must be equal to a JobRequest's environment in order to satisfy the JobRequest.|(TC)We could offer a "resolve" process here like we do with mounts: e.g., hash values in the JobRequest environment could be resolved according to the given "kind". I propose we leave room for this feature but don't add it yet.||
|
|cwd, command, output_path|string|Must be equal to the corresponding values in a JobRequest in order to satisfy that JobRequest.|||
|
|mounts|hash|Must contain the same keys as the JobRequest being satisfied. Each value must be within the range of values described in the JobRequest _at the time the Job is assigned to the JobRequest._|||
|
|runtime_constraints|hash|Compute resources, and access to the outside world, that are/were available to the job.
-- Generally this will contain additional keys that are not present in any corresponding JobRequests: for example, even if no JobRequests specified constraints on the number of CPU cores, the number of cores actually used will be recorded here.|
Permission/access types will change over time and it may be hard/impossible to translate old types to new. Such cases may cause old Jobs to be ineligible for assignment to new JobRequests.
-- (TC)Is it permissible for this to gain keys over time? For example, a job scheduler might not be able to predict how many CPU cores will be available until the job starts.||
|
|output|string|Portable data hash of the output collection.|||
|
|-pure-|-boolean-|-The job's output is thought to be dependent solely on its inputs, i.e., it is expected to produce identical output if repeated.-|
We want a feature along these lines, but "pure" seems to be a conclusion we can come to after examining various facts -- rather than a property of an individual job execution event -- and it probably needs something more subtle than a boolean.||
|
|container_image|string|Portable data hash of a collection containing the docker image used to run the job.|(TC) *If* docker image hashes can be verified efficiently, we can use the native docker image hash here instead of a collection PDH.||
|
|progress|number|A number between 0.0 and 1.0 describing the fraction of work done.|
If a job submits jobs of its own, it should update its own progress as the child jobs progress and finish.||
|
|priority|number|Priority assigned by the system, taking into account the priorities of all associated JobRequests.|||

h2. Mount types

The "mounts" hash is the primary mechanism for adding data to the container at runtime (beyond what is already in the container image).

Each value of the "mounts" hash is itself a hash, whose "kind" key determines the handler used to attach data to the container.

|Mount type|@kind@|Expected keys|Description|Examples|Discussion|
|
|Arvados data collection|@collection@|
@"portable_data_hash"@ _or_ @"uuid"@ _may_ be provided. If neither is provided, a new collection will be created. This is useful when @"writable":true@ and the job's @output_path@ is (or is a subdirectory of) this mount target.
@"writable"@ may be provided with a @true@ or @false@ value to indicate that the path must (or must not) be writable. If not specified, the system can choose.
@"path"@ may be provided, and defaults to @"/"@.|
At job startup, the target path will have the same directory structure as the given path within the collection. Even if the files/directories are writable in the container, modifications will _not_ be saved back to the original collections when the job ends.|
<pre>
{
 "kind":"collection",
 "uuid":"...",
 "path":"/foo.txt"
}

{
 "kind":"collection",
 "uuid":"..."
}
</pre>||
|
|Git tree|@git_tree@|
One of {@"git-url"@, @"repository_name"@, @"uuid"@} must be provided.
One of {@"commit"@, @"revisions"@} must be provided.
@"path"@ may be provided. The default path is @"/"@.|
At job startup, the target path will have the source tree indicated by the given revision. The @.git@ metadata directory _will not_ be available: typically the system will use @git-archive@ rather than @git-checkout@ to prepare the target directory.
-- If a value is given for @"revisions"@, it will be resolved to a set of commits (as described in the "ranges" section of git-revisions(1)) and the job request will be satisfiable by any commit in that set.
-- If a value is given for @"commit"@, it will be resolved to a single commit, and the tree resulting from that commit will be used.
-- @"path"@ can be used to select a subdirectory or a single file from the tree indicated by the selected commit.
-- Multiple commits can resolve to the same tree: for example, the file/directory given in @"path"@ might not have changed between commits A and B.
-- The resolved mount (found in the Job record) will have only the "kind" key and a "blob" or "tree" key indicating the 40-character hash of the git tree/blob used.|
<pre>
{
 "kind":"git_tree",
 "uuid":"zzzzz-s0uqq-xxxxxxxxxxxxxxx",
 "commit":"master"
}

{
 "kind":"git_tree",
 "uuid":"zzzzz-s0uqq-xxxxxxxxxxxxxxx",
 "commit_range":"bugfix^..master",
 "path":"/crunch_scripts/grep"
}
</pre>||
|
|Temporary directory|@tmp@|
@"capacity"@: capacity (in bytes) of the storage device.|
At job startup, the target path will be empty. When the job finishes, the content will be discarded. This will be backed by a memory-based filesystem where possible.|
<pre>
{
 "kind":"tmp",
 "capacity":10000000000
}
</pre>||
|
|Keep|@keep@|
Expose all readable collections via arv-mount.|Requires suitable runtime constraints.|
<pre>
{
 "kind":"keep"
}
</pre>||
|

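A minimal client-side sanity check over a mounts hash, following the rules above, might look like this (the kind list and the helper itself are illustrative assumptions; the API server performs the authoritative validation):

```python
# Mount kinds that appear in this draft; hypothetical, not an official list.
KNOWN_KINDS = {"collection", "collection_file", "git_tree", "tmp", "keep",
               "regular_file"}

def check_mounts(mounts):
    """Reject obviously malformed entries in a "mounts" hash."""
    for target, mount in mounts.items():
        if not isinstance(mount, dict) or "kind" not in mount:
            raise ValueError("mount %r must be a hash with a 'kind' key" % target)
        if mount["kind"] not in KNOWN_KINDS:
            raise ValueError("unknown mount kind %r" % mount["kind"])
        # Targets are either container filesystem paths or special streams.
        if not target.startswith("/") and target not in ("stdin", "stdout"):
            raise ValueError("unrecognized special mount target %r" % target)

check_mounts({
    "/input/foo": {"kind": "collection",
                   "portable_data_hash": "d41d8cd98f00b204e9800998ecf8427e+0"},
    "stdin": {"kind": "collection_file",
              "uuid": "zzzzz-4zz18-yyyyyyyyyyyyyyy", "path": "/foo.txt"},
    "/scratch": {"kind": "tmp", "capacity": 10000000000},
})
```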
h2. Permissions

Users own JobRequests, but the system owns Jobs. Users get permission to read Jobs by virtue of linked JobRequests.

h2. API methods

Changes from the usual REST APIs:

h3. arvados.v1.job_requests.create and .update

These methods can fail when objects referenced in the "mounts" hash do not exist, or when the acting user has insufficient permission on them.

A JobRequest with @state="Uncommitted"@:
* has a null @priority@.
* can have its @job_uuid@ reset to null by a client.
* can have its @job_uuid@ set to a non-null value by a system process.

A JobRequest with @state="Committed"@:
* has a non-null @priority@.
* cannot be modified, except that its @priority@ can be changed to another non-null value.

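The commit rules above can be sketched as a small validation helper (hypothetical; the real enforcement would happen in the API server):

```python
def validate_update(state, changes):
    """Reject updates forbidden by the JobRequest state rules.

    An Uncommitted request may change freely; a Committed request may
    only change priority, and only to another non-null value.
    """
    if state == "Committed":
        extra = set(changes) - {"priority"}
        if extra:
            raise ValueError("only priority may change after commit, not %s"
                             % sorted(extra))
        if "priority" in changes and changes["priority"] is None:
            raise ValueError("a committed request must keep a non-null priority")

validate_update("Uncommitted", {"mounts": {}, "priority": None})  # allowed
validate_update("Committed", {"priority": 2})                     # allowed
```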
h3. arvados.v1.job_requests.cancel

Set @priority@ to zero.

h3. arvados.v1.job_requests.satisfy

Find or create a suitable job, and update @job_uuid@.

Return an error if @job_uuid@ is not null.

Q: Can this be requested during create? Create+satisfy is a common operation, so having a way to do it in a single API call might be a worthwhile convenience.

Q: Better name?

h3. arvados.v1.jobs.create and .update

These methods are not callable except by system processes.

h3. arvados.v1.jobs.progress

This method permits specific types of updates while a job is running: update progress, record success/failure.

Q: [How] can a client submitting job B indicate it shouldn't run unless/until job A succeeds?

h2. Debugging

Q: Do we need any infrastructure debug-logging controls in this API?

Q: Do we need any job debug-logging controls in this API? Or should we just use environment variables?

h2. Scheduling and running jobs

Q: When/how should we implement hooks for futures/promises: e.g., "run job Y when jobs X0, X1, and X2 have finished"?

h2. Accounting

A complete design for resource accounting and quota is out of scope here, but we do assert that this API makes it feasible to retain accounting data.

It should be possible to retrieve, for a given job, a complete set of resource allocation intervals, each one including:
* interval start time
* interval end time (presented as null or now if the interval hasn't ended yet)
* user uuid
* job request id
* job request priority
* job state
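For example, total allocated time per user could be derived from such intervals like this (the field names are assumptions based on the list above, not an actual accounting API):

```python
from datetime import datetime, timezone

def seconds_by_user(intervals, now=None):
    """Sum allocation-interval durations per user; open intervals end 'now'."""
    now = now or datetime.now(timezone.utc)
    totals = {}
    for iv in intervals:
        end = iv["end"] or now   # null end time: interval still open
        totals[iv["user_uuid"]] = (totals.get(iv["user_uuid"], 0.0)
                                   + (end - iv["start"]).total_seconds())
    return totals
```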