Tom Clegg, 06/25/2020 03:55 PM

h1. API Historical/Forecasting data for CR

Goal: create a pipeline forecaster and a visualization for historical data. This should expose APIs that can be used in the ContainerRequest visualization and could also be used to provide extra information about the currently running CR.

Glossary:

* Checkpoint: a generic name that currently corresponds to a step name. This id, together with "family", identifies a unique cluster used to summarize results. The summary for a cluster covers: a) several runs with similar parameters and b) scattered steps whose names follow the pattern name_2, name_3, ..., name_229

* Family: A common name like "gatk" or "haplotypecaller" can be used as a step name. The family definition will help to separate the 2 populations in terms of checkpoints. We think this can be implemented based on the parameters of the CommandLineTool, the md5sum of the parent workflow, or a combination of both

* Datapoint: a concrete piece of data that can be plotted as historical data. Currently we bind together the container request and the associated container to get a unified view of the times involved. This should not be confused with forecast data, since the two can be used separately

h2. API

The "checkpoints" endpoint exposes the statistics that will be used for forecasting. As an example, we'll start with the time_* keys, but in the future this will expose all the data needed to make an accurate forecast.

GET /container-request/aaaaa-xvhdp-123456789abc/checkpoints

Output:

<pre>
{
  "checkpoints": [
    {
      "name": "merge-tilelib",
      "family": "family22",
      "dependencies": [
        "createsglf"
      ],
      "time_average": 8254.534873,
      "time_count": 1,
      "time_min": 8254.534873,
      "time_min_comment": "duration:merge-tilelib#su92l-dz642-cc7799yfwi5jmd9",
      "time_max": 8254.534873,
      "time_max_comment": "duration:merge-tilelib#su92l-dz642-cc7799yfwi5jmd9"
    },
    {
      "name": "createsglf",
      "family": "family9",
      "dependencies": [],
      "time_average": 4741.290203,
      "time_count": 58,
      "time_min": 82.138309,
      "time_min_comment": "duration:createsglf_57#su92l-dz642-3u3g4bq1yh4pqje",
      "time_max": 5818.898387,
      "time_max_comment": "duration:createsglf_8#su92l-dz642-8d094xhqciin5m2"
    },
...
],
"time_average": <average time for the CR family>,
</pre>
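
As a rough illustration of how the time_* keys could be derived, here is a minimal Python sketch. It assumes that scattered steps are folded into one checkpoint by stripping a trailing @_N@ suffix, and that the input is a list of (step name, container UUID, duration in seconds) tuples. The functions @checkpoint_name@ and @summarize@ are hypothetical names, not part of any existing API, and the sketch ignores "family" entirely.

```python
import re
from collections import defaultdict

# Assumption: scattered steps look like "name_2", "name_3", ... and belong
# to the checkpoint "name". A step whose real name ends in "_<digits>"
# would be merged too, so a real implementation needs a stricter rule.
_SCATTER_SUFFIX = re.compile(r"_\d+$")


def checkpoint_name(step_name):
    """Map a scattered step name such as 'createsglf_57' to 'createsglf'."""
    return _SCATTER_SUFFIX.sub("", step_name)


def summarize(runs):
    """Group runs by checkpoint name and compute the time_* keys.

    runs: iterable of (step_name, container_uuid, duration_seconds).
    """
    groups = defaultdict(list)
    for step_name, uuid, duration in runs:
        groups[checkpoint_name(step_name)].append((duration, step_name, uuid))

    summary = {}
    for name, items in groups.items():
        fastest = min(items)  # tuples compare by duration first
        slowest = max(items)
        summary[name] = {
            "name": name,
            "time_average": sum(d for d, _, _ in items) / len(items),
            "time_count": len(items),
            "time_min": fastest[0],
            "time_min_comment": f"duration:{fastest[1]}#{fastest[2]}",
            "time_max": slowest[0],
            "time_max_comment": f"duration:{slowest[1]}#{slowest[2]}",
        }
    return summary


# Durations and UUIDs taken from the example output above.
runs = [
    ("createsglf_57", "su92l-dz642-3u3g4bq1yh4pqje", 82.138309),
    ("createsglf_8", "su92l-dz642-8d094xhqciin5m2", 5818.898387),
    ("merge-tilelib", "su92l-dz642-cc7799yfwi5jmd9", 8254.534873),
]
result = summarize(runs)
```

Keeping the step name and container UUID next to each duration is what makes the time_min_comment and time_max_comment strings possible, so a reader can trace an outlier back to a concrete container.
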

GET /container-request/aaaaa-xvhdp-123456789abc/datapoints

Output:

<pre>
[
  {
    "step_name": "createsglf",
    "start_1": "2020-01-15 19:49:34.213 +0000",
    "end_1": "2020-01-15 21:19:39.001 +0000",
    "start_2": "2020-01-15 19:54:44.864 +0000",
    "end_2": "2020-01-15 21:19:39.001 +0000",
    "reuse": false,
    "status": "completed",
    "legend": "<p>createsglf</p><p>Container Request: <a href=\"https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-zfc3ffxk3slmkzv\">su92l-xvhdp-zfc3ffxk3slmkzv</a></p><p>Container duration: 1h24m54.137122s\n</p>"
  },
  {
    "step_name": "createsglf_2",
    "start_1": "2020-01-15 19:49:34.288 +0000",
    "end_1": "2020-01-15 21:29:11.399 +0000",
    "start_2": "2020-01-15 19:54:51.275 +0000",
    "end_2": "2020-01-15 21:29:11.399 +0000",
    "reuse": false,
    "status": "completed",
    "legend": "<p>createsglf_2</p><p>Container Request: <a href=\"https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-py99va9hnvuxzp5\">su92l-xvhdp-py99va9hnvuxzp5</a></p><p>Container duration: 1h34m20.123849s\n</p>"
  },
....
</pre>
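
To show how a client might consume a datapoint, here is a small Python sketch that parses the timestamps and derives two durations. It assumes that start_1/end_1 belong to the container request and start_2/end_2 to the container; this is not stated above, but end_2 minus start_2 matches the "Container duration" shown in the legend. The names @parse_ts@ and @timings@ are hypothetical.

```python
from datetime import datetime

# Timestamp layout used in the example output, e.g.
# "2020-01-15 19:49:34.213 +0000"
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f %z"


def parse_ts(value):
    """Parse one datapoint timestamp into a timezone-aware datetime."""
    return datetime.strptime(value, TS_FORMAT)


def timings(datapoint):
    """Derive wait and run time for one datapoint.

    Assumption: *_1 is the container request, *_2 is the container.
    """
    start_1 = parse_ts(datapoint["start_1"])
    start_2 = parse_ts(datapoint["start_2"])
    end_2 = parse_ts(datapoint["end_2"])
    return {
        # Some outputs use "checkpoint" instead of "step_name" for the label.
        "name": datapoint.get("step_name") or datapoint.get("checkpoint"),
        "wait_seconds": (start_2 - start_1).total_seconds(),
        "container_seconds": (end_2 - start_2).total_seconds(),
    }


example = {
    "step_name": "createsglf",
    "start_1": "2020-01-15 19:49:34.213 +0000",
    "end_1": "2020-01-15 21:19:39.001 +0000",
    "start_2": "2020-01-15 19:54:44.864 +0000",
    "end_2": "2020-01-15 21:19:39.001 +0000",
}
t = timings(example)
```

For the createsglf datapoint this gives roughly 5094.137 seconds of container time, that is 1h24m54.137s, in line with the legend.
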

GET /container-request/aaaaa-xvhdp-123456789abc/workflow-dot

Output:

<pre>
digraph cwlgraph {
rankdir=LR;
graph [compound=true];

subgraph cluster_0 {
label="#createcgf-wf.cwl";
node [style=filled];
shape=box
style="filled";
color="#dddddd";
"#createcgf-wf.cwl" [ label = "#createcgf-wf.cwl", style = invis ];
....
</pre>

h2. Frontend

The dot file can be rendered with https://domparfitt.com/graphviz-react/; we have already tested it with some big files.

h2. Schema and queries on the postgres DB

TODO: Outline the transformation from the current local leveldb cache to some per-user caching table.
TODO: List the queries to INSERT and SELECT the data for a particular checkpoint.

h2. Permissions

One concern is permissions. We'll behave like everything else in Arvados: if the token doesn't have access to a CR, the response is a 404. This also applies to "summarized data", such as the historical times and prices of the CRs.

When forecasting a CR for a given user, we should only use data about containers that user can see. This has implications for caching:
* When responding to user A, we can't reuse cached results that we generated for user B
* When using cached results, we need to consider whether to recompute to reflect recent permission changes

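
The two caching constraints above could be sketched in Python as follows. This is only an illustration under stated assumptions: the cache key includes the user, so results are never shared across users, and entries expire after a fixed age as a crude way to pick up permission changes. @PerUserCache@, its parameters and the expiry policy are hypothetical; the actual design (for example a per-user table in postgres) is still a TODO.

```python
import time


class PerUserCache:
    """Cache computed results per (user, container request) pair."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries = {}   # (user_uuid, cr_uuid) -> (stored_at, value)

    def get(self, user_uuid, cr_uuid, compute):
        """Return a cached value for this user, or call compute() again."""
        key = (user_uuid, cr_uuid)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]
        # Missing or too old: recompute with this user's permissions.
        value = compute()
        self._entries[key] = (now, value)
        return value


# Usage with a fake clock, recording which user triggered a computation.
calls = []


def compute_for(user):
    def compute():
        calls.append(user)
        return {"computed_for": user}
    return compute


fake_now = [0.0]
cache = PerUserCache(max_age_seconds=60, clock=lambda: fake_now[0])
cache.get("user-a", "cr-1", compute_for("user-a"))
cache.get("user-a", "cr-1", compute_for("user-a"))  # served from cache
cache.get("user-b", "cr-1", compute_for("user-b"))  # other user: recomputed
fake_now[0] = 61.0
cache.get("user-a", "cr-1", compute_for("user-a"))  # expired: recomputed
```

A fixed maximum age trades freshness for simplicity. Invalidating entries when permissions actually change would be more precise, but needs a signal that this page does not define.
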
h2. Real World Example

Take the case of su92l-xvhdp-bs4tseq26te2bnz (a hasher function that Ops usually uses as a smoke test).

h3. graph

!su92l-xvhdp-bs4tseq26te2bnz.png!

The dot representation would be:

<pre>
digraph cwlgraph {
rankdir=LR;
graph [compound=true];

subgraph cluster_0 {
label="#main";
node [style=filled];
shape=box
style="filled";
color="#dddddd";
"#main" [ label = "#main", style = invis ];


"#main
inputfile" -> "step #main
hasher1";
"#main
hasher1_outputname" -> "step #main
hasher1";
"step #main
hasher1" -> "#main
hasher1
hasher_out";
"#main
hasher1
hasher_out" -> "step #main
hasher2";
"#main
hasher2_outputname" -> "step #main
hasher2";
"step #main
hasher2" -> "#main
hasher2
hasher_out";
"#main
hasher2
hasher_out" -> "step #main
hasher3";
"#main
hasher3_outputname" -> "step #main
hasher3";
"step #main
hasher3" -> "#main
hasher3
hasher_out";
}



"step #main
hasher1" [fillcolor="#FFD700", style="rounded,filled", shape=box];
"step #main
hasher2" [fillcolor="#FFD700", style="rounded,filled", shape=box];
"step #main
hasher3" [fillcolor="#FFD700", style="rounded,filled", shape=box];
"#hasher.cwl" [fillcolor="#FF9912", style="rounded,filled", shape=box];


"step #main
hasher1" -> "#hasher.cwl" [label="runs", style="dashed"];
"step #main
hasher2" -> "#hasher.cwl" [label="runs", style="dashed"];
"step #main
hasher3" -> "#hasher.cwl" [label="runs", style="dashed"];
}</pre>

h3. datapoints

<pre>
[
  {
    "checkpoint": "hasher1",
    "start_1": "2020-05-12 16:35:33.594 +0000",
    "end_1": "2020-05-12 16:37:30.597 +0000",
    "start_2": "2020-05-12 16:37:27.893 +0000",
    "end_2": "2020-05-12 16:37:30.597 +0000",
    "reuse": false,
    "legend": "<p>hasher1</p><p>Container Request: <a href=\"https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-pbpkli9qovdo4q8\">su92l-xvhdp-pbpkli9qovdo4q8</a></p><p>Container duration: 2.70491s\n</p>"
  },
  {
    "checkpoint": "hasher2",
    "start_1": "2020-05-12 16:37:33.673 +0000",
    "end_1": "2020-05-12 16:39:56.562 +0000",
    "start_2": "2020-05-12 16:39:51.455 +0000",
    "end_2": "2020-05-12 16:39:56.562 +0000",
    "reuse": false,
    "legend": "<p>hasher2</p><p>Container Request: <a href=\"https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-l8je8tws556fqcp\">su92l-xvhdp-l8je8tws556fqcp</a></p><p>Container duration: 5.10645s\n</p>"
  },
  {
    "checkpoint": "hasher3",
    "start_1": "2020-05-12 16:39:57.608 +0000",
    "end_1": "2020-05-12 16:42:17.628 +0000",
    "start_2": "2020-05-12 16:42:14.836 +0000",
    "end_2": "2020-05-12 16:42:17.628 +0000",
    "reuse": false,
    "legend": "<p>hasher3</p><p>Container Request: <a href=\"https://workbench.su92l.arvadosapi.com/container_requests/su92l-xvhdp-jx5vk6lq26dsbba\">su92l-xvhdp-jx5vk6lq26dsbba</a></p><p>Container duration: 2.792018s\n</p>"
  }
]</pre>


h3. checkpoints

<pre>
{
  "checkpoints": [
    {
      "name": "hasher2",
      "family": "abde1234-9876543",
      "dependencies": [
        "hasher1"
      ],
      "time_average": 5.10645,
      "time_count": 1,
      "time_min": 5.10645,
      "time_min_comment": "duration:hasher2#su92l-dz642-eouma4xv1qpnhvc",
      "time_max": 5.10645,
      "time_max_comment": "duration:hasher2#su92l-dz642-eouma4xv1qpnhvc"
    },
    {
      "name": "hasher3",
      "family": "87654321-fedcba01",
      "dependencies": [
        "hasher2"
      ],
      "time_average": 2.792018,
      "time_count": 1,
      "time_min": 2.792018,
      "time_min_comment": "duration:hasher3#su92l-dz642-tn9t07438jd1zrt",
      "time_max": 2.792018,
      "time_max_comment": "duration:hasher3#su92l-dz642-tn9t07438jd1zrt"
    },
    {
      "name": "hasher1",
      "family": "deadbeef-deafbeef",
      "dependencies": [],
      "time_average": 2.70491,
      "time_count": 1,
      "time_min": 2.70491,
      "time_min_comment": "duration:hasher1#su92l-dz642-e6d8emz3ez54owu",
      "time_max": 2.70491,
      "time_max_comment": "duration:hasher1#su92l-dz642-e6d8emz3ez54owu"
    }
  ]
}
</pre>
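
One possible way to turn a checkpoints response into a naive forecast is to take the longest path through the dependency graph, weighting each checkpoint by its time_average. The Python sketch below does that for the hasher example. It is an assumption about how a forecast could be computed, not a description of the planned forecaster: it ignores queue and scheduling time, and assumes the dependencies are acyclic and all listed in the same response. @forecast_total_seconds@ is a hypothetical name.

```python
from functools import lru_cache


def forecast_total_seconds(checkpoints):
    """Estimate total run time as the longest dependency chain.

    checkpoints: list of dicts with "name", "dependencies", "time_average".
    """
    by_name = {c["name"]: c for c in checkpoints}

    @lru_cache(maxsize=None)
    def finish_time(name):
        checkpoint = by_name[name]
        # A checkpoint can start once its slowest dependency has finished.
        start = max(
            (finish_time(dep) for dep in checkpoint["dependencies"]),
            default=0.0,
        )
        return start + checkpoint["time_average"]

    return max((finish_time(name) for name in by_name), default=0.0)


# Values copied from the checkpoints output above.
checkpoints = [
    {"name": "hasher2", "dependencies": ["hasher1"], "time_average": 5.10645},
    {"name": "hasher3", "dependencies": ["hasher2"], "time_average": 2.792018},
    {"name": "hasher1", "dependencies": [], "time_average": 2.70491},
]
total = forecast_total_seconds(checkpoints)
```

Because hasher1, hasher2 and hasher3 form a single chain, the estimate is simply the sum of the three averages, about 10.6 seconds of container time. The datapoints above show that the wall-clock time between steps is much longer than that, so a usable forecast would also need to model waiting time.
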