Efficient live access to container logs » History » Version 1
Tom Clegg, 12/28/2022 07:32 PM
1 | 1 | Tom Clegg | h1. Efficient live access to container logs |
---|---|---|---|
2 | |||
3 | {{toc}} |
||
4 | |||
5 | h2. Goals |
||
6 | |||
7 | * Low latency: deliver live logs to clients (workbench/cli) as quickly as possible |
||
8 | * Scalable: don't generate significant load in RailsAPI |
||
9 | * Scalable: don't generate significant load in PostgreSQL |
||
10 | * Efficient: don't waste resources on event streams that nobody is listening to |
||
11 | * Robust: Don't let a badly behaved container (writing many gigabytes of logs) impede others |
||
12 | * Dependable: Retain complete logs (container/crunch-run/crunchstat) _in the log collection_ even in the event that crunch-run or the compute instance itself dies unexpectedly (_e.g.,_ bug, out of memory, hardware failure) |
||
13 | |||
14 | h2. Proposed architecture |
||
15 | |||
16 | * crunch-run offers a “container logs” API at its container gateway server |
||
17 | * controller offers a “container logs” API that works whether the container is finished or not |
||
18 | ** proxies to crunch-run _or_ keep-web (depends on whether the container is running) |
||
19 | * “container logs” API has two parts: |
||
20 | ** webdav: read file content from the log collection |
||
21 | ** server-sent-events stream: get notifications as logs appear (see below) |
||
22 | * dispatcher provides “plan B” to save logs when crunch-run crashes, instance dies, etc. |
||
23 | ** dispatcher routinely retrieves live logs from container gateway, and buffers content to local disk |
||
24 | ** dispatcher discards buffered logs when crunch-run commits to log collection |
||
25 | ** when crunch-run exits, dispatcher checks whether crunch-run managed to save the full log collection and, if needed, appends the latest buffered logs |
||
26 | * command-line client can “tail” and “tail -f” logs from a running container |
||
27 | * workbench2 uses a single code path to retrieve logs (instead of choosing webdav or logs table based on whether container is running) |
||
28 | |||
29 | h2. API details |
||
30 | |||
31 | @GET /arvados/v1/containers/{uuid}/log/stderr.txt@ |
||
32 | |||
33 | ** client will use Range headers to request partial content |
||
34 | |||
35 | @GET /arvados/v1/containers/{uuid}/log_events@ |
||
36 | |||
37 | ** response is a "server-sent events":https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events stream, data is JSON-encoded |
||
38 | ** a “file_sizes” event indicates the current size of new/updated files in the log collection |
||
39 | @event: file_sizes@ |
||
40 | @data: {"stderr.txt":1234,"stdout.txt":0,"crunch-run.txt":2345}@ |
||
41 | ** after the first message, unchanged files may be omitted: |
||
42 | @event: file_sizes@ |
||
43 | @data: {"stderr.txt":1444,"stdout.txt":72}@ |
||
44 | ** a “final” message indicates that the log collection is finalized and no further file_sizes events will be sent: |
||
45 | @event: final@ |
||
46 | ** a “retry” message indicates how long, in milliseconds, the client should wait before reconnecting after a disconnect (per SSE spec): |
||
47 | @retry: 2000@ |
||
48 | ** empty comments are sent periodically to keep the connection alive during idle periods |
||
49 | @:@ |
||
50 | ** after the “final” message, the connection either remains open and silent, or closes after specifying a long reconnect delay in a “retry” message (the idea is to avoid sending an inattentive client into a reconnect cycle - this doesn't affect well-behaved clients because they will hang up after receiving the “final” message) |
||
51 | ** @?maxInterval=X@ sets the maximum interval between messages (i.e., keepalive) to X seconds, default 15 |
||
52 | ** @?minInterval=X@ increases the minimum interval between messages (i.e., throttle) to X seconds, default 1 |
||
53 | |||
54 | h2. Authentication |
||
55 | |||
56 | |||
57 | The browser's EventSource API allows the caller to enable existing cached credentials/cookies, but not to explicitly set an Authorization header or another custom header. |
||
58 | * @?api_token=v2/...@ in the URL is easy to support on the client and server |
||
59 | ** this is not ideal: the token is likely to be logged by HTTP servers/proxies |
||
60 | ** we already accept this situation in our websocket usage |
||
61 | |||
62 | Server also accepts the usual Authorization header, and this should be used by clients that have the capability (e.g., arvados-client). |
||
63 | |||
64 | Future version may also support a “copy Authorization header into cookie” endpoint |
||
65 | ** client requests @GET /arvados/v1/cookie@ request with the token in the Authorization request header |
||
66 | ** response sets a cookie with the provided token |
||
67 | ** client sets up EventSource using @{withCredentials: true}@ |
||
68 | ** this approach is inherently flawed in that applications in different tabs (e.g., wb2.clusterA and wb2.clusterB, where A and B are in a federation) will unknowingly fight over which cookie should be sent to cluster A. |
||
69 | |||
70 | Future version may also support a “provide short-lived token” endpoint |
||
71 | ** client requests @GET /arvados/v1/api_client_authorizations/:uuid/slt@ with the real API token in an Authorization request header |
||
72 | ** response contains a new token that will only be recognized for the next 15 seconds |
||
73 | ** client sets up EventSource using @?slt={newtoken}@ |
||
74 | ** event stream sends a message with the real token in the “id” field |
||
75 | ** when reconnecting, the browser automatically sends the real token in the Last-Event-Id header |
||
76 | ** server uses the token in the Last-Event-Id field, if provided |
||
77 | |||
78 | h2. Client pseudocode |
||
79 | |||
80 | to reliably display logs for container with uuid C: |
||
81 | |||
82 | <pre> |
||
83 | connect to server-sent-events endpoint /arvados/v1/containers/{C}/log_events |
||
84 | on unexpected disconnect: |
||
85 | delay for duration indicated in last “retry” message |
||
86 | reconnect |
||
87 | on final: |
||
88 | disconnect |
||
89 | on file_sizes: |
||
90 | decode message data as json |
||
91 | compare each entry to last size received for corresponding file, if any |
||
92 | if desired, retrieve file data via webdav endpoint and present to user |
||
93 | </pre> |
||
94 | |||
95 | h2. Client (workbench2 / CLI) implementation notes |
||
96 | |||
97 | * optionally, if container state is complete/cancelled or CR state is final, client _may_ skip the log_events steps and just retrieve file data via webdav endpoint |
||
98 | * the client can rely on the server-sent-events stream to return a message, with sizes of all available log files, upon first connection, even if there are no available sizes yet (i.e., message.files may have zero keys) or the container is already finished |
||
99 | * message.files may have keys for newly added files that have not appeared in any previous message |
||
100 | * client should have logic to skip some content if the log file size grows faster than expected |
||
101 | |||
102 | h2. Crunch-run implementation notes |
||
103 | |||
104 | * buffer logs and periodically flush to keep (as before) |
||
105 | * container gateway has a new webdav endpoint that reads the (live) log collection |
||
106 | ** stdlib webdav server implements range requests for us |
||
107 | * container gateway has a new endpoint that returns a server-sent-events stream |
||
108 | ** upon connection, return one event per log file: [filename, current-size, last-saved-size] |
||
109 | ** when a file is updated, wait until the last returned event is more than 1s old, then return an event [filename, current-size, last-saved-size] |
||
110 | *** “current-size” enables client to construct a suitable range request to retrieve recent data |
||
111 | *** “current-size” enables client to skip ahead if appropriate |
||
112 | *** “last-saved-size” enables dispatcher to discard buffered data once it's safely written to storage and accessible via log collection |
||
113 | * crunch-run updates gateway_address as soon as the gateway service is up (instead of waiting until state=Running) |
||
114 | ** railsapi should allow gateway_address to be set when state=Locked |
||
115 | ** this allows clients to access early logs (e.g., while preparing the docker image) |
||
116 | |||
117 | h2. Controller implementation notes for log_events |
||
118 | |||
119 | * additional logic is needed here beyond a simple proxy, in order that a single client-to-controller connection can receive all log updates through the entire container lifecycle, including one with the final file sizes, even though |
||
120 | ** crunch-run is not running yet when state=Queued |
||
121 | ** crunch-run may exit/disconnect before returning the final update (many race possibilities, including container finishing while controller is trying to reconnect) |
||
122 | ** additional logs may be added by dispatcher if crunch-run exits before finalizing the container |
||
123 | * if state=Queued or container gateway_address is empty, send an empty file_sizes message, then send keepalive messages while waiting/polling for the gateway to come up |
||
124 | * if state=Complete or Cancelled, get the final log file sizes, send a file_sizes message, a long retry message, and a “final” message; delay a few seconds; and hang up |
||
125 | * connect to crunch-run's log_events endpoint (through controller tunnel if applicable), propagate client's last-id and min/maxInterval params, and relay messages (including keepalive) to client |
||
126 | * when crunch-run disconnects (or does not answer), wait a second, re-fetch the container record, and repeat from top |
||
127 | |||
128 | h2. Controller implementation notes for log data |
||
129 | |||
130 | * if container log is not empty, a specific range is requested, and the size of the requested file is compatible with the requested range, proxy to keep-web |
||
131 | * if state=Complete/Cancelled, proxy to keep-web |
||
132 | * if gateway_address is empty, return 404 |
||
133 | * if state=Locked/Running and gateway_address is not empty, proxy to gateway_address |
||
134 | |||
135 | h2. Dispatcher implementation (for retaining the last log entries when a worker instance crashes) |
||
136 | |||
137 | * routinely connect/reconnect to crunch-run gateway server of containers that are being supervised |
||
138 | * when unflushed logs appear, save them in a temporary buffer on local disk, up to a configurable size limit |
||
139 | * when crunch-run exits unexpectedly, check the latest saved log collection, and (if needed) append the buffered logs to the respective truncated files before finalizing the container |
||
140 | * in HPC environment where dispatcher→worker connections are not supported, connect through the worker→controller tunnel (see #19166) |