Efficient live access to container logs

See #16442 and related issues for implementation history.


  • Low latency: deliver live logs to clients (workbench/cli) as quickly as possible
  • Scalable: don't generate significant load in RailsAPI
  • Scalable: don't generate significant load in PostgreSQL
  • Efficient: don't waste resources on event streams that nobody is listening to
  • Robust: Don't let a badly behaved container (writing many gigabytes of logs) impede others
  • Dependable: Retain complete logs (container/crunch-run/crunchstat) in the log collection even in the event that crunch-run or the compute instance itself dies unexpectedly (e.g., bug, out of memory, hardware failure)

Proposed architecture

  • crunch-run offers a “container logs” API at its container gateway server
  • controller offers a “container logs” API that works whether the container is finished or not
    • proxies to crunch-run or keep-web (depends on whether the container is running)
  • “container logs” API has two parts:
    • webdav: read file content from the log collection
    • server-sent-events stream: get notifications as logs appear (see below)
  • dispatcher provides “plan B” to save logs when crunch-run crashes, instance dies, etc.
    • dispatcher routinely retrieves live logs from container gateway, and buffers content to local disk
    • dispatcher discards buffered logs when crunch-run commits to log collection
    • when crunch-run exits, dispatcher checks whether crunch-run managed to save the full log collection and, if needed, appends the latest buffered logs
  • command-line client can “tail” and “tail -f” logs from a running container
  • workbench2 uses a single code path to retrieve logs (instead of choosing webdav or logs table based on whether container is running)

API details

GET /arvados/v1/container_requests/{uuid}/log/{container_uuid}/stderr.txt

  • client will use Range headers to request partial content
  • client will add the {container_uuid}/ part explicitly using the container_uuid field from the container request record -- /arvados/v1/container_requests/{uuid}/log/ is a valid WebDAV endpoint but the client cannot rely on that directory listing to return the {container_uuid}/ subdirectory

GET /arvados/v1/container_requests/{uuid}/log_events

  • response is a server-sent events stream, data is JSON-encoded
  • a “file_sizes” event indicates the current size of new/updated files in the log collection
    event: file_sizes
    data: {"{container_uuid}/stderr.txt":1234,"{container_uuid}/stdout.txt":0,"{container_uuid}/crunch-run.txt":2345}
  • after the first message, unchanged files may be omitted:
    event: file_sizes
    data: {"{container_uuid}/stderr.txt":1444,"{container_uuid}/stdout.txt":72}
  • a “final” message indicates that the log collection is finalized and no further file_sizes events will be sent:
    event: final
  • a “retry” message indicates how long, in milliseconds, the client should wait before reconnecting after a disconnect (per SSE spec):
    retry: 2000
  • empty comments are sent periodically to keep the connection alive during idle periods
  • after the “final” message, the connection either remains open and silent, or closes after specifying a long reconnect delay in a “retry” message (the idea is to avoid sending an inattentive client into a reconnect cycle - this doesn't affect well-behaved clients because they will hang up after receiving the “final” message)
  • ?maxInterval=X sets the maximum interval between messages (i.e., keepalive) to X seconds, default 15
  • ?minInterval=X increases the minimum interval between messages (i.e., throttle) to X seconds, default 1


The browser's EventSource API allows the caller to enable existing cached credentials/cookies, but not to explicitly set an Authorization header or another custom header.
  • ?api_token=v2/... in the URL is easy to support on the client and server
    • this is not ideal: the token is likely to be logged by HTTP servers/proxies
    • we already accept this situation in our websocket usage

Server also accepts the usual Authorization header, and this should be used by clients that have the capability (e.g., arvados-client).

Future version may also support a “copy Authorization header into cookie” endpoint
  • client requests GET /arvados/v1/cookie request with the token in the Authorization request header
  • response sets a cookie with the provided token
  • client sets up EventSource using {withCredentials: true}
  • this approach is inherently flawed in that applications in different tabs (e.g., wb2.clusterA and wb2.clusterB, where A and B are in a federation) will unknowingly fight over which cookie should be sent to cluster A.
Future version may also support a “provide short-lived token” endpoint
  • client requests GET /arvados/v1/api_client_authorizations/:uuid/slt with the real API token in an Authorization request header
  • response contains a new token that will only be recognized for the next 15 seconds
  • client sets up EventSource using ?slt={newtoken}
  • event stream sends a message with the real token in the “id” field
  • when reconnecting, the browser automatically sends the real token in the Last-Event-Id header
  • server uses the token in the Last-Event-Id field, if provided

Client pseudocode

to reliably display logs for container with uuid C:

connect to server-sent-events endpoint /arvados/v1/container_requests/{CR}/log_events
  on unexpected disconnect:
    delay for duration indicated in last “retry” message
  on final:
  on file_sizes:
    decode message data as json
    compare each entry to last size received for corresponding file, if any
    if desired, retrieve file data via webdav endpoint and present to user

Client (workbench2 / CLI) implementation notes

  • optionally, if container state is complete/cancelled or CR state is final, client may skip the log_events steps and just retrieve file data via webdav endpoint
  • the client can rely on the server-sent-events stream to return a message, with sizes of all available log files, upon first connection, even if there are no available sizes yet (i.e., message.files may have zero keys) or the container is already finished
  • message.files may have keys for newly added files that have not appeared in any previous message
  • client should have logic to skip some content if the log file size grows faster than expected

Crunch-run implementation notes

  • buffer logs and periodically flush to keep (as before)
  • container gateway has a new webdav endpoint that reads the (live) log collection
    • stdlib webdav server implements range requests for us
  • container gateway has a new endpoint that returns a server-sent-events stream
    • upon connection, return one event per log file: [filename, current-size, last-saved-size]
    • when a file is updated, wait until the last returned event is more than 1s old, then return an event [filename, current-size, last-saved-size]
      • “current-size” enables client to construct a suitable range request to retrieve recent data
      • “current-size” enables client to skip ahead if appropriate
      • “last-saved-size” enables dispatcher to discard buffered data once it's safely written to storage and accessible via log collection
  • crunch-run updates gateway_address as soon as the gateway service is up (instead of waiting until state=Running)
    • railsapi should allow gateway_address to be set when state=Locked
    • this allows clients to access early logs (e.g., while preparing the docker image)

Controller implementation notes for log_events

  • additional logic is needed here beyond a simple proxy, in order that a single client-to-controller connection can receive all log updates through the entire container lifecycle, including one with the final file sizes, even though
    • crunch-run is not running yet when state=Queued
    • crunch-run may exit/disconnect before returning the final update (many race possibilities, including container finishing while controller is trying to reconnect)
    • additional logs may be added by dispatcher if crunch-run exits before finalizing the container
  • if state=Queued or container gateway_address is empty, send an empty file_sizes message, then send keepalive messages while waiting/polling for the gateway to come up
  • if state=Complete or Cancelled, get the final log file sizes, send a file_sizes message, a long retry message, and a “final” message; delay a few seconds; and hang up
  • connect to crunch-run's log_events endpoint (through controller tunnel if applicable), propagate client's last-id and min/maxInterval params, and relay messages (including keepalive) to client
  • when crunch-run disconnects (or does not answer), wait a second, re-fetch the container record, and repeat from top

Controller implementation notes for log data

  • if container log is not empty, a specific range is requested, and the size of the requested file is compatible with the requested range, proxy to keep-web
  • if state=Complete/Cancelled, proxy to keep-web
  • if gateway_address is empty, return 404
  • if state=Locked/Running and gateway_address is not empty, proxy to gateway_address

Dispatcher implementation (for retaining the last log entries when a worker instance crashes)

  • routinely connect/reconnect to crunch-run gateway server of containers that are being supervised
  • when unflushed logs appear, save them in a temporary buffer on local disk, up to a configurable size limit
  • when crunch-run exits unexpectedly, check the latest saved log collection, and (if needed) append the buffered logs to the respective truncated files before finalizing the container
  • in HPC environment where dispatcher→worker connections are not supported, connect through the worker→controller tunnel (see #19166)

Updated by Tom Clegg 11 months ago · 4 revisions