The "logs" table in the Postgres database currently serves three purposes:
- an audit log, permitting admins and users to look up the time and details of past changes to Arvados objects via arvados.v1.logs.* endpoints
- a mechanism for passing cache-invalidation events, used by the puma and Go websocket servers, the Python SDK "events" library, and arvados-cwl-runner to detect when an object has changed
- a staging area for stdout/stderr text coming from users' jobs/containers, permitting users to see what their jobs/containers are doing while they are still running (i.e., before those text files are written to Keep).
Long term plans:
- The cache-invalidation and live job/container-logging systems will not rely on the logs table at all. The puma websocket server will retire. The Go websocket server will use a more efficient event-passing system -- perhaps something like nsq.
- Recent job/container-logging data will be also accessible some other way -- perhaps a combination of a periodically-flushed log collection and an in-memory buffer.
- Audit logs will be completely optional; will use a better schema that supports search; will shard by time span or use some other approach to prevent unbounded growth; and will be separate from the Arvados object database itself.