Feature #18863
Closed
add background job to controller to clean up old log records
Description
As identified in https://dev.arvados.org/issues/18763#note-5, we have a "delete_old_container_logs" rake task that is supposed to be running in a cron job to clear out old container logs.
Following the pattern we started using for the trash sweeps (#18339), add a background job that executes this sql query.
Remove the rake task file from the repository. Add a note to the upgrade notes document that the cron job should be removed when upgrading.
The query used by the existing rake task is:
DELETE FROM logs WHERE id in (SELECT logs.id FROM logs JOIN containers ON logs.object_uuid = containers.uuid WHERE event_type IN ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds')
A far more efficient version of that, determined empirically (though PostgreSQL-specific), is:
delete from logs using containers where logs.object_uuid=containers.uuid and logs.event_type in ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds'
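To make the selection logic in the queries above easier to check, here is a minimal sketch in Python. It is not the rake task or the controller code. It assumes an in-memory SQLite database as a stand-in for PostgreSQL, so it uses the portable subquery form rather than DELETE ... USING, and it computes the cutoff in Python instead of using now() and interval. The function names (make_demo_db, delete_old_container_logs) and the simplified two-table schema with finished_at stored as Unix seconds are hypothetical.

```python
import sqlite3
import time

# Event types listed in the ticket's query.
CONTAINER_LOG_EVENT_TYPES = ("stdout", "stderr", "arv-mount", "crunch-run", "crunchstat")


def make_demo_db():
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE containers (uuid TEXT PRIMARY KEY, log TEXT, finished_at REAL);
        CREATE TABLE logs (id INTEGER PRIMARY KEY, object_uuid TEXT, event_type TEXT);
        """
    )
    return conn


def delete_old_container_logs(conn, max_age_seconds, now=None):
    """Delete container log rows for containers that have a saved log and
    finished more than max_age_seconds ago. Returns the number of rows deleted."""
    if now is None:
        now = time.time()
    cutoff = now - max_age_seconds
    # One bound parameter per event type, so no value is spliced into the SQL text.
    placeholders = ",".join("?" for _ in CONTAINER_LOG_EVENT_TYPES)
    cur = conn.execute(
        f"""
        DELETE FROM logs WHERE id IN (
            SELECT logs.id FROM logs
            JOIN containers ON logs.object_uuid = containers.uuid
            WHERE logs.event_type IN ({placeholders})
              AND containers.log IS NOT NULL
              AND containers.finished_at < ?
        )
        """,
        (*CONTAINER_LOG_EVENT_TYPES, cutoff),
    )
    conn.commit()
    return cur.rowcount
```

Rows with other event types, rows for containers whose log is still NULL, and rows for recently finished containers are left alone, which matches the three conditions in the original query.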
Don't forget to change the delete_old_container_logs.rake file from the API server to a no-op after the new background job is implemented, https://dev.arvados.org/issues/18763#note-5
There are separate lifetimes for container logs and general audit logs; ensure that both are honored.
Related issues
Updated by Ward Vandewege over 2 years ago
- Related to Bug #18763: remove unused rake tasks added
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-03-30 Sprint to 2022-04-13 Sprint
Updated by Peter Amstutz over 2 years ago
- Related to Bug #18762: rails background tasks scaling issues added
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-04-13 Sprint to 2022-04-27 Sprint
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-04-27 Sprint to 2022-05-11 sprint
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-05-11 sprint to 2022-05-25 sprint
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-05-25 sprint to 2022-06-08 sprint
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-06-08 sprint to 2022-06-22 Sprint
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-06-22 Sprint to 2022-07-06
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-07-06 to 2022-07-20
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-07-20 to 2022-08-03 Sprint
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-08-03 Sprint to 2022-08-17 sprint
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-08-17 sprint to 2022-08-31 sprint
Updated by Peter Amstutz over 2 years ago
- Target version changed from 2022-08-31 sprint to 2022-09-14 sprint
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-09-14 sprint to 2022-09-28 sprint
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-09-28 sprint to 2022-10-12 sprint
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-10-12 sprint to 2022-10-26 sprint
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-10-26 sprint to 2022-10-12 sprint
Updated by Peter Amstutz about 2 years ago
- Target version changed from 2022-10-12 sprint to 2022-10-26 sprint
Updated by Peter Amstutz about 2 years ago
- Category set to API
- Subject changed from [controller] add background job to clean up old container log records to add background job to controller to clean up old container log records
Updated by Peter Amstutz about 2 years ago
- Description updated (diff)
- Subject changed from add background job to controller to clean up old container log records to add background job to controller to clean up old log records
Updated by Tom Clegg about 2 years ago
18863-delete-old-logs-without-cron @ b89ab7f9270acfabe9139d14d7071cf117b39bd4 -- developer-run-tests: #3324
- configurable "delete old logs" interval
- controller background task
- rake task is now a no-op (but still exists, so any existing cron jobs succeed harmlessly instead of failing and causing unnecessary sysadmin alerts)
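As an illustration of the "configurable interval plus background task" pattern in the list above, here is a minimal Python sketch. It is not the code in the branch. The helper name run_periodically, the choice to run the task once immediately at startup, and the use of a threading.Event for shutdown are all assumptions made for the example.

```python
import threading


def run_periodically(task, interval_seconds, stop):
    """Call task() repeatedly, sleeping interval_seconds between calls,
    until the stop event is set."""
    while not stop.is_set():
        task()
        # Event.wait() returns as soon as stop is set, so shutdown does not
        # have to wait out the remainder of the interval.
        stop.wait(interval_seconds)


def start_background_task(task, interval_seconds):
    """Run task on a daemon thread. Returns (thread, stop_event)."""
    stop = threading.Event()
    thread = threading.Thread(
        target=run_periodically, args=(task, interval_seconds, stop), daemon=True
    )
    thread.start()
    return thread, stop
```

A caller would pass the cleanup function as task and the configured interval in seconds, then call stop.set() at shutdown.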
Currently, old audit logs (e.g., "collection was updated") are automatically deleted by a background task in railsapi that's kicked off by an after_commit hook. Eventually we want to move that to Go as well, but I'm assuming that's out of scope here, because moving it wouldn't improve/simplify arvados installation.
I can't find any mention in the existing install docs about running the rake task as a cron job. Did we just neglect to document this at all?
Updated by Lucas Di Pentima about 2 years ago
Just one comment:
Only 5 different event_type logs get cleaned up in this branch. While working on WB2's log viewer I noticed there're some additional ones, do you think the ones listed here also need sweeping? https://dev.arvados.org/projects/arvados/repository/arvados-workbench2/revisions/main/entry/src/store/process-logs-panel/process-logs-panel-actions.ts#L134
The rest LGTM, thanks.
Updated by Tom Clegg about 2 years ago
Good catch. Added the other event_type values.
18863-delete-old-logs-without-cron @ f3203e42412d5b5216b2c70caae47b73b712d18c -- developer-run-tests: #3327
Updated by Tom Clegg about 2 years ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|0596129229750c593066e414d9315f643585bc3e.