Feature #18863

Updated by Peter Amstutz over 1 year ago

As identified in https://dev.arvados.org/issues/18763#note-5, we have a "delete_old_container_logs" rake task that is supposed to be run from a cron job to clear out old container logs. 

 Following the pattern we started using for the trash sweeps (#18339), add a background job that executes the SQL query given below. 

 Remove the rake task file from the repository. Add a note to the upgrade notes document that the cron job should be removed when upgrading. 

 The query used by the existing rake task is: 

   DELETE FROM logs WHERE id in (SELECT logs.id FROM logs JOIN containers ON logs.object_uuid = containers.uuid WHERE event_type IN ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds')


 A far more efficient version of that query, determined empirically (though PostgreSQL-specific), is: 

   delete from logs using containers where logs.object_uuid=containers.uuid and logs.event_type in ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds'
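
 As a rough illustration of how the background job could assemble that statement, here is a minimal Ruby sketch. The method name delete_old_container_logs_sql and the constant CONTAINER_LOG_EVENT_TYPES are hypothetical, and the max_age_seconds argument stands in for Rails.configuration.Containers.Logging.MaxAge.to_i; only the SQL text itself comes from this ticket.

```ruby
# Event types holding container log output, as listed in the query above.
# (Constant name is hypothetical.)
CONTAINER_LOG_EVENT_TYPES = %w[stdout stderr arv-mount crunch-run crunchstat].freeze

# Hypothetical helper: build the PostgreSQL-specific DELETE ... USING statement.
# max_age_seconds stands in for Rails.configuration.Containers.Logging.MaxAge.to_i.
def delete_old_container_logs_sql(max_age_seconds)
  # Coerce to an integer before interpolating; non-numeric strings and nil raise.
  seconds = Integer(max_age_seconds)
  event_list = CONTAINER_LOG_EVENT_TYPES.map { |t| "'#{t}'" }.join(", ")
  "DELETE FROM logs USING containers " \
    "WHERE logs.object_uuid = containers.uuid " \
    "AND logs.event_type IN (#{event_list}) " \
    "AND containers.log IS NOT NULL " \
    "AND now() - containers.finished_at > interval '#{seconds} seconds'"
end

# Print the statement for a one-day lifetime.
puts delete_old_container_logs_sql(86_400)
```

 In a Rails context the resulting string could presumably be handed to ActiveRecord::Base.connection.execute, but how the job is scheduled and locked is left to the trash sweep pattern referenced above.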


 Don't forget to remove the delete_old_container_logs.rake file from the API server (or change it to a no-op) after the new background job is implemented, cf. https://dev.arvados.org/issues/18763#note-5 

 There are separate lifetimes for container logs and general audit logs; ensure that both are honored. 
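
 One way to picture "both are honored" is a per-row choice of lifetime. The Ruby sketch below is only an illustration under stated assumptions: the helper names max_age_for and expired? are hypothetical, the two lifetimes are passed in as plain numbers of seconds rather than read from configuration, and it assumes that every row whose event type is not container output falls under the audit log lifetime, which this ticket does not confirm.

```ruby
# Event types treated as container log output (taken from the queries above).
# (Constant name is hypothetical.)
LOG_OUTPUT_EVENT_TYPES = %w[stdout stderr arv-mount crunch-run crunchstat].freeze

# Hypothetical helper: pick the lifetime (in seconds) that governs a log row.
# Assumption: anything that is not container output is an audit log row.
def max_age_for(event_type, container_max_age:, audit_max_age:)
  LOG_OUTPUT_EVENT_TYPES.include?(event_type) ? container_max_age : audit_max_age
end

# Hypothetical helper: true when a row of the given age (in seconds) has
# outlived the lifetime that applies to its own category.
def expired?(event_type, age_seconds, container_max_age:, audit_max_age:)
  limit = max_age_for(event_type,
                      container_max_age: container_max_age,
                      audit_max_age: audit_max_age)
  age_seconds > limit
end
```

 Note that the queries above measure a container log's age from containers.finished_at rather than from the log row itself, so a real job would need one statement per category instead of a single shared cutoff.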
