Feature #18863

Updated by Ward Vandewege about 2 years ago

As identified in https://dev.arvados.org/issues/18763#note-5, we have a "delete_old_container_logs" rake task that is supposed to run from a cron job to clear out old container logs.

 Following the pattern we started using for the trash sweeps (#18339), add a background job that executes the SQL query given below.

 Remove the rake task file from the repository. Add a note to the upgrade notes document that the cron job should be removed when upgrading.

 The query used by the existing rake task is 

   DELETE FROM logs WHERE id in (SELECT logs.id FROM logs JOIN containers ON logs.object_uuid = containers.uuid WHERE event_type IN ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds')


 Determined empirically, a far more efficient version of that (if PostgreSQL-specific) is

   delete from logs using containers where logs.object_uuid=containers.uuid and logs.event_type in ('stdout', 'stderr', 'arv-mount', 'crunch-run', 'crunchstat') AND containers.log IS NOT NULL AND now() - containers.finished_at > interval '#{Rails.configuration.Containers.Logging.MaxAge.to_i} seconds'
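
 As a rough illustration of how the new job might assemble that statement, here is a minimal Ruby sketch. The helper name delete_old_container_logs_sql and the constant CONTAINER_LOG_EVENT_TYPES are hypothetical, and the max_age_seconds argument stands in for Rails.configuration.Containers.Logging.MaxAge.to_i; none of this is taken from the actual implementation.

```ruby
# Event types whose log rows get removed (copied from the query above).
CONTAINER_LOG_EVENT_TYPES = %w[stdout stderr arv-mount crunch-run crunchstat].freeze

# Hypothetical helper: builds the PostgreSQL-specific DELETE ... USING statement.
# max_age_seconds is an assumption standing in for
# Rails.configuration.Containers.Logging.MaxAge.to_i.
def delete_old_container_logs_sql(max_age_seconds)
  # Integer() raises on non-numeric input, so nothing unexpected is interpolated.
  seconds = Integer(max_age_seconds)
  types = CONTAINER_LOG_EVENT_TYPES.map { |t| "'#{t}'" }.join(", ")
  "DELETE FROM logs USING containers " \
    "WHERE logs.object_uuid = containers.uuid " \
    "AND logs.event_type IN (#{types}) " \
    "AND containers.log IS NOT NULL " \
    "AND now() - containers.finished_at > interval '#{seconds} seconds'"
end

# Example value only: 14 days expressed in seconds.
puts delete_old_container_logs_sql(14 * 24 * 60 * 60)
```

 Inside a Rails process the resulting string could presumably be handed to ActiveRecord::Base.connection.execute; how the job is scheduled would follow whatever the trash sweep pattern (#18339) does.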


 Don't forget to remove the delete_old_container_logs.rake file from the API server after the new background job is implemented, cf. https://dev.arvados.org/issues/18763#note-5
