Bug #18339

Updated by Peter Amstutz 3 months ago

A customer reported a problem in which a large number of projects, which had been trashed two weeks previously, became eligible for deletion all at once, and the resulting sweep processes crushed the database.

# If the last sweep hasn't finished, a new, overlapping sweep operation is currently started anyway. Instead, it should probably try to take a lock (perhaps on an empty, special-purpose table), and if it can't acquire the lock, do nothing. The sweep task is checked periodically, so it will eventually run again once the lock is no longer held.
# The default value for TrashSweepInterval is 60s. Deleting trashed projects is a very low priority process, so the default interval could be much longer (e.g. 5m, 15m, or 60m).
# The delete operations should run in a transaction, to reduce overhead from auto-commit. However, if they take any locks, those locks will be held for the duration of the transaction.
# Destroying a group also destroys any permission links pointing to it. Ruby on Rails runs the before_destroy hook for each permission link, which does an update_permissions operation, which in turn takes the table lock on materialized_permissions. We can set @Thread.current[:suppress_update_permissions] = true@ to prevent permissions from being recomputed; we just need to be confident that direct updates to the permission table are correct.
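
A rough sketch of two of the ideas above: skipping a sweep when the previous one is still running, and suppressing permission recomputation only for the duration of the sweep. This is not the API server's code. The helper names @sweep_if_idle@ and @with_permission_updates_suppressed@ and the lock file path are made up for illustration, and a non-blocking file lock stands in for the database lock on a special-purpose table so that the example runs with only the Ruby standard library.

```ruby
require 'tmpdir'

# Hypothetical lock file; a stand-in for the empty, special-purpose table.
SWEEP_LOCK_PATH = File.join(Dir.tmpdir, 'trash_sweep_sketch.lock')

# Runs the block only if no other sweep currently holds the lock.
# Returns :skipped when the lock is busy, otherwise the block's result.
def sweep_if_idle(lock_path = SWEEP_LOCK_PATH)
  File.open(lock_path, File::RDWR | File::CREAT, 0o644) do |f|
    # With LOCK_NB, flock returns false instead of blocking when the
    # lock is already held elsewhere.
    return :skipped unless f.flock(File::LOCK_EX | File::LOCK_NB)
    begin
      yield
    ensure
      f.flock(File::LOCK_UN)
    end
  end
end

# Sets the flag named in this ticket while the block runs, then restores
# the previous value even if the block raises.
def with_permission_updates_suppressed
  previous = Thread.current[:suppress_update_permissions]
  Thread.current[:suppress_update_permissions] = true
  yield
ensure
  Thread.current[:suppress_update_permissions] = previous
end
```

A periodic task could then call @sweep_if_idle { with_permission_updates_suppressed { ... } }@ and simply return when it gets @:skipped@, relying on the next TrashSweepInterval tick to retry. In the real system the lock would need to live in the database so that it is shared by all API server processes, and the restore step matters so that the suppression flag does not leak into later requests handled by the same thread.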

Side Q: how should trash interact with 'role' groups? It appears a trashed 'role' group continues to be traversed for permissions until it is actually deleted from the system. Consider whether (a) trashing a role group should not be allowed if there are outgoing permissions, (b) putting it in the trash should also delete all the outgoing permissions, or (c) the concept of trash shouldn't apply to role groups at all, so that they are always deleted immediately.