Bug #20334
closed
Link deduplication migration is extremely slow
Description
Hi:
We just did an update on [CLUSTER] (the testing instance), and the migration took ~32 hours. Almost all of that time was spent on the `20221230155924_bigint_id.rb` migration, which changes links and entries on materialized transactions and does the usual temporary table construction / graph traversal; every link took 30 s. By my estimate, the actual ID upgrade to 64 bits took at most 2 hours, probably much less. We are thinking of running the dedup before the update as a rake task to avoid having the production instances in a degraded state for several days, but perhaps you should change the update notes for the benefit of other users, and also optimize the graph traversal, as I think it is currently quite an execution hot spot.
I think there is a better way to do this: it looks like the migration could be wrapped in a "batch_update_permissions" block, which would do a single full permission refresh at the end instead of many individual ones. I will check with Tom Clegg about it.
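As a rough illustration of the batching idea (not the actual Arvados implementation), here is a minimal standalone Ruby sketch. Only the name `batch_update_permissions` comes from this ticket; the `PermissionStore` class, `update_link`, `refresh_all`, and `refresh_count` are hypothetical stand-ins, and the "refresh" is just a counter in place of the expensive graph traversal.

```ruby
# Toy model of deferring an expensive refresh until the end of a batch.
# PermissionStore, update_link, refresh_all and refresh_count are
# hypothetical names used only for this illustration.
class PermissionStore
  attr_reader :refresh_count

  def initialize
    @refresh_count = 0   # how many full refreshes have run
    @batch_depth = 0     # > 0 while inside a batch block
    @dirty = false       # true when a refresh is owed
  end

  # Stand-in for a per-link change that would normally trigger a refresh.
  def update_link(_link_id)
    if @batch_depth > 0
      @dirty = true      # defer: just remember that a refresh is owed
    else
      refresh_all        # unbatched: refresh immediately, once per link
    end
  end

  # Run the block with refreshes suppressed, then refresh once if needed.
  # If the block raises, the exception propagates and no refresh is run.
  def batch_update_permissions
    @batch_depth += 1
    begin
      yield
    ensure
      @batch_depth -= 1
    end
    if @batch_depth.zero? && @dirty
      @dirty = false
      refresh_all
    end
  end

  private

  # Placeholder for the expensive temporary-table / graph-traversal work.
  def refresh_all
    @refresh_count += 1
  end
end

unbatched = PermissionStore.new
1000.times { |id| unbatched.update_link(id) }

batched = PermissionStore.new
batched.batch_update_permissions do
  1000.times { |id| batched.update_link(id) }
end

puts "unbatched refreshes: #{unbatched.refresh_count}"  # 1000
puts "batched refreshes:   #{batched.refresh_count}"    # 1
```

In this toy model the per-link cost disappears from the loop and is paid once at the end, which is the effect the proposed change is hoping for in the migration.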
Updated by Tom Clegg over 1 year ago
- Status changed from New to In Progress
20334-slow-dedup-migration @ 7ea0bf89df6f3400a1c96f8838a2a7a1e3706e48
Wraps the migration with batch_update_permissions.
(It would be even smoother to run this migration using a new apiserver package on a different node while the previous version is still serving clients. Most migrations are safe to run that way. But with the above fix, the migration may be fast enough that it's not worth the trouble.)
Updated by Tom Clegg over 1 year ago
- % Done changed from 0 to 100
- Status changed from In Progress to Resolved
Applied in changeset arvados|30951519f83d2e874f0928bc22c08db2864d163c.