Bug #13427
closed [keep-balance] Handle volumes that are mounted simultaneously by multiple servers
Description
- keep0 mounts vol0 (rw), vol1 (ro)
- keep1 mounts vol1 (rw), vol0 (ro)
- keep2 mounts vol2 (rw), vol3 (ro)
- keep3 mounts vol3 (rw), vol2 (ro)
This setup is desirable when each block appears on only one backend volume, i.e., when the desired replication level is already provided by the backend. When a single keep server goes down, all blocks are still readable.
However, with this setup, the current keep-balance implementation will never move a block to a better rendezvous position. It sees N read-only replicas and concludes there's no point making more copies on different servers: it can't delete the read-only replicas, so making more replicas would result in permanent overreplication. If it paid attention to the device IDs reported by the servers, it could recognize that the read-only replicas are just different views of writable replicas it sees elsewhere, and ignore them.
Updated by Tom Clegg over 6 years ago
- In (*Balancer)Run(), de-duplicate devices after calling discoverMounts on all services. If the same device ID is reported by both read-only and read/write mounts, drop the read-only mounts entirely.
- In (*Balancer)balanceBlock(), track which devices are going to be used ("wantDev") and treat this like wantMnt: don't try to use the same device twice. (When a device is mounted by multiple servers, we should prefer the one in best rendezvous position, which depends on the block -- so we can't de-duplicate these ahead of time.)
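The two steps above can be sketched roughly as follows. This is a minimal illustration, not the actual keep-balance code: the `Mount` type and the `dedupMounts` and `pickMounts` helpers are hypothetical names, and the mounts passed to `pickMounts` are assumed to be already sorted in rendezvous order for the block in question (best first).

```go
package main

import "fmt"

// Mount is a hypothetical, simplified view of one server's mount of a volume.
type Mount struct {
	Server   string
	DeviceID string
	ReadOnly bool
}

// dedupMounts drops every read-only mount whose device ID is also reported
// by at least one read/write mount. Mounts without a device ID are kept,
// because nothing can be inferred about them.
func dedupMounts(mounts []Mount) []Mount {
	rw := map[string]bool{}
	for _, m := range mounts {
		if !m.ReadOnly && m.DeviceID != "" {
			rw[m.DeviceID] = true
		}
	}
	var out []Mount
	for _, m := range mounts {
		if m.ReadOnly && rw[m.DeviceID] {
			continue // just another view of a writable device seen elsewhere
		}
		out = append(out, m)
	}
	return out
}

// pickMounts selects up to want mounts from a list that the caller has
// already sorted in rendezvous order for one block. It never selects the
// same device twice, so the best-positioned mount of each device wins.
func pickMounts(ordered []Mount, want int) []Mount {
	wantDev := map[string]bool{}
	var picked []Mount
	for _, m := range ordered {
		if len(picked) >= want {
			break
		}
		if m.DeviceID != "" {
			if wantDev[m.DeviceID] {
				continue // device already chosen via a better-positioned mount
			}
			wantDev[m.DeviceID] = true
		}
		picked = append(picked, m)
	}
	return picked
}

func main() {
	mounts := []Mount{
		{Server: "keep0", DeviceID: "vol0", ReadOnly: false},
		{Server: "keep0", DeviceID: "vol1", ReadOnly: true},
		{Server: "keep1", DeviceID: "vol1", ReadOnly: false},
		{Server: "keep1", DeviceID: "vol0", ReadOnly: true},
	}
	for _, m := range dedupMounts(mounts) {
		fmt.Printf("kept: %s mounts %s\n", m.Server, m.DeviceID)
	}

	// A device mounted read/write by two servers: only one mount is chosen.
	ordered := []Mount{
		{Server: "keepA", DeviceID: "dev0"},
		{Server: "keepB", DeviceID: "dev0"},
		{Server: "keepC", DeviceID: "dev1"},
	}
	for _, m := range pickMounts(ordered, 2) {
		fmt.Printf("picked: %s on %s\n", m.Server, m.DeviceID)
	}
}
```

The read-only de-duplication can happen once per run because it does not depend on the block, while the per-device choice has to happen per block because rendezvous order differs from block to block.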
Updated by Tom Morris over 6 years ago
- Target version changed from To Be Groomed to Arvados Future Sprints
Updated by Tom Morris over 6 years ago
- Target version changed from Arvados Future Sprints to 2018-06-06 Sprint
Updated by Tom Clegg over 6 years ago
- Status changed from New to In Progress
- Target version changed from 2018-06-06 Sprint to 2018-06-20 Sprint
Updated by Tom Clegg over 6 years ago
13427-multiple-mounts @ c9143544609d90da33eb3c2d566fc5d6a25188b2
Updated by Tom Clegg over 6 years ago
- Fix reported stats (count 1 replica, not 2, if a block appears twice on the same device ID at different mounts)
- De-duplicate index calls for RW-mounted devices (retrieve each index once, and apply it to all mounts with the same device ID)
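A rough sketch of both changes, under stated assumptions: the `Mount` type, `deviceKey`, `countReplicas`, and `fetchIndexes` are hypothetical names used for illustration only, the `fetch` callback stands in for whatever retrieves a mount's index, and a mount that reports no device ID is treated as its own device. The sketch also ignores the read-only/read-write distinction mentioned above.

```go
package main

import "fmt"

// Mount is a hypothetical, simplified mount record.
type Mount struct {
	UUID     string
	DeviceID string
}

// deviceKey returns the key used to group mounts. A mount with no device ID
// is assumed to be a device of its own.
func deviceKey(m Mount) string {
	if m.DeviceID == "" {
		return "mount:" + m.UUID
	}
	return "device:" + m.DeviceID
}

// countReplicas counts distinct devices among the mounts holding a block,
// so two mounts of the same device count as one replica.
func countReplicas(holders []Mount) int {
	seen := map[string]bool{}
	for _, m := range holders {
		seen[deviceKey(m)] = true
	}
	return len(seen)
}

// fetchIndexes calls fetch once per device and applies the result to every
// mount that reports the same device ID. The result is keyed by mount UUID.
func fetchIndexes(mounts []Mount, fetch func(Mount) ([]string, error)) (map[string][]string, error) {
	byDevice := map[string][]string{}
	byMount := make(map[string][]string, len(mounts))
	for _, m := range mounts {
		key := deviceKey(m)
		idx, ok := byDevice[key]
		if !ok {
			var err error
			idx, err = fetch(m)
			if err != nil {
				return nil, fmt.Errorf("index for mount %s: %w", m.UUID, err)
			}
			byDevice[key] = idx
		}
		byMount[m.UUID] = idx
	}
	return byMount, nil
}

func main() {
	mounts := []Mount{
		{UUID: "mnt-a", DeviceID: "vol0"},
		{UUID: "mnt-b", DeviceID: "vol0"},
		{UUID: "mnt-c", DeviceID: "vol1"},
	}
	fmt.Println("replicas:", countReplicas(mounts[:2]))

	calls := 0
	indexes, err := fetchIndexes(mounts, func(m Mount) ([]string, error) {
		calls++
		return []string{"block-on-" + m.DeviceID}, nil
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("index calls:", calls, "mounts covered:", len(indexes))
}
```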
Updated by Tom Clegg over 6 years ago
- Status changed from In Progress to Resolved