Bug #18547

closed

[keep-balance] Avoid redundant indexing when multiple keepstore servers use a single NFS mount

Added by Tom Clegg 7 months ago. Updated 7 months ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Start date: 12/06/2021
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release relationship: Auto

Description

Background: Currently keep-balance detects when a storage device like an S3 bucket is used by multiple keepstore servers, and arbitrarily chooses one of them to get the index. However, this relies on the "device ID" returned by keepstore, which is:
  • s3://endpoint/bucketname, if the volume is an S3 bucket
  • block device UUID, if the volume is a local filesystem
  • empty, if the volume is a network-mounted filesystem

When the device ID is empty, keep-balance cannot tell whether two volumes are the same, so it indexes all of them. This is redundant when several keepstore servers share a single NFS mount.
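The device-ID behavior described above can be sketched roughly as follows. This is an illustration only, not the keep-balance source: the `Mount` type, its fields, and `mountsToIndexByDeviceID` are hypothetical names, and "arbitrarily chooses one" is modeled here as "first one seen".

```go
package main

// Mount is a hypothetical, simplified stand-in for one entry in the list of
// mounts reported by a keepstore server. It is not the real keep-balance type.
type Mount struct {
	Server   string // keepstore server that reported the mount
	DeviceID string // empty for network-mounted filesystems
}

// mountsToIndexByDeviceID returns the mounts whose index would be fetched
// under the device-ID approach: one mount per distinct non-empty device ID
// (the first one seen, as an assumption), plus every mount whose device ID
// is empty, because those cannot be shown to be duplicates.
func mountsToIndexByDeviceID(mounts []Mount) []Mount {
	seen := map[string]bool{}
	var out []Mount
	for _, m := range mounts {
		if m.DeviceID == "" {
			// Unknown device: index it, even if it is a shared NFS mount.
			out = append(out, m)
			continue
		}
		if !seen[m.DeviceID] {
			seen[m.DeviceID] = true
			out = append(out, m)
		}
	}
	return out
}
```

With two servers sharing one S3 bucket and one NFS mount, this sketch indexes the bucket once but the NFS mount twice, which is the redundancy this ticket is about.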

Now that each keepstore server uses the same configuration file, each configured volume has a unique UUID, and the volume UUID is returned in the list of mounts reported by keepstore (none of which were true when the "device ID" approach started), keep-balance should detect identical/duplicate mounts by comparing volume UUIDs instead of device IDs.

(Note this will confuse keep-balance if the config uses a single volume UUID to mount a local disk like "/data" on multiple keepstore machines. But the install docs explicitly describe not doing that, and it is not a kind of configuration we want to support. The worst outcome is that someone with this kind of wonky config would see a lot of blocks misreported as underreplicated or missing by keep-balance until they fix their config.)
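A minimal sketch of the proposed volume-UUID comparison, under the same caveat as the note above: every mount reporting a given UUID is assumed to be the same storage. The `Mount` type and `indexSourceByVolumeUUID` are hypothetical names for illustration, not the actual keep-balance code.

```go
package main

// Mount is a hypothetical, simplified stand-in for a mount reported by
// keepstore. UUID is the volume UUID from the shared configuration file.
type Mount struct {
	Server string
	UUID   string
}

// indexSourceByVolumeUUID picks one server per volume UUID to fetch the
// index from. Mounts that share a UUID are treated as the same volume, so
// only the first server seen for each UUID is kept (an arbitrary choice).
func indexSourceByVolumeUUID(mounts []Mount) map[string]string {
	source := map[string]string{}
	for _, m := range mounts {
		if _, ok := source[m.UUID]; !ok {
			source[m.UUID] = m.Server
		}
	}
	return source
}
```

If one UUID were reused for distinct local disks on several machines, this sketch would index only one of them, which matches the misreporting scenario described in the note.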


Subtasks 1 (0 open, 1 closed)

Task #18561: Review 18547-use-volume-uuid-not-device-id (Resolved, Peter Amstutz, 12/06/2021)


Related issues

Related to Arvados - Bug #18376: [keepstore] Avoid long-lived readdirent cookies in filesystem driver (Resolved, Tom Clegg, 11/16/2021)

Blocks Arvados - Story #18518: Release Arvados 2.3.2 (Resolved, Peter Amstutz, 12/06/2021)

Actions #1

Updated by Tom Clegg 7 months ago

  • Related to Bug #18376: [keepstore] Avoid long-lived readdirent cookies in filesystem driver added
Actions #2

Updated by Tom Clegg 7 months ago

  • Status changed from New to In Progress
Actions #3

Updated by Tom Clegg 7 months ago

Actions #4

Updated by Tom Clegg 7 months ago

Wondering whether we want a more limited version of this for 2.3.2 ("only use UUID if deviceID is empty") just in case the full version affects non-NFS setups in an unexpected way...
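For comparison, the more limited variant floated here ("only use UUID if deviceID is empty") might look like the sketch below. The `Mount` type, `dedupKey`, and the `device:`/`uuid:` prefixes are assumptions made for illustration; the prefixes only keep the two key spaces from colliding.

```go
package main

// Mount is a hypothetical, simplified stand-in for a mount reported by
// keepstore, carrying both identifiers discussed in this ticket.
type Mount struct {
	UUID     string // volume UUID from the shared config file
	DeviceID string // empty for network-mounted filesystems
}

// dedupKey returns the key used to decide whether two mounts are the same
// volume. The device ID is preferred when present, so setups that already
// report one behave as before; the volume UUID is used only as a fallback.
func dedupKey(m Mount) string {
	if m.DeviceID != "" {
		return "device:" + m.DeviceID
	}
	return "uuid:" + m.UUID
}
```

Two mounts would then be treated as duplicates exactly when `dedupKey` returns the same string for both.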

Actions #5

Updated by Tom Clegg 7 months ago

TODO: keep-balance should error out if two volumes return the same non-empty DeviceID.
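One way this check could be written, as a rough sketch rather than the code that was eventually merged. `Mount` and `checkDeviceIDs` are hypothetical names, and "two volumes" is interpreted here as two different volume UUIDs.

```go
package main

import "fmt"

// Mount is a hypothetical, simplified stand-in for a mount reported by
// keepstore.
type Mount struct {
	UUID     string // volume UUID from the shared config file
	DeviceID string // empty for network-mounted filesystems
}

// checkDeviceIDs returns an error if two mounts with different volume UUIDs
// report the same non-empty device ID. Empty device IDs are ignored, and the
// same volume reported by several servers is not treated as a conflict.
func checkDeviceIDs(mounts []Mount) error {
	uuidByDevice := map[string]string{}
	for _, m := range mounts {
		if m.DeviceID == "" {
			continue
		}
		if prev, ok := uuidByDevice[m.DeviceID]; ok && prev != m.UUID {
			return fmt.Errorf("volumes %s and %s report the same device ID %q",
				prev, m.UUID, m.DeviceID)
		}
		uuidByDevice[m.DeviceID] = m.UUID
	}
	return nil
}
```

A caller would run this once over the full list of mounts before balancing and stop if it returns a non-nil error.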

Actions #6

Updated by Peter Amstutz 7 months ago

  • Release set to 48
Actions #7

Updated by Peter Amstutz 7 months ago

Actions #8

Updated by Tom Clegg 7 months ago

18547-use-volume-uuid-not-device-id @ 24f140f9ed1a2180541c0c7cebf7572c5155fe27 -- developer-run-tests: #2829
  • error out if two volumes return the same non-empty DeviceID
Actions #9

Updated by Peter Amstutz 7 months ago

Tom Clegg wrote:

18547-use-volume-uuid-not-device-id @ 24f140f9ed1a2180541c0c7cebf7572c5155fe27 -- developer-run-tests: #2829
  • error out if two volumes return the same non-empty DeviceID

This LGTM. Could you please merge into both main and 2.3-dev?

Actions #10

Updated by Tom Clegg 7 months ago

  • Status changed from In Progress to Resolved

Applied in changeset arvados|920307882b3fe52a08b366a1c81e62f44ee639b9.

Actions #11

Updated by Tom Clegg 7 months ago

Merged, and cherry-picked e16866d0f and 24f140f9e onto 2.3-dev as 11864d817 and 56c37ef9b respectively.
