Keep storage tiers » History » Version 8
Peter Amstutz, 06/13/2017 08:03 PM
h1. Keep storage tiers

Typically, an Arvados cluster has access to multiple storage devices with different cost/performance trade-offs.

Examples:

* Local SSD
* Local HDD
* Object storage service provided by a cloud vendor
* Slower or less reliable object storage service provided by the same cloud vendor

Users should be able to specify a minimum storage tier for each collection. Arvados should ensure that every data block referenced by a collection is stored at the specified tier _or better_.

The cluster administrator should be able to specify a default tier, and assign a tier number to each storage device.

It should be possible to configure multiple storage devices at the same tier: for example, this allows blocks to be distributed more or less uniformly across several (equivalent) cloud storage buckets for performance reasons.

h1. Implementation (proposal)

Storage tier features (and implementation) are similar to replication-level features.

h2. Configuration

Each Keep volume has an integer parameter, "tier". Interpretation is site-specific, except that when M≤N, tier M can satisfy a requirement for tier N, i.e., smaller tier numbers are better. Some volume drivers are capable of discovering the tier number for a volume by inspecting the underlying storage device (e.g., a cloud storage bucket), but in all cases a sysadmin can specify a value.

There is a site-wide default tier number which is used for collections that do not specify a desired tier. Typically this is tier 1.

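The tier-ordering rule above can be sketched as follows (function and constant names here are illustrative, not keepstore's actual code):

```go
package main

import "fmt"

// Site-wide default tier, used for collections that do not specify one.
const defaultTier = 1

// tierSatisfies reports whether a volume at tier "have" can satisfy a
// requirement for tier "want": tier M satisfies tier N whenever M <= N,
// i.e., smaller tier numbers are better.
func tierSatisfies(have, want int) bool {
	return have <= want
}

func main() {
	fmt.Println(tierSatisfies(1, 2)) // a tier-1 volume can hold tier-2 data: true
	fmt.Println(tierSatisfies(3, 2)) // a tier-3 volume cannot: false
}
```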
h2. Storing data at a non-default tier

Tools that write data to Keep should allow the caller to specify a storage tier. The desired tier is sent to Keep services as a header (@X-Keep-Desired-Tier@) with each write request. Keep services return an error when the data cannot be written to the requested tier (or better).

h2. Moving data between tiers

Each collection has an integer field, "tier_desired". If tier_desired is not null, all blocks referenced by the collection should be stored at the given tier (or better).

Keep-balance tracks the maximum allowed tier for each block, and moves blocks between tiers as needed. The strategy is similar to fixing rendezvous probe order: if a block is stored at the wrong tier, a new copy is made at the correct tier; then, in a subsequent balancing operation, the redundant copy is detected and deleted. _This increases the danger of data loss due to races between concurrent keep-balance processes. Keep-balance should have a reliable way to detect/avoid concurrent balancing operations._

(Note: the following section uses the term "mount" to mean what the keepstore code base calls a "volume": i.e., an attachment of a storage device to a keepstore process.)

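How keep-balance might derive the strictest tier required for each block can be sketched as below; the types and the null-as-zero convention are illustrative, not keep-balance's actual data structures:

```go
package main

import "fmt"

// Illustrative stand-in for a collection record.
type collection struct {
	tierDesired int      // 0 stands in for null: use the site default
	blocks      []string // hashes of referenced blocks
}

const siteDefaultTier = 1

// desiredTiers returns, for each block, the strictest (lowest-numbered)
// tier required by any collection referencing it. A block stored only at
// a higher-numbered tier must be copied to the required tier; the
// redundant copy is then trashed in a later balancing run.
func desiredTiers(colls []collection) map[string]int {
	want := map[string]int{}
	for _, c := range colls {
		t := c.tierDesired
		if t == 0 {
			t = siteDefaultTier
		}
		for _, b := range c.blocks {
			if cur, ok := want[b]; !ok || t < cur {
				want[b] = t
			}
		}
	}
	return want
}

func main() {
	colls := []collection{
		{tierDesired: 2, blocks: []string{"b1", "b2"}},
		{tierDesired: 0, blocks: []string{"b2"}}, // null tier_desired: default applies
	}
	want := desiredTiers(colls)
	fmt.Println(want["b1"], want["b2"]) // b1 needs tier 2; b2 needs the stricter tier 1
}
```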
Keepstore provides APIs that let keep-balance see which blocks are stored on which mount points, and copy/delete blocks to/from specific mount points:

* Get current mounts
* List blocks on a specified mount
* Delete block from a specified mount
* Pull block to a specified mount

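The four operations above could take a shape like the following Go interface; the method names and the in-memory fake are hypothetical sketches for #11644, not taken from keepstore:

```go
package main

import "fmt"

type Mount struct {
	UUID string
	Tier int
}

// MountAPI collects the four mount-oriented operations keep-balance needs.
type MountAPI interface {
	Mounts() []Mount                 // get current mounts
	Index(mountUUID string) []string // list blocks on a specified mount
	Trash(mountUUID, block string)   // delete a block from a specified mount
	Pull(mountUUID, block string)    // copy a block onto a specified mount
}

// fakeStore is a toy in-memory implementation for illustration only.
type fakeStore struct {
	mounts []Mount
	blocks map[string]map[string]bool // mount UUID -> set of block hashes
}

func (s *fakeStore) Mounts() []Mount { return s.mounts }

func (s *fakeStore) Index(m string) []string {
	var out []string
	for b := range s.blocks[m] {
		out = append(out, b)
	}
	return out
}

func (s *fakeStore) Trash(m, b string) { delete(s.blocks[m], b) }

func (s *fakeStore) Pull(m, b string) {
	if s.blocks[m] == nil {
		s.blocks[m] = map[string]bool{}
	}
	s.blocks[m][b] = true
}

func main() {
	var api MountAPI = &fakeStore{
		mounts: []Mount{{UUID: "mnt-ssd", Tier: 1}, {UUID: "mnt-hdd", Tier: 2}},
		blocks: map[string]map[string]bool{"mnt-hdd": {"b1": true}},
	}
	api.Pull("mnt-ssd", "b1")  // first copy the block to the tier-1 mount
	api.Trash("mnt-hdd", "b1") // then drop the redundant tier-2 copy
	fmt.Println(len(api.Index("mnt-ssd")), len(api.Index("mnt-hdd"))) // 1 0
}
```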
h2. Reporting

After each rebalance operation, keep-balance logs a summary of discrepancies between actual and desired allocation of blocks to storage tiers. Examples:

* N blocks (M bytes) are stored at tier 3 but are referenced by collections at tier 2.
* N blocks (M bytes) are stored at tier 1 but are not referenced by any collection at tier T<2.

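The first kind of discrepancy could be tallied as sketched below; the record type is a made-up simplification of what keep-balance actually tracks:

```go
package main

import "fmt"

// Illustrative record of one block's placement versus its requirement.
type blockInfo struct {
	size        int64
	storedTier  int // best (lowest-numbered) tier the block is stored at
	desiredTier int // strictest tier desired by referencing collections
}

// misplaced counts blocks and bytes stored only at a worse
// (higher-numbered) tier than desired, i.e. the discrepancy that
// keep-balance would log after a rebalance operation.
func misplaced(blocks []blockInfo) (n int, bytes int64) {
	for _, b := range blocks {
		if b.storedTier > b.desiredTier {
			n++
			bytes += b.size
		}
	}
	return
}

func main() {
	blocks := []blockInfo{
		{size: 64, storedTier: 3, desiredTier: 2}, // misplaced
		{size: 32, storedTier: 1, desiredTier: 2}, // better than required: fine
		{size: 16, storedTier: 3, desiredTier: 1}, // misplaced
	}
	n, bytes := misplaced(blocks)
	fmt.Printf("%d blocks (%d bytes) are stored at a worse tier than desired\n", n, bytes)
}
```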
h1. Development tasks

* #11644 keepstore: mount-oriented APIs
* #11645 keepstore: configurable tier per volume/mount
* #11646 keepstore: support @X-Keep-Desired-Tier@ header
* apiserver: collections.tier_desired column, site default tier
* keep-balance: report storage tier equilibrium

h1. Alternate proposal (PA)

On further consideration of customer needs, the original assumption of a roughly linear relationship between slow/cheap and fast/expensive storage doesn't necessarily hold.

Propose that storage tiers are independent: there is no implied relationship between tiers. Tiers can be identified with labels or UUIDs instead of integers. The keep services table adds a column listing which tiers are available at which services.

When writing blocks, keepstore recognizes a header @X-Keep-Tier@ and accepts or denies the block based on whether it can place the block in the designated tier. If the header is not supplied, keepstores should have a default tier. The value of @X-Keep-Tier@ should be reported in the response.

A keepstore mount is associated with a specific tier.

Collections specify the desired tier for the blocks in the collection. Keep-balance should move blocks to the desired tier. If multiple collections reference the same block in different tiers, each tier should have a copy.
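Since tiers are independent labels under this proposal, the set of copies a block needs is the union of the tiers of all collections referencing it. A sketch, with made-up tier labels:

```go
package main

import "fmt"

// tiersPerBlock computes, for each block, the set of tier labels in
// which a copy is required. The input maps a tier label to the blocks
// that collections in that tier reference. Labels here ("ssd",
// "archive") are invented for illustration.
func tiersPerBlock(byTier map[string][]string) map[string]map[string]bool {
	need := map[string]map[string]bool{}
	for tier, blocks := range byTier {
		for _, b := range blocks {
			if need[b] == nil {
				need[b] = map[string]bool{}
			}
			need[b][tier] = true
		}
	}
	return need
}

func main() {
	need := tiersPerBlock(map[string][]string{
		"ssd":     {"b1", "b2"},
		"archive": {"b2"},
	})
	// b1 needs a copy in one tier; b2 needs a copy in both.
	fmt.Println(len(need["b1"]), len(need["b2"]))
}
```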