Feature #11184

Updated by Tom Clegg almost 4 years ago

As an Arvados system administrator, I want to take advantage of the cool/cold storage classes offered by cloud vendors.

This involves designating desired storage class(es) in some way (collection, all collections in a project, etc.) as well as a way to migrate between storage classes.

[[Keep storage groups]] (previous [[Keep storage tiers]])

h2. Overview

* Each keep volume offers one or more storage classes (the default is just the "default" class).
* Each collection has one or more desired storage classes (the default is just the "default" class).
* When writing, clients _may_ specify one or more required storage classes; if not, the required class is "default". A keepstore server will only write the data on a volume that offers _all_ of the required classes.
* Keep-balance moves data to volumes that have the desired storage classes and updates the collection records to reflect the actual storage classes currently satisfied by all blocks in the collection (much like replication level, these are not necessarily equal to the desired classes). Keep-balance needs to track blocks by volume rather than by server as it currently does.
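The write rule and the keep-balance bookkeeping above can be read as set operations. The following is a minimal sketch, not keepstore or keep-balance code: the @Volume@ type and both function names are hypothetical, classes are assumed to be plain strings, and replication is ignored. It also assumes a block satisfies a class as soon as any volume holding it offers that class.

```python
from dataclasses import dataclass

DEFAULT_CLASS = "default"


@dataclass
class Volume:
    """Hypothetical stand-in for a keepstore volume."""
    name: str
    classes: frozenset = frozenset({DEFAULT_CLASS})


def writable_volumes(volumes, required=None):
    """Return the volumes offering ALL required classes.

    If the client names no classes, the required class is "default".
    """
    needed = set(required) if required else {DEFAULT_CLASS}
    return [v for v in volumes if needed <= v.classes]


def satisfied_classes(block_locations):
    """Return the classes satisfied by every block in a collection.

    block_locations maps block hash -> list of Volumes holding that block.
    Assumption: a class counts for a block if any volume holding the block
    offers it; it counts for the collection only if it counts for every block.
    """
    per_block = [
        set().union(*(v.classes for v in vols))
        for vols in block_locations.values()
    ]
    return set.intersection(*per_block) if per_block else set()
```

With one volume offering only "default" and another offering only a hypothetical "archive" class, a collection whose blocks are partly on each would report only the classes common to all of its blocks, which may differ from what the collection desires.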

If overlapping collections (i.e., collections with common data blocks) request different storage classes, keep-balance will maintain multiple copies of the common blocks if necessary to satisfy all collections' requirements.
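One way to picture the shared-block case: take the union of the classes desired by every collection that references a block, then pick enough volumes to cover that union. The sketch below is illustrative only. The function names and the greedy selection are assumptions, not the keep-balance algorithm, and each desired class is treated independently (the text does not say whether a single copy must satisfy all classes at once).

```python
def desired_classes_per_block(collections):
    """Map each block hash to the union of classes desired for it.

    collections is an iterable of (desired_classes, block_hashes) pairs.
    """
    wanted = {}
    for classes, blocks in collections:
        for block in blocks:
            wanted.setdefault(block, set()).update(classes)
    return wanted


def placements(wanted, volume_classes):
    """Greedily pick volumes until every wanted class is covered.

    volume_classes maps a (hypothetical) volume name -> set of classes offered.
    Returns the chosen volume names; one copy of the block per chosen volume.
    Raises ValueError if some wanted class is not offered by any volume.
    """
    remaining = set(wanted)
    chosen = []
    while remaining:
        # Prefer the volume covering the most still-uncovered classes;
        # sorting the names makes tie-breaking deterministic.
        best = max(
            sorted(volume_classes),
            key=lambda name: len(volume_classes[name] & remaining),
            default=None,
        )
        covered = volume_classes[best] & remaining if best is not None else set()
        if not covered:
            raise ValueError(f"no volume offers {sorted(remaining)}")
        chosen.append(best)
        remaining -= covered
    return chosen
```

If some volume offers every class wanted for a block, a single copy is enough; extra copies are only needed when no one volume covers the whole union.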

Microsoft Azure (pricing at 50-500 TB level)

* LRS-COOL $0.01/GB/mo, $0.01/10Kops + $0.01/GB
* LRS-HOT $0.0177/GB/mo, $0.05/10Kops
* GRS-COOL $0.02/GB/mo, $0.20/10Kops + $0.01/GB
* GRS-HOT $0.0354/GB/mo, $0.10/10Kops
* RAGRS-COOL $0.025/GB/mo, $0.20/10Kops + $0.01/GB
* RAGRS-HOT $0.0442/GB/mo, $0.10/10Kops

Amazon S3 (pricing at 50-500 TB level)

* Standard - $0.022/GB/mo, $0.004/10Kops (get)
* Infrequent Access - $0.0125/GB/mo, $0.01/10Kops (get)
* Glacier - $0.004/GB/mo + variable retrieval charge depending on speed

Google

* Multi-Regional Storage $0.026/GB/mo
* Regional Storage $0.02/GB/mo
* Nearline Storage $0.01/GB/mo, $0.01/GB retrieval charge
* Coldline Storage $0.007/GB/mo, $0.05/GB retrieval charge
* Optional bucket versioning

h2. Initial implementation

Simplifying restrictions:

* No client side support.
* No keepstore support for writing data to a given storage class.
* API server configuration specifies the set of classes that can be requested (in addition to "default", which is always available).
* Keepstore configuration specifies the set of classes offered by each volume.
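The two configuration restrictions (the API server lists the requestable classes, keepstore lists the classes each volume offers) suggest two simple checks. This is a sketch under stated assumptions: the function names and the shape of the inputs are hypothetical and do not correspond to actual Arvados configuration keys.

```python
DEFAULT_CLASS = "default"


def requestable_classes(api_configured):
    """Classes a collection may request: the configured set plus "default"."""
    return {DEFAULT_CLASS} | set(api_configured)


def validate_request(desired, api_configured):
    """Return the desired classes, or raise ValueError on an unknown class.

    An empty request falls back to the "default" class.
    """
    desired = set(desired)
    unknown = desired - requestable_classes(api_configured)
    if unknown:
        raise ValueError(f"unknown storage classes: {sorted(unknown)}")
    return desired or {DEFAULT_CLASS}


def unoffered_classes(api_configured, volume_classes):
    """Requestable classes that no volume offers (a likely misconfiguration).

    volume_classes maps a (hypothetical) volume name -> set of classes offered.
    """
    offered = set().union(*volume_classes.values())
    return requestable_classes(api_configured) - offered
```

A non-empty result from @unoffered_classes@ would mean a collection could request a class that keep-balance can never satisfy with the configured volumes.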