Peter Amstutz, 06/13/2017 08:40 PM


Keep storage pools

Use cases

  • A user has the option to store some data in cheaper storage, but only certain data qualifies. Which data qualifies can be indicated on a per-collection basis.
  • A user wants data moved from "hot" to "cool" storage a certain amount of time after it has been generated.

Design

A "pool" is effectively a tagging scheme that specifies a subset of Keep servers where a block should preferentially be stored.

Pools are related to (but not the same thing as) Keep storage tiers. For some use cases, the assumption of a roughly linear relationship between slow/cheap and fast/expensive storage doesn't necessarily hold.

Each service has access to one or more storage pools. Storage pools are independent: there is no implied relationship between pools. Data assigned to a pool may still be sharded among multiple servers. Pools can be identified with labels or uuids instead of integers. The keep services table gains a column that lists the pools available at each service.
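
To illustrate the service-to-pool mapping described above, here is a minimal sketch in Python. The `KeepService` class, the `pools` field, and the labels "hot" and "archive" are all hypothetical; the design does not fix a schema or pool names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeepService:
    """One row of a hypothetical keep services table."""
    uuid: str
    pools: frozenset  # assumed column: labels of the pools this service offers


def services_for_pool(services, pool):
    """Return the services that advertise the given pool, in input order."""
    return [s for s in services if pool in s.pools]


# Example data (hypothetical): keep1 offers two independent pools.
SERVICES = [
    KeepService("keep0", frozenset({"hot"})),
    KeepService("keep1", frozenset({"hot", "archive"})),
    KeepService("keep2", frozenset({"archive"})),
]
```

A client writing to the "archive" pool would consider only `services_for_pool(SERVICES, "archive")`, and could still shard blocks among the services returned.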

When writing blocks, keepstore recognizes a header X-Keep-Pool and accepts or denies the block based on whether it can place the block in the designated pool. If the header is not supplied, keepstore should fall back to a default pool. The pool actually used should be reported as X-Keep-Pool in the response.

A keepstore mount is associated with a specific pool.
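
The write path described above might look roughly like the following sketch. It is not keepstore's actual handler: the `Mount` class, the `handle_put` function, the "default" pool label, and the 503 status for a denied write are assumptions made for illustration, and header lookup is simplified to an exact-case match.

```python
DEFAULT_POOL = "default"  # assumed label for the fallback pool


class Mount:
    """A hypothetical keepstore mount, tied to exactly one pool."""

    def __init__(self, name, pool):
        self.name = name
        self.pool = pool
        self.blocks = {}  # locator -> data, standing in for real storage


def handle_put(mounts, headers, locator, data, default_pool=DEFAULT_POOL):
    """Store a block in the requested pool; return (status, response_headers)."""
    pool = headers.get("X-Keep-Pool", default_pool)
    for mount in mounts:
        if mount.pool == pool:
            mount.blocks[locator] = data
            # Report the pool that was actually used.
            return 200, {"X-Keep-Pool": pool}
    # No mount serves this pool, so the write is denied (status is an assumption).
    return 503, {}
```

For example, with one mount in "default" and one in "archive", a request carrying `X-Keep-Pool: archive` lands on the second mount, a request without the header lands on the first, and a request for an unknown pool is refused.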

Collections may specify a desired pool for the blocks in the collection. Keep balance should move blocks to the desired pool. If multiple collections reference the same block in different pools, each pool should have a copy.
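
The rule that each referenced pool should hold a copy can be sketched as a set computation. This is only an illustration of the idea, not Keep balance's algorithm; the dictionary layout of a collection (`pool`, `blocks`) and both function names are hypothetical.

```python
from collections import defaultdict


def desired_pools(collections):
    """Map each block locator to the set of pools that should hold a copy."""
    wanted = defaultdict(set)
    for coll in collections:
        for locator in coll["blocks"]:
            wanted[locator].add(coll["pool"])
    return dict(wanted)


def plan_moves(wanted, actual):
    """Compare desired and actual placement for every referenced block.

    Returns (copies, trash): lists of (locator, pool) pairs to create
    and to remove. Blocks that no collection references are ignored here.
    """
    copies, trash = [], []
    for locator in sorted(wanted):
        have = actual.get(locator, set())
        copies += [(locator, p) for p in sorted(wanted[locator] - have)]
        trash += [(locator, p) for p in sorted(have - wanted[locator])]
    return copies, trash
```

If one collection in "hot" and another in "archive" both reference block b2, `desired_pools` yields both pools for b2, so a copy is planned for whichever pool lacks it. A real implementation would presumably complete the copies before trashing anything.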

Data management policies, for example "move data from hot storage to cold storage if not accessed after 1 month", should be implemented with additional tooling/scripts on top of the keepstore later.
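
As an example of such external tooling, a policy script could select the collections to reassign, as in this sketch. The `last_accessed` field, the "hot" and "cold" pool labels, and the approximation of one month as 30 days are all assumptions; the design does not say how access times would be tracked.

```python
from datetime import datetime, timedelta


def collections_to_cool(collections, now, max_idle=timedelta(days=30)):
    """Return uuids of collections in the "hot" pool idle for longer than max_idle."""
    return [
        c["uuid"]
        for c in collections
        if c["pool"] == "hot" and now - c["last_accessed"] > max_idle
    ]
```

The script would then set the desired pool of each returned collection to "cold" and leave the actual block movement to Keep balance.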

Updated by Peter Amstutz about 5 years ago · 5 revisions