Tom Clegg, 04/10/2013 06:04 PM
h1. Keep

Keep is a distributed content-addressable storage system designed for high performance in I/O-bound cluster environments.

Notable design goals and features include:

* High scalability
* Node-level redundancy
* Maximum overall throughput in a busy cluster environment
* Maximum data bandwidth from client to disk
* Minimum transaction overhead
* Elimination of disk thrashing (commonly caused by multiple simultaneous readers)
* Client-controlled redundancy

h2. Design

The following design features accomplish these goals:

* Data is transferred directly between the client and the physical node where the disk is installed.
* Data collections are encoded in large blocks (up to 64 MiB) to minimize short read/write operations.
* Each disk accepts only one block read or write operation at a time. This prevents disk thrashing and maximizes total throughput when many clients compete for a disk.
* The client controls storage redundancy directly by writing a block to multiple nodes, and can verify it just as easily by reading the block back from each of those nodes.
* Data block distribution is computed from the MD5 digest of the data block being stored or retrieved. This eliminates the need for a central or synchronized database of block storage locations.

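The last point, digest-based distribution, can be illustrated with a short sketch. This page does not specify how the MD5 digest is mapped to nodes, so the example substitutes rendezvous (highest-random-weight) hashing as a stand-in. The function names @probe_order@ and @nodes_for_block@, the node names, and the @replicas@ parameter are all hypothetical. The point is only that every client derives the same node ranking from the digest alone, with no lookup table.

```python
import hashlib
from typing import List


def probe_order(block_md5: str, nodes: List[str]) -> List[str]:
    """Rank nodes for a block using rendezvous hashing.

    Assumption: this weighting scheme is illustrative only; it is not
    taken from the Keep design described on this page.
    """
    def weight(node: str) -> str:
        # Mix the block digest with the node name. Any client that knows
        # the digest and the node list computes the same weight.
        return hashlib.md5((block_md5 + node).encode("utf-8")).hexdigest()

    # Highest weight first; the result does not depend on input order.
    return sorted(nodes, key=weight, reverse=True)


def nodes_for_block(data: bytes, nodes: List[str], replicas: int = 2) -> List[str]:
    """Pick the nodes a client would write to for the desired redundancy."""
    block_md5 = hashlib.md5(data).hexdigest()
    return probe_order(block_md5, nodes)[:replicas]
```

A reader would call @probe_order@ with the same digest and node list, then try the nodes in that order until one returns the block.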
h2. Components

The Keep storage system consists of data block read/write services, SDKs, and management agents.

The responsibilities of the Keep service are:

* Write data blocks
* When writing: ensure data integrity by comparing the client-supplied MD5 digest to the MD5 digest of the client-supplied data
* Read data blocks (subject to permission, which is determined by the system/metadata DB)
* Send read/write/error event logs to management agents

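The write-time integrity check can be sketched as follows. This is a minimal illustration, not the service's actual code: @check_write@ and @DigestMismatch@ are hypothetical names, and the sketch assumes the client sends the digest as a hexadecimal string.

```python
import hashlib


class DigestMismatch(ValueError):
    """Raised when the data does not hash to the digest the client claimed."""


def check_write(claimed_md5: str, data: bytes) -> str:
    """Return the verified digest, or raise if the data does not match it."""
    actual = hashlib.md5(data).hexdigest()
    # hexdigest() is lowercase, so normalize the client's value first.
    if actual != claimed_md5.lower():
        raise DigestMismatch("expected %s, got %s" % (claimed_md5, actual))
    return actual
```

Rejecting the write before anything reaches disk means a block corrupted in transit is never stored under a digest it does not match.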
The responsibilities of the SDK are:

* When writing: split data into chunks of up to 64 MiB
* When writing: encode directory trees as manifests
* When writing: write data to the desired number of nodes to achieve storage redundancy
* After writing: register a collection with Arvados
* When reading: parse manifests
* When reading: verify data integrity by comparing the locator to the MD5 digest of the retrieved data
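
Two of the SDK responsibilities above, chunking on write and verification on read, can be sketched briefly. The names @split_blocks@ and @verify_block@ are hypothetical. The sketch also assumes a locator is exactly the hexadecimal MD5 digest of the block; the actual locator format is not specified on this page and may carry additional fields.

```python
import hashlib
from typing import Iterator, Tuple

BLOCK_SIZE = 64 * 2**20  # 64 MiB, the upper bound on block size given above


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> Iterator[Tuple[str, bytes]]:
    """Yield (locator, block) pairs, each block at most block_size bytes."""
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Assumption: the locator is the bare MD5 hex digest of the block.
        yield hashlib.md5(block).hexdigest(), block


def verify_block(locator: str, data: bytes) -> bytes:
    """Return data unchanged if it hashes to locator; raise otherwise."""
    if hashlib.md5(data).hexdigest() != locator:
        raise ValueError("retrieved data does not match locator " + locator)
    return data
```

Because the locator is derived from the content, the reader needs no separate checksum: the identifier used to request a block is also what validates it.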