Tom Clegg, 03/23/2015 08:55 PM

h1. Keep S3 gateway

See [[Keep service hints]] for more background.

h2. High level design

Each remote storage service (e.g., an S3 bucket) in use at a given Arvados installation is supported by one keep server process, running with a flag like @-volume=s3:/mappings:bucketname:s3credentials@ instead of @-volumes=/tmp/1,/tmp/2@.

h2. Specifics

Likely, some parts of keepproxy and keepstore should be refactored to share code more effectively.
* keepstore logs & answers client queries, verifies hashes, answers index/status queries, reads/writes data blocks on disk, enforces per-disk mutexes.
* keepproxy logs & answers client queries, verifies hashes, connects to other keep services.
* keepgw logs & answers client queries, verifies hashes, answers index/status queries, reads/writes a local {hash, remote object} index, connects to remote services.

Possibilities:
* Refactor the keepstore command to consist of just the "unix volume" code; move everything else into packages like keep_server and hash_checking_reader. Create a new keepgw-s3 command.
* Extend the keepstore command to use backing-store modules like @-volume=unix:/foo@ and @-volume=s3:bucketid@.
* Extend the keepproxy command to use backing-store modules like S3 as an alternative to keep disk services.
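
For the second possibility, a minimal sketch of how a @-volume@ argument might be split into a backing-store module name and a module-specific remainder. The @VolumeSpec@ type and @ParseVolumeSpec@ function are hypothetical names, not part of keepstore, and treating a bare path as a unix volume is an assumption made here for backward compatibility.

```go
package main

import (
	"fmt"
	"strings"
)

// VolumeSpec is a hypothetical parsed form of one -volume argument.
type VolumeSpec struct {
	Kind string // backing-store module name: "unix" or "s3"
	Arg  string // module-specific remainder, e.g. "/foo" or "bucketid"
}

// ParseVolumeSpec splits "kind:arg" at the first colon.
// Assumption: a bare absolute path such as "/tmp/foo" means a unix volume.
func ParseVolumeSpec(s string) (VolumeSpec, error) {
	if s == "" {
		return VolumeSpec{}, fmt.Errorf("empty volume spec")
	}
	if strings.HasPrefix(s, "/") {
		return VolumeSpec{Kind: "unix", Arg: s}, nil
	}
	parts := strings.SplitN(s, ":", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return VolumeSpec{}, fmt.Errorf("invalid volume spec %q", s)
	}
	switch parts[0] {
	case "unix", "s3":
		return VolumeSpec{Kind: parts[0], Arg: parts[1]}, nil
	}
	return VolumeSpec{}, fmt.Errorf("unknown backing store %q", parts[0])
}

func main() {
	for _, s := range []string{"unix:/foo", "s3:bucketid", "/tmp/foo", "ftp:host"} {
		spec, err := ParseVolumeSpec(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> kind=%s arg=%s\n", s, spec.Kind, spec.Arg)
	}
}
```

Because the split happens only at the first colon, a longer argument such as @s3:/mappings:bucketname:s3credentials@ keeps its remaining fields together in @Arg@ for the S3 module to interpret.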

The {hash, remote object} mapping can be stored in the local filesystem.
* A given hash can map to more than one remote object. It's worth remembering all such remote objects: if one disappears or changes, a different one should be attempted next. Suggestion: For each hash, we have a text file with one line per remote data object matching the hash.
* When remote objects are bigger than 64 MiB, the mapping will actually be {hash, remote object segment}. This should be easy to manage if remote object references are always stored as @"offset:length:remote_object_path"@.

h2. Related changes

When using local filesystems as data stores, keepstore should accept @-volume=/tmp/foo -volume=/tmp/bar@ (in addition to @-volumes=/tmp/foo,/tmp/bar@ for backward compatibility). See https://golang.org/src/flag/example_test.go
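
A repeatable flag can be built on the standard @flag.Value@ interface, as the linked example does. The sketch below is not keepstore's actual code: @volumeList@ and @parseVolumes@ are hypothetical names, and listing legacy @-volumes@ entries before @-volume@ entries is an arbitrary choice.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// volumeList collects the value of every -volume occurrence.
// It implements flag.Value.
type volumeList []string

func (v *volumeList) String() string {
	if v == nil {
		return ""
	}
	return strings.Join(*v, ",")
}

// Set is called by the flag package once per occurrence of the flag.
func (v *volumeList) Set(s string) error {
	if s == "" {
		return fmt.Errorf("empty volume")
	}
	*v = append(*v, s)
	return nil
}

// parseVolumes accepts both the repeatable -volume flag and the
// legacy comma-separated -volumes flag, and returns the combined list.
func parseVolumes(args []string) ([]string, error) {
	fs := flag.NewFlagSet("keepstore", flag.ContinueOnError)
	var repeated volumeList
	legacy := fs.String("volumes", "", "comma-separated list of volumes (legacy)")
	fs.Var(&repeated, "volume", "volume to use; may be given more than once")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	var all []string
	if *legacy != "" {
		all = append(all, strings.Split(*legacy, ",")...)
	}
	all = append(all, repeated...)
	return all, nil
}

func main() {
	vols, err := parseVolumes([]string{"-volume=/tmp/foo", "-volume=/tmp/bar"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(vols)
}
```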