Keep service hints » History » Version 13
Tom Clegg, 04/02/2015 03:45 PM
1 | 9 | Tom Clegg | {{>toc}} |
---|---|---|---|
2 | |||
3 | 1 | Tom Clegg | h1. Keep service hints |
4 | |||
5 | 9 | Tom Clegg | h2. Overview |
6 | |||
7 | 1 | Tom Clegg | h3. Objective |
8 | |||
9 | 10 | Tom Clegg | Service hints are the mechanism used by Keep client libraries to read data that is _not_ stored in Keep, by making use of Keep gateway services. |
10 | |||
11 | How the gateway services work is not covered here: this document only addresses how clients should decide _when_ to use a gateway service, and _which one_ to use. |
||
12 | |||
13 | Service hints do not address the matter of _writing_ data to remote services, or de-duplicating writes across various services: if a client reads some data from a remote service using arv-get, and writes it back using arv-put, this will result in an additional local copy. |
||
14 | |||
15 | The intended audience for this document is software engineers. |
||
16 | |||
17 | h3. Background |
||
18 | |||
19 | 11 | Tom Clegg | h4. Motivation |
20 | |||
21 | 7 | Tom Clegg | Users should be able to create, manage, share, and run jobs on collections whose underlying data is stored in remote services like Amazon S3 and Google Cloud Storage. Users (and their compute jobs) should use the same tools and interfaces, regardless of whether the data is stored in such a remote service or natively in Keep. |
22 | |||
23 | 1 | Tom Clegg | Examples: |
24 | 5 | Tom Clegg | * A compute job that processes locally stored data should not have to be modified at all in order to process remote data. |
25 | * A user should be able to use Workbench to share a collection with another user, without knowing whether the underlying data is stored locally or in a remote service. |
||
26 | 13 | Tom Clegg | * Arvados should be able to move data from one storage system to another without disrupting users. For example, the @portable_data_hash@[1] of a collection must not change when the underlying data moves. |
27 | 1 | Tom Clegg | * It should be possible for a collection to reference some data stored in remote service A, some data stored in remote service B, and some data stored on local Keep disks. |
28 | 12 | Tom Clegg | |
29 | fn1. The @portable_data_hash@ attribute of a Collection record is a cryptographic hash of the data. See http://doc.arvados.org/api/schema/Collection.html (but note that this doc page is currently out of date). |
||
30 | 1 | Tom Clegg | |
31 | 11 | Tom Clegg | h4. Current behavior |
32 | 5 | Tom Clegg | |
33 | 4 | Tom Clegg | Currently, in order to use Arvados to work with data stored in a remote service (e.g., use it as an input to a Crunch job), a user must download it from the remote service and store it in Keep, typically using a shell VM as an intermediary. |
34 | 10 | Tom Clegg | * <pre> |
35 | 5 | Tom Clegg | curl https://... | arv-put - |
36 | 1 | Tom Clegg | </pre> |
37 | 10 | Tom Clegg | * This is inefficient. The entire dataset must be transferred from the source to the shell VM, and from there to Keep. |
38 | 5 | Tom Clegg | * It is inconvenient. A user must figure out which data _might_ be used in a given process, and download all of it to Keep before starting. |
39 | 1 | Tom Clegg | * It uses a lot of storage space (in a typical use case, this approach stores two additional copies of the data on Keep disks, even though the user does not desire additional replication beyond what is provided by the remote service). |
40 | * It is error-prone: it is easy for a user's "download and store in Keep" script to miss checking an exit code and store an incomplete dataset, and this might only be discovered much later (or not at all). |
||
41 | 11 | Tom Clegg | |
42 | h3. Alternatives |
||
43 | 1 | Tom Clegg | |
44 | Client libraries could communicate directly with non-Keep services. |
||
45 | * It would be impossible to use Arvados permission controls. |
||
46 | * An N×M array of code would have to be maintained in order to support N backing services from M SDK languages. |
||
47 | 6 | Tom Clegg | * The API server would have to maintain the mapping of hashes to remote data objects (and permissions for this map). |
48 | 1 | Tom Clegg | * It would be much more difficult (or impossible) to monitor usage. |
49 | |||
50 | Each keepstore server could know how to communicate with each non-Keep service in use. |
||
51 | * Simpler client code. |
||
52 | * Artificial link between keep disk services and gateway services (they couldn't be independently scaled or shut down for maintenance). |
||
53 | * External clients couldn't be given direct access to the third-party gateway services without also giving them direct access to the disk services. |
||
54 | * Either the keepstore servers would have to keep their hash-to-remote-object mappings synchronized, or the map of hash to remote service would be distributed across various servers. Both options introduce an unsuitable level of complexity: unlike in a native keepstore system, the underlying data is expected to change over time. |
||
55 | * When encountering an error (notably 404), client code would make many redundant attempts to read from various gateway services, based on the mistaken assumption that the various services have different sets of available data blocks. |
||
56 | |||
57 | h3. High level design |
||
58 | |||
59 | Clients interact with remote services through Keep gateway services. A gateway server responds to GET requests using the same protocol as a keepstore server. Instead of reading data from a local disk, though, it reads data from a remote service. Generally this means it maintains a local database mapping Keep locators (hashes) to remote data locators (and possibly credentials). From the client's perspective, it behaves exactly the same as any other Keep service: @"GET /locator"@ returns either an error, or a data block whose MD5 hex digest is equal to the first 32 characters of the locator. |
||
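The integrity rule above (the MD5 hex digest of the returned block equals the first 32 characters of the locator) can be checked on the client side roughly as follows. This is a minimal sketch: the function name @block_matches_locator@ is hypothetical and is not taken from any Arvados SDK.

```python
import hashlib


def block_matches_locator(locator: str, data: bytes) -> bool:
    """Return True if the MD5 hex digest of data equals the first
    32 characters of the locator (hypothetical helper)."""
    expected = locator[:32].lower()
    return hashlib.md5(data).hexdigest() == expected


# The example locator used in this document is the MD5 of the 3-byte block b"foo".
locator = "acbd18db4cc2f85cedef654fccc4a4d8+3+K1h9kt-bi6l4-20fty0xbp8l9wwe"
print(block_matches_locator(locator, b"foo"))
```

Because the check depends only on the locator and the returned bytes, it applies equally to a keepstore server and to a gateway service.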
60 | |||
61 | This means tools (see [[Keep S3 gateway]]) can create manifests with @+Kuuid@ hints, referencing data in remote storage services by indicating the UUID of a storage gateway capable of accessing it. |
||
62 | |||
63 | Each client library, when encountering a locator with a @+Kuuid@ hint, skips the usual rendezvous hashing algorithm. Instead of requesting a list of available services from the API server and sorting them in rendezvous order, it requests the particulars for the one specified service, and connects to it in order to retrieve the data. |
||
64 | 6 | Tom Clegg | |
65 | 1 | Tom Clegg | Aside from the choice of which Keep service to contact, the form and semantics of the "retrieve data" transaction are unchanged. |
66 | |||
67 | h2. Specifics |
||
68 | 6 | Tom Clegg | |
69 | h3. Detailed design |
||
70 | 5 | Tom Clegg | |
71 | 10 | Tom Clegg | With the existing client libraries, when a client reads a data block referenced by a manifest, it requests a list of "available keep services" from the API server, uses the rendezvous hashing algorithm to sort them, and contacts them in sorted order until the data is found. |
72 | |||
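For readers unfamiliar with rendezvous hashing, the sorting step can be sketched as below. This is a generic highest-random-weight illustration, not the exact weighting used by the Arvados client libraries: the weight function (MD5 of the block hash concatenated with the service UUID) and the name @rendezvous_order@ are assumptions made for this example.

```python
import hashlib
from typing import List


def rendezvous_order(block_hash: str, service_uuids: List[str]) -> List[str]:
    """Sort services by a per-block pseudo-random weight, highest first.

    Assumption: weight = MD5(block_hash + service_uuid). The real client
    libraries may compute the weight differently.
    """
    def weight(service_uuid: str) -> str:
        return hashlib.md5((block_hash + service_uuid).encode()).hexdigest()

    return sorted(service_uuids, key=weight, reverse=True)


# Hypothetical service UUIDs, for illustration only.
services = ["zzzzz-bi6l4-000000000000001",
            "zzzzz-bi6l4-000000000000002",
            "zzzzz-bi6l4-000000000000003"]
print(rendezvous_order("acbd18db4cc2f85cedef654fccc4a4d8", services))
```

The useful property is that every client computes the same probe order for a given block without coordination, and removing one service does not change the relative order of the others.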
73 | Current client libraries recognize hints of the form @+Kzzzzz@ (where zzzzz is an Arvados site prefix), indicating that the data should be retrieved from the Keep proxy service at a remote Arvados instance. |
||
74 | |||
75 | Service hints extend this approach by allowing the manifest to specify a Keep service endpoint for a data block. |
||
76 | |||
77 | 6 | Tom Clegg | A block locator provided by the API server in a manifest might have a hint of the form @+Kuuid@ where @uuid@ is the UUID of a keep service. In order to retrieve the block data, the client should look up the keep service with the given UUID, and perform an HTTP @GET@ request at the appropriate host and port. |
78 | 2 | Tom Clegg | |
79 | 1 | Tom Clegg | * Given @acbd18db4cc2f85cedef654fccc4a4d8+3+K1h9kt-bi6l4-20fty0xbp8l9wwe@, |
80 | 2 | Tom Clegg | ** Retrieve @https://1h9kt.arvadosapi.com/arvados/v1/keep_services/1h9kt-bi6l4-20fty0xbp8l9wwe@ to determine scheme, host, port |
81 | ** Retrieve data from @{scheme}://{host}:{port}/acbd18db4cc2f85cedef654fccc4a4d8+3+K1h9kt-bi6l4-20fty0xbp8l9wwe@ |
||
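The two-step lookup above can be sketched as a pair of URL builders. Several details here are assumptions: the API host is derived from the first five characters of the service UUID, mirroring the example (a real client would more likely use its configured API server), and the record field names @service_host@, @service_port@ and @service_ssl_flag@ are assumed rather than defined in this document. No network requests are made.

```python
def keep_service_record_url(service_uuid: str) -> str:
    """URL of the keep_services record for the given UUID.

    Assumption: the API host is {site prefix}.arvadosapi.com, as in the
    example in this document.
    """
    site_prefix = service_uuid.split("-", 1)[0]
    return ("https://%s.arvadosapi.com/arvados/v1/keep_services/%s"
            % (site_prefix, service_uuid))


def block_url(record: dict, locator: str) -> str:
    """URL for the GET request, built from a keep service record.

    Assumption: the record carries service_host, service_port and
    service_ssl_flag fields.
    """
    scheme = "https" if record["service_ssl_flag"] else "http"
    return "%s://%s:%d/%s" % (
        scheme, record["service_host"], record["service_port"], locator)


locator = "acbd18db4cc2f85cedef654fccc4a4d8+3+K1h9kt-bi6l4-20fty0xbp8l9wwe"
# Hypothetical record, as it might be returned by the API server.
record = {"service_host": "gateway.example",
          "service_port": 25107,
          "service_ssl_flag": True}
print(keep_service_record_url("1h9kt-bi6l4-20fty0xbp8l9wwe"))
print(block_url(record, locator))
```

The full locator, including the hint, is sent to the gateway unchanged, as in the example above.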
82 | |||
83 | As before, if a hint of the form @+K{prefix}@ is given (where @{prefix}@ is a string of five characters in @[0-9a-z]@), the client should perform a @GET@ request at @https://keep.{prefix}.arvadosapi.com/locator@. |
||
84 | |||
85 | 3 | Tom Clegg | * Given @acbd18db4cc2f85cedef654fccc4a4d8+3+K1h9kt@, |
86 | 1 | Tom Clegg | ** Retrieve data from @https://keep.1h9kt.arvadosapi.com/acbd18db4cc2f85cedef654fccc4a4d8+3+K1h9kt@ |
87 | 10 | Tom Clegg | |
88 | As before, if neither of the above hints is present, the client should use the rendezvous hashing algorithm on the list of available Keep services. |
||
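Taken together, the three cases above amount to a small decision function over the hints in a locator. The sketch below is illustrative only: the name @choose_read_strategy@ and the returned tuples are hypothetical, and the UUID pattern (three groups of 5, 5 and 15 characters in @[0-9a-z]@) is inferred from the example UUID rather than specified here.

```python
import re

# Assumption: a keep service UUID looks like the example in this document.
SERVICE_UUID_HINT = re.compile(r"^K([0-9a-z]{5}-[0-9a-z]{5}-[0-9a-z]{15})$")
# A five-character site prefix, as described above.
SITE_PREFIX_HINT = re.compile(r"^K([0-9a-z]{5})$")


def choose_read_strategy(locator: str):
    """Return (strategy, detail) for reading the block named by locator."""
    # Skip the hash itself; the size and other hints simply fail both patterns.
    for hint in locator.split("+")[1:]:
        match = SERVICE_UUID_HINT.match(hint)
        if match:
            # Look up this keep service by UUID, then GET the block from it.
            return ("keep_service", match.group(1))
        match = SITE_PREFIX_HINT.match(hint)
        if match:
            # GET the block from the Keep proxy at the remote site.
            url = "https://keep.%s.arvadosapi.com/%s" % (match.group(1), locator)
            return ("remote_proxy", url)
    # No service hint: fall back to rendezvous hashing over available services.
    return ("rendezvous", None)


print(choose_read_strategy("acbd18db4cc2f85cedef654fccc4a4d8+3+K1h9kt"))
```

If a locator carries more than one hint, this sketch acts on the first one it recognizes, which is a simplification.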
89 | 9 | Tom Clegg | |
90 | 1 | Tom Clegg | h3. Future work |
91 | 8 | Tom Clegg | |
92 | 6 | Tom Clegg | Arvados could manage service hints actively: for example, data manager could tag blocks with S3 bucket names, and API server could load-balance S3 gateways by selecting one of several available gateway UUIDs for a given block. (This would not require any further changes in client libraries.) |
93 | 8 | Tom Clegg | |
94 | 6 | Tom Clegg | Data manager could update manifests to reflect additional locations where data blocks can be retrieved: for example, @+Kuuid1+Kuuid2@ to signify that multiple remote gateways can retrieve the data, or @+K+Kuuid1@ to signify that the data is available locally _and_ via a remote gateway. (This would require some backward-compatible changes in client libraries.) |
95 | |||
96 | 1 | Tom Clegg | A gateway could permit Keep clients to write to a remote service. Service hints don't exist when data is being written, so clients would need some other way to decide when to write to a gateway server instead of a regular Keep disk/proxy service. |