Revision 9 (Tom Clegg, 04/08/2014 12:58 PM) → Revision 10/13 (Tom Clegg, 04/08/2014 12:59 PM)

h1. Keep server 

 This page describes the Keep backing store server component, keepd. 

 {{toc}} 

 See also: 
 * [[Keep]] (overview, design goals, client/server responsibilities, intro to content addressing) 
 * [[Keep manifest format]] 
 * [[Keep index]] 
 * source:services/keep (implementation: in progress) 

 h2. Todo 

 * Implement server daemon (*in progress*) 
 * Implement integration test suite (*in progress*) 
 * Spec public/private key format and deployment mechanism 
 * Spec permission signature format 
 * Spec event-reporting API 
 * Spec quota mechanism 

 h2. Responsibilities 

 * Read and write blobs on disk 
 * Remember when each blob was last written[1] 
 * Enforce maximum blob size 
 * Enforce key=hash(value) during read and write 
 * Enforce permissions when reading data (according to permissions on Collections in the metadata DB) 
 * Enforce usage quota when writing data 
 * Delete blobs (only when requested by data manager!) 
 * Report read/write/exception events 
 * Report used & free space 
 * Report hardware status (SMART) 
 * Report list of blobs on disk (hash, size, time last stored) 

 fn1. This helps with garbage collection. Re-writing an already-stored blob should push it to the back of the garbage collection queue. Ordering garbage collection this way provides a fair and more or less predictable interval between write (from the client's perspective) and earliest potential deletion. 
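
 The ordering described in this footnote can be illustrated with a minimal sketch. It assumes the server keeps a mapping from blob hash to the Unix time of the most recent write; the names @gc_order@ and @last_written@ are hypothetical and not part of keepd.

```python
def gc_order(last_written):
    """Return blob hashes ordered oldest write first (front of the GC queue)."""
    return [h for h, _ in sorted(last_written.items(), key=lambda item: item[1])]


# Hypothetical state: blob hash -> Unix time of the most recent write.
last_written = {"aaa": 100, "bbb": 200, "ccc": 300}
before = gc_order(last_written)

# Re-writing "aaa" refreshes its timestamp, which moves it to the back
# of the queue even though no new data had to be stored.
last_written["aaa"] = 400
after = gc_order(last_written)
```

 Because only the timestamp changes on a re-write, a client can extend a blob's lifetime by writing it again.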

 h2. Other parties 

 * Client distributes data across the available Keep servers (using the content hash) 
 * Client attains initial replication level when writing blobs (by writing to multiple Keep servers) 
 * Data manager decides which blobs to delete (e.g., garbage collection, rebalancing) 
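
 This page does not specify how the client maps a content hash to servers. The sketch below shows one possible scheme (rendezvous hashing, swapped in here purely for illustration); @probe_order@, @write_targets@, and the server names are hypothetical.

```python
import hashlib


def probe_order(blob_hash, servers):
    """Rank servers for a blob by hashing the blob hash together with each server name."""
    def weight(server):
        return hashlib.md5((blob_hash + server).encode()).hexdigest()
    return sorted(servers, key=weight)


def write_targets(blob_hash, servers, replicas=2):
    """Pick the first `replicas` servers in probe order as write destinations."""
    return probe_order(blob_hash, servers)[:replicas]


# Hypothetical server list; real URIs come from the keep_disks API.
servers = ["keep0.example:25107", "keep1.example:25107", "keep2.example:25107"]
targets = write_targets("acbd18db4cc2f85cedef654fccc4a4d8", servers)
```

 Any client that sees the same server list computes the same order, so readers know where to look first without consulting a central index.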

 h2. Discovering Keep server URIs 

 * @GET https://endpoint/arvados/v1/keep_disks@ 
 * see http://doc.arvados.org/api/schema/KeepDisk.html 
 * Currently, the "list of Keep servers" is the list of unique {host,port} pairs across all Keep disks. (This could surely be improved.) 
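
 The de-duplication step can be sketched as follows. The field names @service_host@, @service_port@, and @service_ssl_flag@ are assumptions about the KeepDisk records and should be checked against the schema page; @keep_server_uris@ is a hypothetical helper.

```python
def keep_server_uris(keep_disks):
    """Collapse a list of Keep disk records into unique server URIs, preserving order."""
    seen = set()
    uris = []
    for disk in keep_disks:
        key = (disk["service_host"], disk["service_port"])
        if key in seen:
            continue  # another disk on a server we already listed
        seen.add(key)
        scheme = "https" if disk.get("service_ssl_flag") else "http"
        uris.append("%s://%s:%d" % (scheme, key[0], key[1]))
    return uris


# Hypothetical records: two disks on one server, one disk on another.
disks = [
    {"service_host": "keep0.example", "service_port": 25107, "service_ssl_flag": False},
    {"service_host": "keep0.example", "service_port": 25107, "service_ssl_flag": False},
    {"service_host": "keep1.example", "service_port": 25107, "service_ssl_flag": True},
]
uris = keep_server_uris(disks)
```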

 h2. Supported methods 

 For storage clients 
 * GET /hash 
 * GET /hash?checksum=true → verify checksum before sending 
 * POST / (body=content) → hash 
 * PUT /hash (body=content) → hash 
 * HEAD /hash → does it exist here? 
 * HEAD /hash?checksum=true → read the data and verify checksum 
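
 The key=hash(value) check behind @PUT /hash@ and @POST /@ might look like the sketch below. It assumes MD5 hex digests (consistent with the index.txt example on this page) and returns a @hash+size@ locator; @check_put@ and the 64 MiB limit are hypothetical, since this page does not state the maximum blob size.

```python
import hashlib

# Assumed limit for illustration only; the real maximum is not given on this page.
MAX_BLOB_SIZE = 64 * 2**20


def check_put(requested_hash, body, max_size=MAX_BLOB_SIZE):
    """Validate an incoming blob and return its locator.

    requested_hash is the hash from the URL (PUT /hash), or None for POST /.
    Raises ValueError if the blob is too large or the hash does not match.
    """
    if len(body) > max_size:
        raise ValueError("blob exceeds maximum size")
    actual = hashlib.md5(body).hexdigest()
    if requested_hash is not None and requested_hash != actual:
        raise ValueError("content does not match requested hash")
    return "%s+%d" % (actual, len(body))


locator = check_put("acbd18db4cc2f85cedef654fccc4a4d8", b"foo")
```

 The same comparison serves @?checksum=true@ reads: re-hash the bytes read from disk and refuse to send them if the digest differs from the requested hash.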

 For system (monitoring, indexing, garbage collection) 
 * DELETE /hash → delete all copies of this blob (requires privileged token!) 
 * GET /index.txt → get full list of blocks stored here, including size and timestamp of most recent PUT (requires privileged token) 
 * GET /status.json → get list of backing filesystems, disk fullness, IO counters, perhaps recent IO statistics (requires privileged token) 

 Example index.txt: 

 <pre> 
 37b51d194a7513e45b56f6524f2d51f2+3 1396976219 
 acbd18db4cc2f85cedef654fccc4a4d8+3 1396976187 
 </pre> 
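
 A consumer such as the data manager could parse this listing as sketched below, assuming each line has the form @hash+size timestamp@ as in the example above; @parse_index@ is a hypothetical helper.

```python
def parse_index(text):
    """Parse index.txt into (hash, size, last_put_time) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        locator, timestamp = line.split()
        blob_hash, size = locator.split("+", 1)
        entries.append((blob_hash, int(size), int(timestamp)))
    return entries


sample = """\
37b51d194a7513e45b56f6524f2d51f2+3 1396976219
acbd18db4cc2f85cedef654fccc4a4d8+3 1396976187
"""
entries = parse_index(sample)
```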

 Example status.json: 

 <pre><code class="javascript"> 
 { 
  "volumes":[ 
   {"mount_point":"/data/disk0","bytes_free":4882337792,"bytes_used":5149708288}, 
   {"mount_point":"/data/disk1","bytes_free":39614472192,"bytes_used":3314229248} 
  ] 
 } 
 </code></pre> 
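
 A monitoring client could derive per-volume fullness from this response as sketched below. It relies only on the @volumes@, @mount_point@, @bytes_free@, and @bytes_used@ fields shown in the example; @volume_fullness@ is a hypothetical helper.

```python
import json


def volume_fullness(status_text):
    """Map each mount point to the fraction of its space in use (0.0 to 1.0)."""
    status = json.loads(status_text)
    result = {}
    for vol in status["volumes"]:
        total = vol["bytes_free"] + vol["bytes_used"]
        result[vol["mount_point"]] = vol["bytes_used"] / total if total else 0.0
    return result


sample = """
{
 "volumes":[
  {"mount_point":"/data/disk0","bytes_free":4882337792,"bytes_used":5149708288},
  {"mount_point":"/data/disk1","bytes_free":39614472192,"bytes_used":3314229248}
 ]
}
"""
fullness = volume_fullness(sample)
```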

 h2. Authentication 

 * Client provides API token in Authorization header 
 * Config knob to ignore authentication & permissions (for fully-shared site, and help transition from Keep1) 

 

 h2. Permission 

 A signature token, unique to a {blob_hash, arvados_api_token, expiry_time}, establishes permission to read a block. 

 The controller and each Keep server have their own private keys. The public keys can be known to everyone, but only the controller and the Keep servers need them; clients don't need to verify signatures. 

 Writing: 
 * If the given hash and content agree, whether or not a disk write is required, Keep server adds a @+Asignature@expirytime@ portion to the returned blob locator. 
 * The API server @collections.create@ method verifies signatures before giving the current user can_read permission on the collection. 
 * A suitably intelligent client can notice that the expirytimes on its blob hashes are getting old, and refresh them by generating a partial manifest, calling @collections.create@ followed by @collections.get@, and optionally deleting the partial manifest(s) when the full manifest is written. If extra partial manifests are left around, garbage collection will take care of them eventually; the only odd side effect is the existence of partial manifests. *(Should there be a separate "refresh all of these tokens for me" API call to avoid creating these intermediate manifests?)* 
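
 The signature format is not yet specified (see Todo). As a rough illustration of a token that is unique to a {blob_hash, arvados_api_token, expiry_time}, the sketch below swaps in a shared-secret HMAC-SHA1 for the public/private key scheme described above and assumes a decimal Unix timestamp for the expiry time. @sign_locator@, @verify_locator@, and the secret are hypothetical.

```python
import hashlib
import hmac


def sign_locator(blob_hash, api_token, expiry, secret):
    """Return blob_hash with a +Asignature@expirytime hint appended."""
    message = ("%s %s %d" % (blob_hash, api_token, expiry)).encode()
    signature = hmac.new(secret, message, hashlib.sha1).hexdigest()
    return "%s+A%s@%d" % (blob_hash, signature, expiry)


def verify_locator(locator, api_token, secret, now):
    """Check that a signed locator is unexpired and was issued for this token.

    Assumes the locator carries a +A hint; a malformed one raises ValueError.
    """
    blob_hash, hint = locator.split("+A", 1)
    expiry = int(hint.split("@", 1)[1])
    if expiry < now:
        return False
    expected = sign_locator(blob_hash, api_token, expiry, secret)
    return hmac.compare_digest(expected, locator)


secret = b"hypothetical-shared-secret"
signed = sign_locator("acbd18db4cc2f85cedef654fccc4a4d8+3", "token-abc", 1396980000, secret)
```

 Because the API token is part of the signed message, a locator leaked to another user fails verification under that user's token.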

 Reading: 
 * The API server @collections.get@ method returns two manifests. One has plain hashes (this is the one whose content hash is the collection UUID). The other has a @+Asignature@expirytime@ portion on each blob locator. 
 * Keep server verifies signatures before honoring @GET@ requests. 
 * The signature might come from the Keep node itself, a different Keep node, or the API server. 
 * A suitably intelligent client can notice that the expirytime on its blob hashes is too old, and request a fresh set via @collections.get@.
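
 The client-side check in the last bullet could be sketched as follows, assuming the expiry time is a decimal Unix timestamp after the @ in the @+A@ hint (the actual encoding is not specified on this page). @locator_expiry@, @needs_refresh@, and the 300-second margin are hypothetical.

```python
def locator_expiry(locator):
    """Return the expiry time from a +Asignature@expirytime hint, or None if unsigned."""
    for hint in locator.split("+")[1:]:
        if hint.startswith("A") and "@" in hint:
            return int(hint.split("@", 1)[1])
    return None


def needs_refresh(locators, now, margin=300):
    """True if any signed locator expires within `margin` seconds of `now`."""
    for loc in locators:
        expiry = locator_expiry(loc)
        if expiry is not None and expiry - now < margin:
            return True
    return False


# Hypothetical signed locator; the signature value is a placeholder.
locators = ["acbd18db4cc2f85cedef654fccc4a4d8+3+Adeadbeef@1396980000"]
stale = needs_refresh(locators, now=1396979900)
```

 When @needs_refresh@ returns true, the client would call @collections.get@ again to obtain a freshly signed manifest.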