Keep manifest format » History » Version 3

Tom Clegg, 03/02/2015 10:05 PM

{{toc}}

h1. Keep manifest format

h2. Manifest v1

A manifest is UTF-8 encoded text, consisting of one or more newline-terminated streams.

Each stream consists of three or more space-delimited tokens:

* The first token is a stream name, consisting of one or more tokens, delimited by @"/"@, the first of which is always @"."@.
* The second token is a data blob locator, consisting of one or more tokens, delimited by @"+"@, the first of which is an MD5 hexdigest.
** If a subsequent token ("hint") in the locator is numeric, it indicates the size of the data blob, in bytes.
** If a hint starts with @"A"@, it is an authorization token (used by the Keep server to confirm that the block is readable by a specific API auth token).
* ...possibly followed by more data blob locators...
* The first token that is not a data blob locator, and all subsequent tokens, are file tokens.
** A file token has three parts, delimited by @":"@: position, size, filename.
** Position and size are decimal byte counts; position is counted from the beginning of the first data blob in the stream.
** Filename may contain @"/"@ characters, but must not start or end with @"/"@, and must not contain @"//"@.
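
The token rules above can be sketched as a small parser. This is a minimal illustration, not the Arvados implementation: the names @Locator@, @FileToken@, @parse_locator@, @parse_stream@ and @parse_manifest@ are hypothetical, and the locator pattern assumes a 32-character lowercase hexdigest.

```python
import hashlib
import re
from typing import List, NamedTuple, Optional, Tuple

# Assumption: a locator begins with a 32-character lowercase MD5 hexdigest,
# optionally followed by "+"-delimited hints.
LOCATOR_RE = re.compile(r"[0-9a-f]{32}(\+[^+\s]+)*")


class Locator(NamedTuple):
    md5: str
    size: Optional[int]  # numeric hint, if present
    auth: Optional[str]  # hint starting with "A", if present (kept verbatim)


class FileToken(NamedTuple):
    position: int
    size: int
    name: str


def parse_locator(token: str) -> Locator:
    md5, *hints = token.split("+")
    size, auth = None, None
    for hint in hints:
        if re.fullmatch(r"[0-9]+", hint):
            size = int(hint)
        elif hint.startswith("A"):
            auth = hint
    return Locator(md5, size, auth)


def parse_stream(line: str) -> Tuple[str, List[Locator], List[FileToken]]:
    tokens = line.split(" ")
    name = tokens[0]
    if name.split("/")[0] != ".":
        raise ValueError(f"stream name must start with '.': {name!r}")
    # Locators run from the second token up to the first non-locator token.
    i = 1
    while i < len(tokens) and LOCATOR_RE.fullmatch(tokens[i]):
        i += 1
    locators = [parse_locator(t) for t in tokens[1:i]]
    files = []
    for token in tokens[i:]:
        # maxsplit=2 keeps any further ":" characters inside the filename.
        position, size, filename = token.split(":", 2)
        files.append(FileToken(int(position), int(size), filename))
    if not locators or not files:
        raise ValueError("a stream needs at least one locator and one file token")
    return name, locators, files


def parse_manifest(text: str):
    # Streams are newline-terminated; skip empty pieces such as the one
    # after the final newline.
    return [parse_stream(line) for line in text.split("\n") if line]


# Example: one 6-byte blob holding two 3-byte files.
data = b"foobar"
locator = f"{hashlib.md5(data).hexdigest()}+{len(data)}"
manifest = f". {locator} 0:3:foo.txt 3:3:bar.txt\n"
for name, locators, files in parse_manifest(manifest):
    print(name, locators, files)
```

The sketch does not enforce the filename rules (no leading or trailing @"/"@, no @"//"@); a stricter parser would reject such tokens.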

h2. Normalized manifest v1

A normalized manifest has the following additional restrictions:

* Streams are in alphanumeric order.
* Each stream name is unique within the manifest.
* Files within a stream are in alphanumeric order.
* -Concatenation @stream_name/filename@ is unique within the manifest.- (This can be impossible to accomplish without rewriting blobs.)
* Filename must not contain @"/"@.
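
The restrictions above (excluding the struck-out uniqueness rule) could be checked roughly as follows. This is a sketch under two assumptions: "alphanumeric order" is taken to mean Python's default string ordering, and the input has already been reduced to stream names and filenames. The function name @normalization_problems@ is hypothetical.

```python
from typing import List, Tuple


def normalization_problems(streams: List[Tuple[str, List[str]]]) -> List[str]:
    """Return a list of violated restrictions; an empty list means none found.

    `streams` holds (stream_name, filenames) pairs in manifest order.
    Assumption: "alphanumeric order" means Python's default string ordering.
    """
    problems = []
    names = [name for name, _ in streams]
    if names != sorted(names):
        problems.append("streams are not in order")
    if len(set(names)) != len(names):
        problems.append("stream names are not unique")
    for name, filenames in streams:
        if filenames != sorted(filenames):
            problems.append(f"files in {name} are not in order")
        for filename in filenames:
            if "/" in filename:
                problems.append(f"filename {filename!r} in {name} contains '/'")
    return problems


print(normalization_problems([(".", ["a.txt", "b.txt"]), ("./sub", ["c.txt"])]))
print(normalization_problems([(".", ["b.txt", "a/c.txt"])]))
```

Returning a list of problems rather than a boolean makes it easier to see which restriction a manifest violates.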

An API call -exists- will exist to normalize a manifest.

Request:

* @POST /arvados/v1/collections/{hash}/normalize@
* request body: @{"collection":{"manifest_text":"...."}}@

Response:

* @{"uuid":"...","manifest_text":"..."}@

Notes:

* POST despite no side effects.
* Returns object with uuid even though no object was stored.
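
Since the call is only proposed, the following merely illustrates the request shape described above using Python's standard library; nothing is sent. The base URL @https://arvados.example@, the placeholder hash, the @Content-Type@ header and the function name @build_normalize_request@ are assumptions, and authentication is omitted because the page does not specify it.

```python
import json
import urllib.request


def build_normalize_request(base_url: str, collection_hash: str,
                            manifest_text: str) -> urllib.request.Request:
    """Build (but do not send) the proposed normalize request."""
    body = json.dumps({"collection": {"manifest_text": manifest_text}})
    return urllib.request.Request(
        url=f"{base_url}/arvados/v1/collections/{collection_hash}/normalize",
        data=body.encode("utf-8"),
        method="POST",  # POST despite no side effects, per the notes above
        headers={"Content-Type": "application/json"},  # assumed header
    )


# Placeholder values: the page does not say how {hash} is derived.
request = build_normalize_request(
    "https://arvados.example",
    "0123456789abcdef0123456789abcdef",
    ". 0123456789abcdef0123456789abcdef+6 0:6:foobar.txt\n",
)
print(request.get_method(), request.full_url)
```

A client would then read @manifest_text@ from the JSON response, bearing in mind that the returned @uuid@ does not refer to a stored object.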

h2. Manifest v2

(Early design stages)

Should probably include:

* Structured format (JSON?)
* More than one level of indirection (e.g., manifest references block X, which references data blocks A, B, C)
* Specify hash algorithm with block hashes