Actions
Keep manifest format » History » Revision 5
« Previous |
Revision 5/8
(diff)
| Next »
Tom Clegg, 06/13/2015 05:03 AM
- Table of contents
- Keep manifest format
Keep manifest format¶
Manifest v1¶
A manifest is utf-8 encoded text, consisting of one or more newline-terminated streams.
Each stream consists of three or more space-delimited tokens:- The first token is a stream name, consisting of one or more path components, delimited by
"/"
.- The first path component is always
"."
. - No path component is empty.
- No path component is "." or "..".
- The stream name never begins or ends with
"/"
.
- The first path component is always
- The second token is a data blob locator, consisting of one or more tokens, delimited by
"+"
, the first of which is an MD5 hexdigest.- If a subsequent token ("hint") in the locator is numeric, it indicates the size of the data blob, in bytes.
- If a hint starts with
"A"
, it is an authorization token (used by the Keep server to confirm that the block is readable by a specific API auth token).
- ...possibly followed by more data blob locators...
- The first token that is not a block locator, and all subsequent tokens, are file tokens.
- A file token has three parts, delimited by
":"
: position, size, filename. - Position and size are given in decimal, and are counted from the beginning of the first data blob.
- Filename may contain
"/"
characters, but must not start or end with"/"
, and must not contain"//"
. - Filename components (delimited by
"/"
) must not be"."
or".."
.
- A file token has three parts, delimited by
A manifest contains no TAB characters, nor other ASCII whitespace characters other than the spaces or newline delimiters specified above.
Normalized manifest v1¶
A normalized manifest has the following additional restrictions.- Streams are in alphanumeric order.
- Each stream name is unique within the manifest.
- Files within a stream are in alphanumeric order.
Concatenation(This can be impossible to accomplish without rewriting blobs.)stream_name/filename
is unique within the manifest.- Filename must not contain
"/"
.
An API call exists will exist to normalize a manifest.
POST /arvados/v1/collections/{hash}/normalize
- request body:
{"collection":{"manifest_text":"...."}}
{"uuid":"...","manifest_text":"..."}
- POST despite no side effects.
- Returns object with uuid even though no object was stored.
Manifest v2¶
(Early design stages)
Should probably include:- Structured format (JSON?)
- More than one level of indirection (e.g., manifest references block X, which references data blocks A,B,C)
- Specify hash algorithm with block hashes
Updated by Tom Clegg over 9 years ago · 8 revisions