Objects as pseudo-blocks in Keep » History » Version 2

Peter Amstutz, 05/28/2024 08:45 PM

h1. Objects as pseudo-blocks in Keep

Idea for accessing external objects via Keep (specifically S3).

The approach we've bounced around for a while has been to take an object, split it into 64 MiB blocks, and record each block hash in a database along with a reference to the object and offset.

Here is a different approach to this idea.  (Tom floated a version of this at one of our engineering meetings, but I don't think we fully explored it at the time.)

For an S3 object 1234 bytes long located at s3://bucket/key:

<pre>
ffffffffffffffffffffffffffffffff+512+B(base64 encoding of s3://bucket/key)+C256
</pre>

The @ffff...@ prefix indicates that this is a special block (we could also use @0000...@, @0f0f0f...@, etc.).  Another idea would be to use a hash of the size and the @+B@ and @+C@ hints.  Alternatively, S3 also offers checksums of files, so we could use the MD5 of the full object.

In the example locator above:

* The block is 512 bytes long.
* The hint @+B@ means the data should be fetched from the s3:// URL, which is base64-encoded (this is necessary to match our locator syntax).
* The hint @+C@ means read starting at byte offset 256 (see the sketch after this list).

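A minimal sketch of constructing and parsing such a locator, assuming the @+B@/@+C@ hint letters proposed above and URL-safe base64 (standard base64 output can contain @+@ and @/@, which would collide with the @+@-separated locator syntax, and the @=@ padding would also need to be allowed by the locator grammar).  The helper names here are hypothetical:

<pre><code class="python">
import base64

PSEUDO_HASH = "f" * 32  # sentinel "hash" marking a pseudo-block


def make_pseudo_block_locator(s3_url, size, offset):
    """Build a pseudo-block locator like ffff...+512+B<encoded url>+C256.

    Uses URL-safe base64 so the encoded URL contains no '+' or '/',
    which would conflict with the '+'-separated locator syntax.
    """
    encoded = base64.urlsafe_b64encode(s3_url.encode()).decode()
    return f"{PSEUDO_HASH}+{size}+B{encoded}+C{offset}"


def parse_pseudo_block_locator(locator):
    """Return (s3_url, size, offset) from a pseudo-block locator."""
    parts = locator.split("+")
    if parts[0] != PSEUDO_HASH:
        raise ValueError("not a pseudo-block locator")
    size = int(parts[1])
    url = offset = None
    for hint in parts[2:]:
        if hint.startswith("B"):
            url = base64.urlsafe_b64decode(hint[1:]).decode()
        elif hint.startswith("C"):
            offset = int(hint[1:])
    return url, size, offset


# Example: a 512-byte pseudo-block starting at offset 256 of s3://bucket/key
loc = make_pseudo_block_locator("s3://bucket/key", 512, 256)
print(loc)
print(parse_pseudo_block_locator(loc))
</code></pre>
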
Large files can be split, e.g.

<pre>
ffffffffffffffffffffffffffffffff+67108864+B(base64 encoding of s3://bucket/key)+C0 ffffffffffffffffffffffffffffffff+67108864+B(base64 encoding of s3://bucket/key)+C67108864 ffffffffffffffffffffffffffffffff+67108864+B(base64 encoding of s3://bucket/key)+C134217728
</pre>

However, this repeats the @+B@ portion a bunch of times, so we could allow the manifest to describe oversized blocks:

<pre>
ffffffffffffffffffffffffffffffff+1000000000+B(base64 encoding of s3://bucket/key)+C0
</pre>

Implementation-wise, this would be split into 64 MiB chunks at runtime when the manifest is loaded.  The block cache would need to use the full locator (with @+B@ and @+C@).

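A sketch of that runtime expansion under the same assumed locator format: an oversized pseudo-block locator is rewritten into a sequence of 64 MiB (or smaller, for the tail) pseudo-block locators, each reusing the @+B@ hint with an adjusted @+C@ offset.  The function name is hypothetical:

<pre><code class="python">
import base64

PSEUDO_HASH = "f" * 32
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MiB


def expand_oversized_locator(locator):
    """Split an oversized pseudo-block locator into <=64 MiB pseudo-blocks."""
    parts = locator.split("+")
    size = int(parts[1])
    b_hint = next(p for p in parts[2:] if p.startswith("B"))
    base_offset = int(next(p for p in parts[2:] if p.startswith("C"))[1:])
    chunks = []
    pos = 0
    while pos < size:
        chunk_size = min(BLOCK_SIZE, size - pos)
        # Same +B hint, offset shifted by the chunk's position in the object.
        chunks.append(f"{PSEUDO_HASH}+{chunk_size}+{b_hint}+C{base_offset + pos}")
        pos += chunk_size
    return chunks


encoded = base64.urlsafe_b64encode(b"s3://bucket/key").decode()
oversized = f"{PSEUDO_HASH}+1000000000+B{encoded}+C0"
for chunk in expand_oversized_locator(oversized)[:3]:
    print(chunk)
</code></pre>
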
Add support for locators of this type to Keepstore, which already has code to interact with S3 buckets.  This avoids adding such code to the client.

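Keepstore itself is written in Go; purely as an illustration of the logic (in Python with boto3, with hypothetical function names), servicing a read of such a locator amounts to decoding the @+B@ URL, computing the byte range from @+C@ and the block size, and issuing a ranged GET:

<pre><code class="python">
import base64
from urllib.parse import urlparse

import boto3


def fetch_pseudo_block(locator):
    """Fetch the bytes for a pseudo-block locator via a ranged S3 GET.

    Assumes the locator format proposed above:
      ffff...+<size>+B<urlsafe-base64 of s3://bucket/key>+C<offset>
    """
    parts = locator.split("+")
    size = int(parts[1])
    url = base64.urlsafe_b64decode(
        next(p for p in parts[2:] if p.startswith("B"))[1:]).decode()
    offset = int(next(p for p in parts[2:] if p.startswith("C"))[1:])

    parsed = urlparse(url)  # s3://bucket/key -> bucket, key
    bucket, key = parsed.netloc, parsed.path.lstrip("/")

    s3 = boto3.client("s3")
    # HTTP Range is inclusive, hence offset + size - 1.
    resp = s3.get_object(Bucket=bucket, Key=key,
                         Range=f"bytes={offset}-{offset + size - 1}")
    return resp["Body"].read()
</code></pre>
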
Keepstore would need to be able to read the buckets.  This could be done with a blanket policy (allow Keepstore/compute nodes to read specific buckets) and/or by adding a feature to store AWS credentials in Arvados such that Keepstore, having the user's API token, is able to fetch and use them (for example, on the API token record).

For S3 specifically, if we include @?versionId=@ on all URLs, the blocks can be assumed to be immutable.

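For example (with a placeholder version ID), the @+B@ hint could encode the versioned URL, and whatever services the read passes the version on to S3 so the bytes behind the locator can never change.  boto3's @get_object@ accepts a @VersionId@ argument:

<pre><code class="python">
import base64
from urllib.parse import urlparse, parse_qs

import boto3

# Pin the pseudo-block to a specific object version (placeholder version ID).
versioned_url = "s3://bucket/key?versionId=EXAMPLEVERSION"
b_hint = "B" + base64.urlsafe_b64encode(versioned_url.encode()).decode()

# When servicing a read, strip the query and pass the version to S3.
parsed = urlparse(versioned_url)
version_id = parse_qs(parsed.query)["versionId"][0]
s3 = boto3.client("s3")
resp = s3.get_object(Bucket=parsed.netloc, Key=parsed.path.lstrip("/"),
                     VersionId=version_id,
                     Range="bytes=0-1048575")  # arbitrary 1 MiB example range
</code></pre>
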
h2. Advantages

* This strategy is a lot like how we approach federation.
* If locators of this type are supported by Keepstore, then the Go and Python SDKs require relatively few changes (they continue to fetch blocks from Keepstore).
* Does not require downloading and indexing files.

h2. Disadvantages

* Can't verify file contents.