

Revision 1 (Peter Amstutz, 05/28/2024 08:33 PM) → Revision 2/7 (Peter Amstutz, 05/28/2024 08:45 PM)

h1. Objects as pseudo-blocks in Keep 

 Idea for accessing external objects via Keep (specifically S3) 

The approach we've bounced around for a while has been to take an object, split it into 64 MiB blocks, and record each block's hash in a database along with a reference to the object and offset.

Here is a different approach. (Tom floated a version of this at one of our engineering meetings, but I don't think we fully explored it at the time.)

For an S3 object 1234 bytes long located at s3://bucket/key:

<pre>
ffffffffffffffffffffffffffffffff+512+B(base64 encode of s3://bucket/key)+C256
</pre>

The @ffff...@ indicates it is a special block (we could also use @0000...@, @0f0f0f...@, etc.). Another idea would be to use a hash of the size, @+B@ and @+C@ hints. Alternately, S3 also offers checksums of files, so we could use the MD5 of the full object.

* It is 512 bytes long.
* The hint @+B@ means data should be fetched from the s3:// URL, which is base64 encoded (this is necessary to match our locator syntax).
* The hint @+C@ means read from offset 256 bytes.
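Constructing such a locator can be sketched in Python. The function name is made up for illustration, and the use of URL-safe base64 is an assumption: an encoding without @/@ or @+@ characters seems necessary to match locator syntax.

```python
import base64

def make_pseudo_block_locator(s3_url, size, offset):
    """Build a pseudo-block locator for an external S3 object (sketch).

    ffffffffffffffffffffffffffffffff is the reserved marker discussed
    above; +B carries the encoded URL and +C the byte offset.
    """
    # URL-safe base64 avoids '/' and '+', which would clash with
    # locator syntax (an assumption about the encoding variant).
    encoded = base64.urlsafe_b64encode(s3_url.encode()).decode()
    return "ffffffffffffffffffffffffffffffff+%d+B%s+C%d" % (size, encoded, offset)

print(make_pseudo_block_locator("s3://bucket/key", 512, 256))
```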

Large files can be split, e.g.

<pre>
ffffffffffffffffffffffffffffffff+67108864+B(base64 encode of s3://bucket/key)+C0
ffffffffffffffffffffffffffffffff+67108864+B(base64 encode of s3://bucket/key)+C67108864
ffffffffffffffffffffffffffffffff+67108864+B(base64 encode of s3://bucket/key)+C134217728
</pre>

However, this repeats the @+B@ portion many times, so we could instead allow the manifest to describe oversized blocks:

<pre>
ffffffffffffffffffffffffffffffff+1000000000+B(base64 encode of s3://bucket/key)+C0
</pre>

Implementation-wise, an oversized block would be split into 64 MiB chunks at runtime when the manifest is loaded. The block cache would need to use the full locator (with @+B@ and @+C@).
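The runtime splitting described above could look roughly like this. This is a sketch under assumptions: the function name is invented, and the manifest loader that would call it is not shown.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MiB

def split_oversized_block(encoded_url, total_size, base_offset=0):
    """Expand one oversized pseudo-block into 64 MiB chunk locators,
    as a manifest loader might do at load time (sketch)."""
    chunks = []
    pos = 0
    while pos < total_size:
        size = min(BLOCK_SIZE, total_size - pos)
        chunks.append("ffffffffffffffffffffffffffffffff+%d+B%s+C%d"
                      % (size, encoded_url, base_offset + pos))
        pos += size
    return chunks

# A 10^9-byte object expands to 14 full 64 MiB blocks plus one
# 60,475,904-byte tail.
chunks = split_oversized_block("czM6Ly9idWNrZXQva2V5", 1000000000)
```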

Add support for locators of this type to Keepstore, which already has code to interact with S3 buckets. This avoids adding such code to the client.
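For Keepstore to support these locators it would first need to recognize and decompose them. A minimal parsing sketch follows; the regex and function name are assumptions for illustration, not existing Arvados code.

```python
import base64
import re

# Matches the special all-'f' placeholder hash, a size, a +B hint
# (base64, assumed URL-safe) and a +C offset hint.
PSEUDO_BLOCK_RE = re.compile(
    r"^f{32}\+(?P<size>\d+)\+B(?P<url>[A-Za-z0-9_=-]+)\+C(?P<offset>\d+)$")

def parse_pseudo_block_locator(locator):
    """Return (s3_url, size, offset) for a pseudo-block locator (sketch)."""
    m = PSEUDO_BLOCK_RE.match(locator)
    if m is None:
        raise ValueError("not a pseudo-block locator: %r" % locator)
    url = base64.urlsafe_b64decode(m.group("url")).decode()
    return url, int(m.group("size")), int(m.group("offset"))
```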

Keepstore would need to be able to read the buckets. This could be done with a blanket policy (allow Keepstore/compute nodes to read specific buckets) and/or by adding a feature to store AWS credentials in Arvados such that Keepstore, having the user's API token, is able to fetch and use them (for example, on the API token record).

 For S3 specifically, if we include @?versionId=@ on all URLs, the blocks can be assumed to be immutable.   

h2. Advantages

 * This strategy is a lot like how we approach federation. 
* If locators of this type are supported by Keepstore, then the Go and Python SDKs require relatively few changes (they continue to fetch blocks from Keepstore).
* Does not require downloading and indexing files.

h2. Disadvantages

* Can't verify file contents, since the placeholder hash is not a real checksum of the data.