Idea #7988

Updated by Peter Amstutz over 8 years ago

The S3 API does not support transactional deletes, so there is a race condition between checking a block's timestamp and deleting the block: a concurrent PUT can refresh the block's metadata after the timestamp check, but the delete still removes the block.
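
For illustration, here is a minimal sketch of the racy check-then-delete sequence using the AWS SDK for Go. The function name, parameters, and the ttl cutoff are assumptions for this example, not keepstore's actual code:

<pre><code class="go">
// Sketch of the racy check-then-delete sequence (illustrative only).
package sketch

import (
	"fmt"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

// trashBlock deletes a block if its last-modified time is older than ttl.
func trashBlock(svc *s3.S3, bucket, hash string, ttl time.Duration) error {
	// Step 1: check the block's timestamp.
	head, err := svc.HeadObject(&s3.HeadObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(hash),
	})
	if err != nil {
		return err
	}
	if time.Since(*head.LastModified) < ttl {
		return fmt.Errorf("%s: too new to trash", hash)
	}

	// RACE WINDOW: a concurrent PUT can refresh the block's metadata
	// here, but the delete below removes the block anyway.

	// Step 2: delete the block.
	_, err = svc.DeleteObject(&s3.DeleteObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(hash),
	})
	return err
}
</code></pre>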

While this could potentially be solved using AWS S3 object versioning, that feature is not available from other storage systems that provide S3-compatible APIs, such as Google Cloud Storage and Ceph.

 Proposed solution: 

# Designate a single keepstore server to handle trash lists for a given S3 bucket.
# On PUT, if the block is new, or the existing block is less than 2 weeks old, the request can be handled by any server.
# Otherwise, also do a PUT-copy of the block to "hash.copy", so the refreshed block survives a concurrent delete of "hash" (see the sketch after this list).
# For each block on the trash list:
## Issue the delete.
## Try to PUT-copy from "hash.copy" back to "hash" (ignore failure).
# When the trash list is empty (because we finished processing it, or an empty trash list was received from data manager), search for and delete all blocks matching the pattern "*.copy".
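
A rough sketch of how steps 2-5 might look against the S3 API, again using the AWS SDK for Go. The function names (onPut, processTrashList, cleanupCopies), the freshnessWindow constant, and the ".copy" suffix handling are assumptions made for this example, not the actual keepstore implementation; it is just one way to express the flow:

<pre><code class="go">
// Sketch of the proposed protocol; names and structure are illustrative only.
package s3trash

import (
	"strings"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

const freshnessWindow = 14 * 24 * time.Hour // "less than 2 weeks old"

// onPut handles a PUT that refreshes a block. If the existing block is
// older than the freshness window it may already be on a trash list, so
// we also make a server-side copy to "hash.copy" that the trash
// processor can restore from after a racing delete.
func onPut(svc *s3.S3, bucket, hash string) error {
	head, err := svc.HeadObject(&s3.HeadObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(hash),
	})
	if err != nil {
		return nil // assume the block is new: nothing extra to do
	}
	if time.Since(*head.LastModified) < freshnessWindow {
		return nil // recent block: any server can handle this PUT normally
	}
	_, err = svc.CopyObject(&s3.CopyObjectInput{
		Bucket:     aws.String(bucket),
		CopySource: aws.String(bucket + "/" + hash),
		Key:        aws.String(hash + ".copy"),
	})
	return err
}

// processTrashList runs only on the single keepstore designated for this
// bucket. For each block: delete it, then try to restore it from
// "hash.copy" (which only exists if a PUT refreshed it in the meantime).
func processTrashList(svc *s3.S3, bucket string, trash []string) error {
	for _, hash := range trash {
		if _, err := svc.DeleteObject(&s3.DeleteObjectInput{
			Bucket: aws.String(bucket),
			Key:    aws.String(hash),
		}); err != nil {
			return err
		}
		// Ignore failure: no "hash.copy" means no PUT raced with us.
		svc.CopyObject(&s3.CopyObjectInput{
			Bucket:     aws.String(bucket),
			CopySource: aws.String(bucket + "/" + hash + ".copy"),
			Key:        aws.String(hash),
		})
	}
	return cleanupCopies(svc, bucket)
}

// cleanupCopies deletes every "*.copy" object once the trash list is empty.
func cleanupCopies(svc *s3.S3, bucket string) error {
	return svc.ListObjectsV2Pages(&s3.ListObjectsV2Input{
		Bucket: aws.String(bucket),
	}, func(page *s3.ListObjectsV2Output, lastPage bool) bool {
		for _, obj := range page.Contents {
			if strings.HasSuffix(*obj.Key, ".copy") {
				// Best effort: errors here just leave a stray copy behind.
				svc.DeleteObject(&s3.DeleteObjectInput{
					Bucket: aws.String(bucket),
					Key:    obj.Key,
				})
			}
		}
		return true
	})
}
</code></pre>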
