Idea #6260

Updated by Tom Clegg over 8 years ago

Write one or more integration tests to verify that Data Manager's _existing_ delete functionality (i.e., deletes blocks that are unreferenced; does not do anything about overreplicated blocks) works as desired: 

 * Verify that blocks not referenced in a collection are deleted from keepstore  
 * Verify that all blocks referenced from collections, and all blocks newer than the block signature TTL, are never deleted from keepstore 

 Minimal test, covering a miniature version of normal operation: 
 * bring up api, keepstore, and keepproxy services (just like we do already in keepproxy_test.go) 
 * store some collections (with non-zero data) 
 * store some data blocks without referencing them in any collection 
 * back-date the block Mtimes so that keepstore will consider unreferenced data old enough to delete 
 * write some "transient" blocks as well, this time without back-dating their Mtimes 
 * get block index from all keepstores 
 * run data manager in "single run" mode 
 * wait for all keepstores to finish working their trash and pull lists (i.e., /status.json reports @status["PullQueue"]["Queued"]==0 && status["PullQueue"]["InProgress"]==0 && ...@) 
 * get block index from all keepstores, make sure nothing has been deleted except the back-dated unreferenced blocks 
 * make API calls to delete some of the collections 
 * reduce replication on some of the collections 
 * run data manager again in "single run" mode 
 * wait for all keepstores to finish working their trash and pull lists 
 * get block index from all keepstores, make sure: 
 ** all blocks appearing in non-deleted collections were not deleted 
 ** all non-recent blocks appearing only in deleted collections were deleted 
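
The "wait for all keepstores to finish" steps could be implemented as a polling loop over each keepstore's /status.json. The sketch below is one possible shape, not the existing test code. "PullQueue", "Queued", and "InProgress" come from the text above; "TrashQueue" is an assumed name for the trash list counters, and the fetch callback stands in for an HTTP GET so the example runs without a server.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// queueStatus holds the two counters named in the issue text.
type queueStatus struct {
	Queued     int
	InProgress int
}

// nodeStatus is a hypothetical subset of /status.json. "TrashQueue" is an
// assumed field name. Fields missing from the JSON decode as zero, so a
// real test should also check that both queues are present.
type nodeStatus struct {
	PullQueue  queueStatus
	TrashQueue queueStatus
}

// queuesIdle reports whether the status document shows no queued or
// in-progress work on either queue.
func queuesIdle(statusJSON []byte) (bool, error) {
	var st nodeStatus
	if err := json.Unmarshal(statusJSON, &st); err != nil {
		return false, err
	}
	return st.PullQueue.Queued == 0 && st.PullQueue.InProgress == 0 &&
		st.TrashQueue.Queued == 0 && st.TrashQueue.InProgress == 0, nil
}

// waitUntilIdle calls fetch up to attempts times, sleeping interval between
// tries, and returns nil as soon as the queues are idle.
func waitUntilIdle(fetch func() ([]byte, error), attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		body, err := fetch()
		if err != nil {
			return err
		}
		idle, err := queuesIdle(body)
		if err != nil {
			return err
		}
		if idle {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("keepstore queues still busy")
}

func main() {
	// Canned responses stand in for successive GETs of /status.json.
	responses := []string{
		`{"PullQueue":{"Queued":2,"InProgress":1},"TrashQueue":{"Queued":0,"InProgress":0}}`,
		`{"PullQueue":{"Queued":0,"InProgress":0},"TrashQueue":{"Queued":0,"InProgress":0}}`,
	}
	calls := 0
	fetch := func() ([]byte, error) {
		body := responses[calls]
		calls++
		return []byte(body), nil
	}
	fmt.Println(waitUntilIdle(fetch, 5, time.Millisecond), calls)
}
```

Bounding the number of attempts keeps a stuck keepstore from hanging the whole test run; the test fails with an error instead.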

 Along the way, the test suite must confirm that the test data includes: 
 * some blocks that appear in at least one "non-deleted" and at least one "deleted" collection 
 * some blocks that appear in at least one "deleted" collection and no "non-deleted" collections, and are recent (i.e., written in the "transient" step, and therefore are not garbage) 
 * some blocks that never appeared in any collections, but are recent 
 * some blocks that never appeared in any collections, and are not recent 
 * of course, some blocks that actually get garbage-collected (i.e., the "garbage" set must not be empty!) 
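
One way to confirm this coverage is to record, for every block the test writes, which kinds of collection reference it and whether it is recent, then check that each required category is populated. The sketch below assumes a hypothetical blockInfo record kept by the test itself; none of these names come from the existing code.

```go
package main

import "fmt"

// blockInfo records what the test knows about one block it wrote.
// The type and its field names are hypothetical.
type blockInfo struct {
	InKept    bool // appears in at least one non-deleted collection
	InDeleted bool // appears in at least one deleted collection
	Recent    bool // written in the "transient" step (Mtime not back-dated)
}

// category maps a block to one of the cases the test data should cover.
func category(b blockInfo) string {
	switch {
	case b.InKept && b.InDeleted:
		return "kept+deleted"
	case b.InKept:
		return "kept only"
	case b.InDeleted && b.Recent:
		return "deleted only, recent"
	case b.InDeleted:
		return "deleted only, old (garbage)"
	case b.Recent:
		return "unreferenced, recent"
	default:
		return "unreferenced, old (garbage)"
	}
}

// missingCategories lists the required categories that no block falls into.
func missingCategories(blocks map[string]blockInfo) []string {
	required := []string{
		"kept+deleted",
		"deleted only, recent",
		"unreferenced, recent",
		"unreferenced, old (garbage)",
	}
	seen := map[string]bool{}
	for _, b := range blocks {
		seen[category(b)] = true
	}
	var missing []string
	for _, name := range required {
		if !seen[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	blocks := map[string]blockInfo{
		"block-a": {InKept: true, InDeleted: true},
		"block-b": {InDeleted: true, Recent: true},
		"block-c": {Recent: true},
	}
	// This data set has no old unreferenced block, so one category is missing.
	fmt.Println(missingCategories(blocks))
}
```

A test would fail early if missingCategories returns anything, which also guarantees the "garbage" set is not empty before Data Manager runs.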

 _Ideally,_ the assessment of whether a block has been "deleted" should compare desired to actual replication level -- this way, the test won't start failing when data manager starts deleting some copies of over-replicated non-garbage blocks. But if this part threatens to be non-trivial, we should defer: better to get the issue at hand done, and adjust when needed. 
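
As a rough sketch of that replication-aware check: count how many keepstore indexes list each block, and treat a block as lost only when the actual count falls below the desired level. The function names and the shape of the index data are assumptions, and the sketch assumes each index lists a locator at most once.

```go
package main

import (
	"fmt"
	"sort"
)

// countReplicas tallies, per locator, how many keepstore indexes list it.
// Assumes each index contains a given locator at most once.
func countReplicas(indexes [][]string) map[string]int {
	counts := map[string]int{}
	for _, index := range indexes {
		for _, locator := range index {
			counts[locator]++
		}
	}
	return counts
}

// lostBlocks returns, sorted, the locators with fewer actual copies than
// desired. Extra copies and copies of garbage (desired 0) are not reported.
func lostBlocks(desired, actual map[string]int) []string {
	var lost []string
	for locator, want := range desired {
		if actual[locator] < want {
			lost = append(lost, locator)
		}
	}
	sort.Strings(lost)
	return lost
}

func main() {
	// Hypothetical block indexes from three keepstores after a run.
	indexes := [][]string{
		{"block-a", "block-b"},
		{"block-a"},
		{"block-c"},
	}
	desired := map[string]int{"block-a": 2, "block-b": 1, "block-c": 2}
	fmt.Println(lostBlocks(desired, countReplicas(indexes)))
}
```

With this comparison, trimming an over-replicated block down to its desired level would not register as a deletion, so the test would keep passing if Data Manager later starts doing that.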

 When this is done and we are satisfied it's effective, keepstore should no longer need to force @never_delete=true@. See #6221 
