Idea #22458
open
Ability to intentionally turn a collection a "ghost" collection
Description
For provenance, I would like to keep collection records around.
However, in some cases I don't want to store the intermediate data. For example, I might have processing steps where the output is just as large or larger than the input data.
I propose being able to set replication_desired to zero to indicate that the underlying blocks can be trashed by keep-balance without being reported as "missing" blocks. Once set to zero, replication_desired cannot be increased. I call these "ghost collections".
Fetching a ghost collection returns an unsigned manifest.
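To illustrate what "unsigned manifest" could mean in practice, here is a minimal sketch that strips permission signatures from block locators. It assumes the signature hint has the form +A<hex signature>@<hex expiry> appended to a locator; strip_signatures is a hypothetical helper, not an existing SDK or API server function.

```python
import re

# Assumption: a signed locator looks like <md5>+<size>+A<hex signature>@<hex expiry>.
_SIGNATURE_HINT = re.compile(r"\+A[0-9a-f]+@[0-9a-f]+")
# Only tokens that start like a block locator are touched, so stream names
# and file tokens (e.g. "0:3:foo.txt") pass through unchanged.
_LOCATOR_PREFIX = re.compile(r"^[0-9a-f]{32}\+\d+")


def strip_signatures(manifest_text):
    """Return manifest_text with signature hints removed from block locators."""
    out_lines = []
    for line in manifest_text.split("\n"):
        tokens = [
            _SIGNATURE_HINT.sub("", tok) if _LOCATOR_PREFIX.match(tok) else tok
            for tok in line.split(" ")
        ]
        out_lines.append(" ".join(tokens))
    return "\n".join(out_lines)
```

The file listing stays intact, so the record still documents what the collection contained even though the blocks can no longer be fetched.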
Ghost collection records should behave similarly to frozen projects: read-only, except for being moved between projects (it might be ok to edit metadata such as name and properties as well).
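The update rules described above could be sketched roughly as follows. This is only an illustration of the proposed behavior using plain dicts: validate_update, MOVABLE_FIELDS and OPTIONAL_METADATA are hypothetical names, and whether name and properties stay editable is still an open question in this proposal.

```python
# Hypothetical sketch of the proposed rules; nothing here exists in Arvados today.
MOVABLE_FIELDS = {"owner_uuid"}              # moving between projects stays allowed
OPTIONAL_METADATA = {"name", "properties"}   # "might be ok" to edit, per the proposal


def is_ghost(record):
    """A collection counts as a ghost once replication_desired is zero."""
    return record.get("replication_desired") == 0


def validate_update(current, changes, allow_metadata=True):
    """Return a list of error strings; an empty list means the update is allowed."""
    errors = []
    if not is_ghost(current):
        return errors  # ordinary collections are not restricted by these rules
    allowed = MOVABLE_FIELDS | (OPTIONAL_METADATA if allow_metadata else set())
    for field, value in changes.items():
        if current.get(field) == value:
            continue  # no-op change
        if field == "replication_desired":
            errors.append("replication_desired cannot be increased once set to zero")
        elif field not in allowed:
            errors.append(f"{field} is read-only on a ghost collection")
    return errors
```

For example, validate_update(ghost, {"owner_uuid": new_project}) returns no errors, while an attempt to change manifest_text or to raise replication_desired is rejected.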
Similar to trash_at / delete_at, it would also be nice to have a ghost_at field, and a corresponding output_ghost_ttl on container requests that lets you specify that a collection should be ghosted at some point in the future. This would be helpful for keeping intermediate results around for a little while, but not forever.
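A rough sketch of how ghost_at might be derived from output_ghost_ttl, and how a periodic sweep could decide that a collection is due. It assumes output_ghost_ttl is given in seconds and that zero or unset means "never ghost"; both are assumptions for illustration, and compute_ghost_at and is_due_for_ghosting are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def compute_ghost_at(finished_at, output_ghost_ttl):
    """Return the time at which the output should be ghosted, or None.

    Assumption: output_ghost_ttl is in seconds, and 0/None disables ghosting.
    """
    if not output_ghost_ttl or output_ghost_ttl <= 0:
        return None
    return finished_at + timedelta(seconds=output_ghost_ttl)


def is_due_for_ghosting(collection, now):
    """True if the collection has a ghost_at time that has already passed."""
    ghost_at = collection.get("ghost_at")
    return ghost_at is not None and ghost_at <= now


# Example: keep an intermediate output for one day after the container finishes.
finished = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
ghost_at = compute_ghost_at(finished, 24 * 60 * 60)
```

Once a collection is due, the sweep would set replication_desired to zero, at which point keep-balance is free to trash the blocks.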
Clients such as Workbench, keep-web, the Python SDK, etc. should be made aware of ghost collections, so that they return a sensible error if the user tries to read a file, instead of a scary "failed to read block" error.
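On the client side, the check could be as simple as looking at the collection record before attempting any block reads. The sketch below is not based on the real SDK's classes: GhostCollectionError, open_file and the read_blocks callback are hypothetical stand-ins for whatever each client uses internally.

```python
class GhostCollectionError(Exception):
    """Raised when file content is requested from a ghost collection."""


def open_file(collection, path, read_blocks):
    """Read `path` from `collection`, failing early with a clear message for ghosts.

    `read_blocks` is a hypothetical callable standing in for the client's
    normal block-fetching code path.
    """
    if collection.get("replication_desired") == 0:
        raise GhostCollectionError(
            f"Cannot read {path!r}: collection {collection.get('uuid')} is a ghost "
            "collection. Its record is kept for provenance, but its data blocks "
            "are no longer stored."
        )
    return read_blocks(collection, path)
```

The point is that the user sees an explanation of why the data is gone, rather than a low-level block read failure.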
Updated by Peter Amstutz 4 days ago
- Description updated (diff)
- Subject changed from Ability to intentional make a collection a "ghost" collection to Ability to intentionally turn a collection a "ghost" collection
Updated by Peter Amstutz 4 days ago
- Related to Idea #22459: Manual "empty trash" command added