Idea #22458
open
Ability to intentionally turn a collection into a "ghost" collection
Description
For provenance, I would like to keep collection records around.
However, in some cases I don't want to store the intermediate data. For example, I might have processing steps where the output is just as large or larger than the input data.
Propose being able to set replication_desired to zero to indicate that the underlying blocks can be trashed by keep-balance, without them being reported as "missing" blocks. Once set to zero, replication_desired cannot be increased. I call these "ghost collections".
(Other names that just came to me are "dehydrated" or "freeze-dried" collections.)
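The one-way rule for replication_desired could be sketched roughly as follows. This is only an illustration of the proposed behavior, not existing API server code; the function name validate_replication_update and the exception GhostCollectionError are hypothetical.

```python
class GhostCollectionError(ValueError):
    """Hypothetical error: an update tried to un-ghost a collection."""


def validate_replication_update(current: int, requested: int) -> int:
    """Return the new replication_desired value, or raise if the change is not allowed.

    Sketch of the proposed rule: any collection may be set to zero,
    but once it is zero the value can never be increased again.
    """
    if requested < 0:
        raise ValueError("replication_desired must not be negative")
    if current == 0 and requested != 0:
        raise GhostCollectionError(
            "replication_desired cannot be increased once it has been set to zero"
        )
    return requested
```

Under this sketch, going from 2 to 0 is accepted, while going from 0 to 1 is rejected.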
Fetching a ghost collection returns an unsigned manifest.
Ghost collection records should behave similarly to frozen projects: read-only, except for being moved between projects (it might be ok to edit metadata such as name and properties as well).
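A minimal sketch of the read-only rule, assuming that "moving between projects" means changing owner_uuid and that editing name and properties is an optional extra. The helper rejected_fields and the allow_metadata switch are hypothetical, made up for illustration.

```python
from typing import Iterable, Set

# Assumption: moving a collection between projects changes only owner_uuid.
MOVE_FIELDS = frozenset({"owner_uuid"})
# Assumption: the "maybe ok" metadata edits are limited to these fields.
METADATA_FIELDS = frozenset({"name", "properties"})


def rejected_fields(
    changed: Iterable[str], is_ghost: bool, allow_metadata: bool = False
) -> Set[str]:
    """Return the changed fields that a ghost collection would refuse to update."""
    if not is_ghost:
        # Non-ghost collections are not restricted by this rule.
        return set()
    allowed = set(MOVE_FIELDS)
    if allow_metadata:
        allowed |= METADATA_FIELDS
    return set(changed) - allowed
```

An update is acceptable when the returned set is empty; otherwise the server could report the offending fields in its error.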
Similar to trash_at / delete_at, it would also be nice to have a ghost_at field, and a corresponding output_ghost_ttl on container requests that lets you specify that a collection should be ghosted at some point in the future -- helpful to keep intermediate results around for a little while, but not forever.
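How ghost_at might be derived from output_ghost_ttl, as a rough sketch. It assumes the TTL is given in seconds and that a value of zero means "never ghost"; neither is specified above. Both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def compute_ghost_at(finished_at: datetime, output_ghost_ttl: int) -> Optional[datetime]:
    """Return the time at which the output collection should be ghosted.

    Assumptions: output_ghost_ttl is in seconds, and 0 (or less) disables ghosting.
    """
    if output_ghost_ttl <= 0:
        return None
    return finished_at + timedelta(seconds=output_ghost_ttl)


def is_due_for_ghosting(ghost_at: Optional[datetime], now: datetime) -> bool:
    """True once the scheduled ghost_at time has been reached."""
    return ghost_at is not None and now >= ghost_at


# Example: a container that finished at midnight UTC with a one-hour TTL.
finished = datetime(2025, 5, 14, tzinfo=timezone.utc)
ghost_at = compute_ghost_at(finished, 3600)
```

A periodic job could then compare ghost_at against the current time, much as trash_at is handled today.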
Clients such as Workbench, keep-web, the Python SDK, etc. should be made aware of ghost collections, so that they return a sensible error if the user tries to read a file, instead of a scary "failed to read block" error.
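On the client side, the check could look something like the sketch below. It treats the collection record as a plain dict and assumes a ghost is recognizable by replication_desired being exactly zero; GhostCollectionReadError and ensure_readable are hypothetical names, not part of any existing SDK.

```python
class GhostCollectionReadError(Exception):
    """Hypothetical error raised instead of a low-level block read failure."""


def ensure_readable(collection: dict) -> None:
    """Raise a clear error before attempting to read file data from a ghost collection.

    Assumption: a ghost collection is one whose replication_desired is 0.
    A missing or null value is treated as a normal collection.
    """
    if collection.get("replication_desired") == 0:
        raise GhostCollectionReadError(
            f"Collection {collection.get('uuid', '(unknown)')} is a ghost collection: "
            "its record is kept for provenance, but its data blocks are no longer stored."
        )
```

A client would call this before opening any file in the collection, so the user sees the explanation rather than a block read failure.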
If the ghost collection exists on another cluster readable by the user, it should be possible to automatically fetch the blocks via federation, or rematerialize/rehydrate the collection by downloading all the blocks from somewhere else and re-writing the manifest with current block signatures as proof the collection is readable again.
Updated by Peter Amstutz 3 months ago
- Description updated (diff)
- Subject changed from Ability to intentional make a collection a "ghost" collection to Ability to intentionally turn a collection a "ghost" collection
Updated by Peter Amstutz 3 months ago
- Related to Idea #22459: Manual "empty trash" command added
Updated by Peter Amstutz 3 days ago
- Target version changed from Future to Development 2025-05-14
Updated by Peter Amstutz 3 days ago
- Target version changed from Development 2025-05-14 to Development 2025-05-28
Updated by Peter Amstutz 3 days ago
- Target version changed from Development 2025-05-28 to Development 2025-05-14