Idea #19592
Assigning a portable data hash to a project tree & project export/import
Status:
New
Priority:
Normal
Assigned To:
-
Target version:
-
Start date:
01/01/2025
Due date:
06/30/2025 (Due in about 6 months)
Story points:
-
Release:
Release relationship:
Auto
Description
Vision:
Arvados projects as "packages": a bundle of data and code that can be assigned a version and copied to other Arvados instances. Users can record that they used a specific X.Y.Z version of a package, which is also identified by an immutable hash.
A history of package versions is kept, and it must be possible to reference or go back to earlier versions, as well as determine what changed between two versions.
Initial design thoughts:
- Compute a hash for the entire contents of a project, including collections, subprojects, workflows, container requests, and containers
- Could be built on per-record data hashes that cover most of each record's contents, including metadata such as creation and last-modified times.
- Maintain a history of project versions
- Copy a project to another cluster and compute a content hash that confirms that the content is the same
- Determine what changed between two versions of a project
- Apply a set of changes that were made to one copy of a project, to another copy
- Export the project to a file system hierarchy, and re-import the project later
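A rough sketch of how the hashing and diffing ideas above might fit together, for discussion only. It models a project as plain Python dicts rather than using the Arvados API; the `metadata`, `records`, and `subprojects` keys, the function names, and the choice to exclude `uuid` from the hash (so that copies on different clusters can match) are all assumptions, not an existing design.

```python
import hashlib
import json

# Assumption: fields that are expected to differ between clusters are left out.
IGNORED_FIELDS = ("uuid",)


def record_hash(record: dict) -> str:
    """Hash a single record from a canonical JSON form of its contents."""
    stable = {k: v for k, v in record.items() if k not in IGNORED_FIELDS}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def project_hash(project: dict) -> str:
    """Merkle-style hash: the project's own metadata plus its children's hashes."""
    child_hashes = sorted(
        [record_hash(r) for r in project.get("records", [])]
        + [project_hash(p) for p in project.get("subprojects", [])]
    )  # sorted, so the order of siblings does not affect the result
    payload = json.dumps(
        {"own": record_hash(project.get("metadata", {})), "children": child_hashes},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def flatten(project: dict, prefix: str = "") -> dict:
    """Map each record's path within the project tree to its record hash."""
    paths = {}
    for r in project.get("records", []):
        paths[prefix + r["name"]] = record_hash(r)
    for p in project.get("subprojects", []):
        paths.update(flatten(p, prefix + p["metadata"]["name"] + "/"))
    return paths


def diff_projects(old: dict, new: dict) -> dict:
    """Report which record paths were added, removed, or changed."""
    a, b = flatten(old), flatten(new)
    return {
        "added": sorted(b.keys() - a.keys()),
        "removed": sorted(a.keys() - b.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


# Hypothetical example data: one project containing a single collection record.
v1 = {
    "metadata": {"name": "pkg", "uuid": "cluster-a-project"},
    "records": [
        {"name": "reads", "uuid": "cluster-a-coll", "kind": "collection",
         "modified_at": "2025-01-01T00:00:00Z"},
    ],
    "subprojects": [],
}
```

Because every parent hash depends on its children's hashes, a change anywhere in the tree changes the top-level hash, and two copies can be compared by hash alone. Whether timestamps belong in the hash is an open question: including them, as here, means a copy must preserve them exactly to hash the same.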