Idea #2752
Updated by Tom Clegg over 10 years ago
Two main approaches are available. (Eventually we want both. For now, we want whichever is most accessible/efficient from a development perspective.)

HEAD-before-PUT mode

* Simple to implement reliably
* Depends on HEAD actually working (might not be true yet when proxy use case is otherwise ready to go)
* Does not depend on local filesystem features like inode/ctime
* Does not depend on arv-put running in the same user account (or even host) each time
* Still requires re-reading the local data
* Interacts interestingly with Keep permission mechanism (needs some combination of storing partial manifests and caching permission signatures locally)

Local checkpoint

* Save state (in @$HOME/.cache@?) while running.
** Separate cache per arvados_api_host (don't get confused when uploading to two different sites)
** Be attentive to race conditions (e.g., refuse to run two resumable transfers that would use the same cached data)
** List of files in order written
** For each file (at minimum) store name, inode, ctime, size
** List of blobs successfully written to Keep (including size)
* When resuming, skip what's already done (unless @--no-resume@ is given).
** If blob locators have permission signatures, check their expiry times before deciding to re-use. Warn the user (once per arv-put) if blobs are being re-uploaded for this reason.
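
A rough sketch of what the local checkpoint could look like, to make the bullet points above concrete. This is not arv-put's actual code: @cache_path@, @file_fingerprint@, @signature_expired@ and @Checkpoint@ are hypothetical names, the JSON layout is made up for illustration, and the signature hint is assumed to have the form @+A<hex signature>@<hex Unix expiry>@. Locking uses @flock@, so it assumes a POSIX host.

```python
import fcntl
import hashlib
import json
import os
import re
import time


def cache_path(api_host, paths, cache_dir):
    """Hypothetical naming scheme: one cache file per (API host, input paths)."""
    key = "\0".join([api_host] + sorted(os.path.realpath(p) for p in paths))
    return os.path.join(cache_dir, hashlib.sha256(key.encode()).hexdigest())


def file_fingerprint(path):
    """Minimum per-file record from the list above: name, inode, ctime, size."""
    st = os.stat(path)
    return {"name": path, "inode": st.st_ino,
            "ctime": st.st_ctime_ns, "size": st.st_size}


def signature_expired(locator, now=None):
    """True if the locator carries a permission hint whose expiry has passed.

    Assumes the hint looks like +A<hex signature>@<hex Unix timestamp>.
    Unsigned locators never expire.
    """
    m = re.search(r"\+A[0-9a-f]+@([0-9a-f]+)", locator)
    if m is None:
        return False
    return int(m.group(1), 16) <= (time.time() if now is None else now)


class Checkpoint:
    """Resumable-upload state, guarded by an exclusive advisory lock."""

    def __init__(self, path):
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        self._fh = os.fdopen(fd, "r+")
        try:
            # Refuse to run two transfers against the same cached data.
            fcntl.flock(self._fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            self._fh.close()
            raise RuntimeError("checkpoint %s is in use by another process" % path)
        raw = self._fh.read()
        self.state = json.loads(raw) if raw else {"files": [], "blobs": []}

    def save(self):
        # Rewrite the whole state file in place.
        self._fh.seek(0)
        self._fh.truncate()
        json.dump(self.state, self._fh)
        self._fh.flush()

    def record_file(self, path):
        # Files are appended in the order they are written.
        self.state["files"].append(file_fingerprint(path))
        self.save()

    def record_blob(self, locator, size):
        self.state["blobs"].append({"locator": locator, "size": size})
        self.save()

    def file_unchanged(self, path):
        """True if the file still matches a fingerprint saved earlier."""
        return file_fingerprint(path) in self.state["files"]

    def reusable_blobs(self, now=None):
        """Blobs that can be skipped on resume (signature absent or still valid)."""
        return [b for b in self.state["blobs"]
                if not signature_expired(b["locator"], now)]

    def close(self):
        self._fh.close()  # closing the file also releases the lock
```

On resume, anything in @state["blobs"]@ but not in @reusable_blobs()@ would have to be uploaded again, which is the case where the user should be warned once. Rewriting the state file in place is not crash-safe; a real implementation would probably write to a temporary file and rename it.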