Idea #2752
Updated by Tom Clegg over 10 years ago
Two main approaches:
HEAD-before-PUT mode
* Simple to implement reliably
* Depends on HEAD requests actually working (this might not be true yet by the time the proxy use case is otherwise ready to go)
* Does not depend on local filesystem features like inode/ctime
* Does not depend on arv-put running in the same user account (or even host) each time
* Still requires re-reading the local data (each block's hash must be recomputed before it can be looked up with HEAD)
Local checkpoint
* Save state (in @$HOME/.cache@?) while running.
** Separate cache per arvados_api_host (don't get confused when uploading to two different sites)
** Be attentive to race conditions (e.g., refuse to run two resumable transfers that would use the same cached data)
** List of files, in the order they were written
** For each file (at minimum) store name, inode, ctime, size
** List of blobs successfully written to Keep (including size)
* When resuming, skip what's already done (unless @--no-resume@ is given).
** If blob locators have permission signatures, check their expiry times before deciding to re-use. Warn the user (once per arv-put) if blobs are being re-uploaded for this reason.
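
The local checkpoint approach could be sketched roughly as follows. This is an illustration under assumptions, not a settled design. All function names, the JSON layout, and the lock-file scheme are hypothetical. The signature check assumes a permission hint of the form @+A<hex signature>@<8 hex digits of Unix expiry time>@ inside the locator.

```python
import hashlib
import json
import os
import re
import time


def cache_path(api_host, paths, cache_dir):
    """One checkpoint file per (API host, set of input paths)."""
    key = "\0".join([api_host] + sorted(os.path.realpath(p) for p in paths))
    name = hashlib.md5(key.encode()).hexdigest() + ".json"
    return os.path.join(cache_dir, name)


def fingerprint(path):
    """Name, inode, ctime and size: the minimum stored per file."""
    st = os.stat(path)
    return {"name": path, "inode": st.st_ino,
            "ctime": st.st_ctime, "size": st.st_size}


def acquire_lock(state_path):
    """Raise FileExistsError if another transfer holds this checkpoint."""
    fd = os.open(state_path + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    os.close(fd)


def release_lock(state_path):
    os.unlink(state_path + ".lock")


def save_state(state_path, files, blobs):
    """Write to a temp file, then rename, so a crash leaves the old state."""
    tmp = state_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"files": files, "blobs": blobs}, f)
    os.replace(tmp, state_path)


def load_state(state_path):
    """Return saved state, or an empty state if none is usable."""
    try:
        with open(state_path) as f:
            return json.load(f)
    except (OSError, ValueError):
        return {"files": [], "blobs": []}


def file_unchanged(saved):
    """True if the file on disk still matches its saved fingerprint."""
    try:
        return fingerprint(saved["name"]) == saved
    except OSError:
        return False


# Assumed hint format: +A<hex signature>@<8 hex digits of expiry time>
SIGNATURE_RE = re.compile(r"\+A[0-9a-f]+@([0-9a-f]{8})")


def signature_expired(locator, now=None):
    """True if the locator carries a signature that has already expired."""
    m = SIGNATURE_RE.search(locator)
    if m is None:
        return False  # unsigned locator: nothing to expire
    if now is None:
        now = time.time()
    return int(m.group(1), 16) <= now
```

On resume, a blob would be re-used only if the file it came from passes @file_unchanged@ and its locator fails @signature_expired@. A lock file left behind by a crashed run would block later runs, so a real implementation would need some way to detect and clear stale locks.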