Idea #2752

closed

arv-put can quickly resume an interrupted transfer.

Added by Tom Clegg almost 10 years ago. Updated almost 10 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category:
-
Story points: 2.0

Description

Two main approaches are available. (Eventually we want both. For now, we want whichever is most accessible/efficient from a development perspective.)

HEAD-before-PUT mode
  • Simple to implement reliably
  • Depends on HEAD actually working (might not be true yet when proxy use case is otherwise ready to go)
  • Does not depend on local filesystem features like inode/ctime
  • Does not depend on arv-put running in the same user account (or even host) each time
  • Still requires re-reading the local data
  • Interacts interestingly with Keep permission mechanism (needs some combination of storing partial manifests and caching permission signatures locally)
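
A minimal sketch of the HEAD-before-PUT loop, assuming Keep block locators of the form `<md5>+<size>` and a 64 MiB block size. The function names and the `block_exists` / `put_block` callables are hypothetical stand-ins for whatever the SDK exposes for HEAD and PUT requests; they are not existing arv-put APIs.

```python
import hashlib

KEEP_BLOCK_SIZE = 2 ** 26  # 64 MiB; assumed block size for this sketch


def block_locator(data):
    """Build an unsigned locator: md5 hex digest plus size in bytes."""
    return "%s+%d" % (hashlib.md5(data).hexdigest(), len(data))


def upload_with_head_check(stream, block_exists, put_block,
                           block_size=KEEP_BLOCK_SIZE):
    """Re-read local data block by block, skipping blocks Keep already has.

    block_exists(locator) -> bool   hypothetical wrapper around a HEAD request
    put_block(locator, data)        hypothetical wrapper around a PUT request
    Returns (locators in stream order, number of blocks skipped).
    """
    locators = []
    skipped = 0
    while True:
        data = stream.read(block_size)
        if not data:
            break
        locator = block_locator(data)
        if block_exists(locator):
            skipped += 1  # already stored, no need to send the data again
        else:
            put_block(locator, data)
        locators.append(locator)
    return locators, skipped
```

Note that the local data is still read and hashed in full, as listed above; only the network transfer is avoided. The sketch also ignores permission signatures: a HEAD response alone does not give the client a signed locator to put in the manifest, which is the interaction noted in the last bullet.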
Local checkpoint
  • Save state (in $HOME/.cache?) while running.
    • Separate cache per arvados_api_host (don't get confused when uploading to two different sites)
    • Be attentive to race conditions (e.g., refuse to run two resumable transfers that would use the same cached data)
    • List of files in order written
    • For each file (at minimum) store name, inode, ctime, size
    • List of blobs successfully written to Keep (including size)
  • When resuming, skip what's already done (unless --no-resume is given).
    • If blob locators have permission signatures, check their expiry times before deciding to re-use. Warn the user (once per arv-put) if blobs are being re-uploaded for this reason.
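
The local checkpoint approach could look roughly like the following. This is a sketch under assumptions, not the arv-put implementation: the JSON layout, the function names, and the cache file naming are all hypothetical, and signed locators are assumed to carry a `+A<hex signature>@<hex expiry timestamp>` hint.

```python
import hashlib
import json
import os
import re
import tempfile
import time


def checkpoint_path(api_host, paths, cache_dir):
    """One cache file per (API host, input file list) combination."""
    key = "\0".join([api_host] + [os.path.realpath(p) for p in paths])
    name = hashlib.md5(key.encode()).hexdigest() + ".json"
    return os.path.join(cache_dir, name)


def file_state(path):
    """The minimum per-file state: name, inode, ctime, size."""
    st = os.stat(path)
    return {"name": path, "inode": st.st_ino,
            "ctime": st.st_ctime, "size": st.st_size}


def save_checkpoint(cache_file, paths, locators):
    """Write the state to a temp file, then rename it into place atomically."""
    state = {"files": [file_state(p) for p in paths], "blocks": locators}
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(cache_file))
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, cache_file)


def signature_expired(locator, now=None):
    """True if the locator has a permission hint whose expiry has passed."""
    m = re.search(r"\+A[0-9a-f]+@([0-9a-f]+)", locator)
    if m is None:
        return False  # unsigned locator: nothing to expire
    return int(m.group(1), 16) <= (time.time() if now is None else now)


def load_checkpoint(cache_file, paths, now=None):
    """Return the leading run of locators that can be reused, or []."""
    try:
        with open(cache_file) as f:
            state = json.load(f)
    except (OSError, ValueError):
        return []  # no usable checkpoint: start from scratch
    if state.get("files") != [file_state(p) for p in paths]:
        return []  # an input file changed since the checkpoint was written
    reusable = []
    for locator in state.get("blocks", []):
        if signature_expired(locator, now):
            break  # re-upload from here on (and warn the user once)
        reusable.append(locator)
    return reusable
```

The sketch leaves out two points from the list above: guarding against two concurrent transfers sharing one cache file (for example with an exclusive lock held for the duration of the run), and the --no-resume switch, which would simply skip `load_checkpoint`.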

Subtasks 1 (0 open, 1 closed)

Task #2864: Review 2752-arv-put-resume-wip (Resolved, Peter Amstutz, 05/26/2014)

Related issues

Related to Arvados - Feature #2751: Python SDK behaves appropriately when API server advertises a Keep proxy instead of individual Keep storage servers (Resolved, Peter Amstutz, 05/15/2014)