Feature #8993

closed

arv-put: options for 3 modes of "resumption"

Added by Peter Grandi about 8 years ago. Updated about 4 years ago.

Status: Closed
Priority: Normal
Assigned To: -
Category: -
Target version: -
Story points: -

Description

Because of https://dev.arvados.org/issues/8878 it seems that arv-put has 2 modes of operation when uploading a hash+block:

  • Without --no-resume, if the hash+block is listed in the resume list and its permission token has not expired, neither hash nor block is uploaded; they are presumed to be present in Keep and are simply added to the upload manifest.
  • Otherwise, both hash and block are uploaded and it is Keep's job to avoid unnecessarily duplicating them.
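The two current modes can be sketched as a single decision per block. This is only an illustration of the behaviour described above, not arv-put's actual code: the function name must_upload and the representation of the resume list as a dict mapping each hash to its permission token expiry time are assumptions.

```python
from typing import Dict


def must_upload(block_hash: str, no_resume: bool,
                resume_list: Dict[str, float], now: float) -> bool:
    """Return True if the hash+block should be sent to Keep.

    resume_list is a hypothetical stand-in for the local resume state:
    it maps a block hash to the expiry time of its permission token.
    """
    if no_resume:
        # Second case: always upload; Keep deduplicates on its side.
        return True
    expiry = resume_list.get(block_hash)
    if expiry is None:
        # Not in the resume list, so nothing is known about this block.
        return True
    # First case: a listed block with an unexpired token is skipped
    # and presumed present in Keep, without any verification.
    return expiry <= now
```

Note that the skip path never contacts Keep, which is the behaviour this request objects to.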

The problem with the first case is that the resume list is assumed to be a valid cache (until permission token expiry), while arguably it should instead be treated as a hint and verified before use.

The problem with the second case is that the whole block is uploaded, consuming resources, even if Keep then determines it is already present.

This request is for a 3rd case and a different default, for example in the following form, with 3 values for a new option --upload-again:

  • yes with the same meaning as current --no-resume, that is unconditionally upload all hashes and their blocks.
  • no with a similar (or even identical) meaning as current --resume, that is upload hashes and blocks only if they are not mentioned in whichever already-uploaded hash list is available.
  • check with a new meaning: send to all Keepstore daemons the list of hashes to be uploaded (possibly in subsets); the daemons return a list of those found present, each with an absolute expiry time; then upload all other hashes and blocks; at the end, upload the hashes and blocks in the returned list only if their lifetime has expired; and then write the manifest. Or some obvious variant.
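The three proposed values could be sketched roughly as follows. Everything here is hypothetical: the names UploadAgain and plan_uploads are made up for illustration, and query_present stands in for the proposed Keepstore query, which does not exist yet. As a simplification, expiry is compared against a single now value supplied by the caller, whereas the proposal would check it only after the other blocks have been uploaded.

```python
from enum import Enum
from typing import Callable, Dict, Iterable, List, Set


class UploadAgain(Enum):
    YES = "yes"      # upload everything unconditionally
    NO = "no"        # trust the local already-uploaded hash list
    CHECK = "check"  # ask Keep which hashes are present first


def plan_uploads(
    hashes: Iterable[str],
    mode: UploadAgain,
    resume_list: Set[str],
    query_present: Callable[[List[str]], Dict[str, float]],
    now: float,
) -> List[str]:
    """Return the hashes whose blocks should be uploaded, in order.

    query_present is a hypothetical stand-in for the proposed Keepstore
    query: it maps each hash found present to an absolute expiry time.
    """
    hashes = list(hashes)
    if mode is UploadAgain.YES:
        return hashes
    if mode is UploadAgain.NO:
        return [h for h in hashes if h not in resume_list]
    # CHECK: unknown hashes go first, then those whose lifetime expired.
    present = query_present(hashes)
    missing = [h for h in hashes if h not in present]
    expired = [h for h in hashes if h in present and present[h] <= now]
    return missing + expired
```

With check, the local resume list is not consulted at all; only Keep's answer decides what is skipped.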

The default would be --upload-again=yes for safety, with check recommended and no suggested only for "optimistic" cases.
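On the command line, the option and its default could be declared along these lines. This is a standalone argparse sketch, not a patch against arv-put's real argument parser; build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Standalone sketch: not the actual arv-put parser.
    parser = argparse.ArgumentParser(prog="arv-put")
    parser.add_argument(
        "--upload-again",
        choices=["yes", "no", "check"],
        default="yes",  # safe default, as proposed above
        help="yes: always upload; no: trust the local list; "
             "check: verify with Keep before skipping",
    )
    return parser
```

For example, build_parser().parse_args(["--upload-again=check"]).upload_again yields "check", and omitting the option yields "yes".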

The main flaw of the current option is that it holds, outside Keep, block state that is persistent and does not get verified, even though the permission expiry time is only an advisory block lifetime.


Related issues

Related to Arvados - Idea #8937: [SDKs] Write integration test for when arv-put resumes from a cache with expired access tokens (Resolved, Lucas Di Pentima, 04/13/2016)
Related to Arvados - Bug #8878: Keep: sudden appearance of "missing" blocks (Closed, 04/04/2016)