Feature #15457

open

[Controller] Delegate new container requests to other clusters based on location of input data

Added by Tom Clegg over 5 years ago. Updated 9 months ago.

Status:
New
Priority:
Normal
Assigned To:
-
Category:
API
Target version:
Story points:
3.0
Release:
Release relationship:
Auto

Description

When a client creates a new container request (and doesn't specify a desired cluster ID), controller should resolve all input collections to PDHs as needed, and then:
  • if all inputs are available locally, create a local container request (as the current implementation does in all cases)
  • otherwise, rank local/remote clusters according to how many of the input data bytes they have on hand, and execute a "create CR" request on the highest-ranking cluster -- being sure to specify the chosen cluster ID so the remote cluster doesn't have to repeat the ranking/selection process itself.
    • if the local cluster is tied with a remote, choose the local cluster
    • use the file_size_total collection attribute

At least for now, don't go to too much trouble to be precise -- if a mount only refers to a small file in a large collection, it's OK to rank by the entire collection size.
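The ranking rules above can be sketched roughly as follows. This is an illustrative Python sketch, not controller code: the function name `choose_cluster`, the `input_sizes` mapping (PDH to `file_size_total`), and the `holdings` mapping (cluster ID to the set of PDHs it has on hand) are all assumptions made for the example. Each collection counts at its full size, per the note about not being too precise.

```python
from typing import Dict, Set


def choose_cluster(local_id: str,
                   input_sizes: Dict[str, int],
                   holdings: Dict[str, Set[str]]) -> str:
    """Pick the cluster holding the most input bytes; ties go to local.

    input_sizes: PDH -> file_size_total of that collection (hypothetical shape)
    holdings:    cluster ID -> set of PDHs available there (hypothetical shape)
    """
    local_pdhs = holdings.get(local_id, set())

    # If every input is available locally, stay local without ranking.
    if all(pdh in local_pdhs for pdh in input_sizes):
        return local_id

    def bytes_on_hand(cluster_id: str) -> int:
        have = holdings.get(cluster_id, set())
        # Rank by whole-collection size, even if a mount uses one small file.
        return sum(size for pdh, size in input_sizes.items() if pdh in have)

    # The local cluster is always a candidate, even if it holds nothing.
    candidates = sorted(set(holdings) | {local_id})

    # Sort key: more bytes first; on equal bytes, the local cluster wins.
    return max(candidates, key=lambda c: (bytes_on_hand(c), c == local_id))
```

Sorting the candidates before calling `max` only makes ties between two remotes deterministic; the ticket does not say how such ties should be broken.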

If a remote cluster returns an error during the "probe for inputs" phase, drop that cluster from the list of candidates.
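One way to picture the probe phase, as a Python sketch under assumptions: `probe` is a hypothetical callable standing in for whatever request asks a remote which of the input PDHs it has, and it is assumed to raise an exception on failure. A remote that fails is left out of the candidate map, and its error is kept so it can be reported later.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple


def probe_candidates(
    remote_ids: Iterable[str],
    pdhs: List[str],
    probe: Callable[[str, List[str]], Iterable[str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, str]]:
    """Ask each remote which inputs it holds; drop remotes that error out.

    Returns (holdings, errors):
      holdings: cluster ID -> set of PDHs that cluster reported having
      errors:   cluster ID -> error message for remotes that were dropped
    """
    holdings: Dict[str, Set[str]] = {}
    errors: Dict[str, str] = {}
    for cluster_id in remote_ids:
        try:
            holdings[cluster_id] = set(probe(cluster_id, pdhs))
        except Exception as err:  # any probe failure disqualifies the remote
            errors[cluster_id] = str(err)
    return holdings, errors
```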

If a remote cluster returns an error or times out when submitting a container request, fall back to submitting to the local cluster (unless this fallback is disabled via a config knob). If this fallback is enabled, the remote call should time out after half of the remaining portion of the locally configured API request timeout (see Deadline()). If the local cluster fails (whether or not a remote has also been attempted), just return the error to the caller.
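The submit-and-fallback behavior could look something like the Python sketch below. The names are hypothetical: `create_cr(cluster_id, timeout)` stands in for the "create CR" call and is assumed to raise on error or timeout, `deadline` is an absolute time on the same clock as `clock()`, and `fallback_enabled` stands in for the config knob. Giving the remote call the full remaining time when fallback is disabled is an assumption; the description only specifies the halved timeout for the enabled case.

```python
import time
from typing import Any, Callable


def submit_with_fallback(
    chosen: str,
    local_id: str,
    create_cr: Callable[[str, float], Any],
    deadline: float,
    fallback_enabled: bool = True,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Submit to the chosen cluster, falling back to local on remote failure."""

    def remaining() -> float:
        return max(0.0, deadline - clock())

    if chosen == local_id:
        # Local errors are returned to the caller unchanged.
        return create_cr(local_id, remaining())

    # Reserve half of the remaining time so a local retry is still possible.
    remote_timeout = remaining() / 2 if fallback_enabled else remaining()
    try:
        return create_cr(chosen, remote_timeout)
    except Exception:
        if not fallback_enabled:
            raise
        # Remote failed or timed out: try locally with whatever time is left.
        # If this also fails, the local error propagates to the caller.
        return create_cr(local_id, remaining())
```

For example, with 10 seconds left and fallback enabled, the remote attempt gets 5 seconds, and a local retry gets whatever remains after the remote attempt ends.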

Add an entry to the CR's properties hash indicating how the cluster was chosen, including any errors encountered when probing or submitting to remotes.
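As a rough illustration of what such a properties entry might contain, here is a Python sketch. The key name `cluster_selection` and every field inside it are hypothetical; the description does not define a schema, only that the entry should say how the cluster was chosen and include any probe or submit errors.

```python
from typing import Any, Dict


def selection_record(chosen: str,
                     reason: str,
                     probe_errors: Dict[str, str],
                     submit_errors: Dict[str, str]) -> Dict[str, Any]:
    """Build a properties entry describing how the cluster was chosen.

    All key names here are placeholders, not an agreed-upon schema.
    """
    return {
        "cluster_selection": {
            "chosen_cluster": chosen,
            "reason": reason,  # e.g. "all_inputs_local", "most_input_bytes"
            "probe_errors": dict(probe_errors),    # cluster ID -> message
            "submit_errors": dict(submit_errors),  # cluster ID -> message
        }
    }


# Merging the record into a container request's existing properties:
properties = {"user_label": "example"}
properties.update(
    selection_record(
        chosen="zzzzz",
        reason="fallback_after_remote_error",
        probe_errors={"xxxxx": "unreachable"},
        submit_errors={"yyyyy": "remote timed out"},
    )
)
```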


Related issues

Related to Arvados - Bug #14710: [Workbench] Child containers run on federated clusters do not show up (New)
#1

Updated by Tom Clegg over 5 years ago

  • Related to Bug #14710: [Workbench] Child containers run on federated clusters do not show up added
#2

Updated by Tom Clegg over 5 years ago

  • Description updated (diff)
#3

Updated by Tom Clegg over 5 years ago

  • Description updated (diff)
#4

Updated by Tom Clegg over 5 years ago

  • Description updated (diff)
#5

Updated by Tom Clegg over 5 years ago

  • Description updated (diff)
#6

Updated by Tom Morris over 5 years ago

  • Target version changed from To Be Groomed to Arvados Future Sprints
  • Story points set to 3.0
#7

Updated by Peter Amstutz over 3 years ago

  • Target version deleted (Arvados Future Sprints)
#8

Updated by Peter Amstutz almost 2 years ago

  • Release set to 60
#9

Updated by Peter Amstutz 9 months ago

  • Target version set to Future