Feature #15370

[arvados-dispatch-cloud] loopback driver

Added by Tom Clegg almost 3 years ago. Updated 31 minutes ago.

Status:
In Progress
Priority:
Normal
Assigned To:
Category:
Crunch
Target version:
Start date:
05/17/2022
Due date:
% Done:
50%

Estimated time:
(Total: 0.00 h)
Story points:
-

Description

The loopback driver implements cloud.Driver by presenting a fake cloud in which
  • Create() succeeds once, but fails with a quota error if the caller tries to create multiple instances
  • Instances() returns the instance that was created, if any
  • Destroy() makes the instance disappear from the next Instances() result
  • Instance address points to an SSH server (brought up by the driver) that accepts the dispatcher's key and executes shell commands
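
The behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the actual driver: the types `loopbackSet` and `instance`, the error value `errQuota`, the instance ID, and the address `127.0.0.1:2222` are all hypothetical. The real cloud.Driver interface takes more parameters (instance type, image, tags, public key), and the SSH server is omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errQuota is a hypothetical stand-in for the quota error returned
// when the caller tries to create a second instance.
var errQuota = errors.New("loopback: quota exceeded: only one instance allowed")

// instance is a hypothetical, simplified stand-in for a cloud instance.
type instance struct {
	id      string
	address string // where the driver's SSH server would listen (assumed value)
	set     *loopbackSet
}

// loopbackSet models a fake cloud that holds at most one instance.
type loopbackSet struct {
	mtx     sync.Mutex
	current *instance
}

// Create succeeds only while no instance exists.
func (s *loopbackSet) Create() (*instance, error) {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	if s.current != nil {
		return nil, errQuota
	}
	s.current = &instance{id: "loopback-0", address: "127.0.0.1:2222", set: s}
	return s.current, nil
}

// Instances returns the instance that was created, if any.
func (s *loopbackSet) Instances() []*instance {
	s.mtx.Lock()
	defer s.mtx.Unlock()
	if s.current == nil {
		return nil
	}
	return []*instance{s.current}
}

// Destroy removes the instance from subsequent Instances() results.
func (i *instance) Destroy() {
	i.set.mtx.Lock()
	defer i.set.mtx.Unlock()
	if i.set.current == i {
		i.set.current = nil
	}
}

func main() {
	s := &loopbackSet{}
	inst, err := s.Create()
	fmt.Println(inst.id, err)

	_, err = s.Create() // second Create hits the one-instance quota
	fmt.Println(errors.Is(err, errQuota))

	inst.Destroy()
	fmt.Println(len(s.Instances()))
}
```

After Destroy(), a new Create() succeeds again in this sketch, which matches the "one instance at a time" reading of the description.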
Using the loopback driver will involve some special configuration.
  • If InstanceTypes is empty, it is automatically configured with a single instance type, with the host's RAM/CPU specs
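
One way the automatic instance type could be derived is sketched below. It assumes a Linux host where RAM is read from `/proc/meminfo`; the `instanceType` struct, the function names, and the type name `localhost` are hypothetical and are not taken from the Arvados config code.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// instanceType is a hypothetical, simplified stand-in for a configured instance type.
type instanceType struct {
	Name  string
	VCPUs int
	RAM   int64 // bytes
}

// parseMemTotal extracts the MemTotal value (reported in kB) from
// /proc/meminfo-style text and returns it in bytes.
func parseMemTotal(meminfo string) (int64, error) {
	sc := bufio.NewScanner(strings.NewReader(meminfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "MemTotal:" {
			kb, err := strconv.ParseInt(fields[1], 10, 64)
			if err != nil {
				return 0, err
			}
			return kb * 1024, nil
		}
	}
	return 0, fmt.Errorf("MemTotal not found")
}

// defaultInstanceTypes returns cfg unchanged if it is non-empty;
// otherwise it returns a single type sized to the host.
func defaultInstanceTypes(cfg map[string]instanceType, meminfo string) (map[string]instanceType, error) {
	if len(cfg) > 0 {
		return cfg, nil
	}
	ram, err := parseMemTotal(meminfo)
	if err != nil {
		return nil, err
	}
	return map[string]instanceType{
		// "localhost" is an assumed name for the auto-configured type.
		"localhost": {Name: "localhost", VCPUs: runtime.NumCPU(), RAM: ram},
	}, nil
}

func main() {
	buf, err := os.ReadFile("/proc/meminfo") // Linux only
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	types, err := defaultInstanceTypes(nil, string(buf))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", types)
}
```

Passing the meminfo text in as a string keeps the sizing logic testable without depending on the machine running the test.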

When combined with #14922 this should make crunch-dispatch-local redundant.

This will also facilitate an arvados-dispatch-cloud integration test that uses the real crunch-run program instead of a stub. This might involve a few other changes, like a configurable location for lockfiles.

It's okay that this will be useless (other than single-container test cases) until #14922 is implemented, because it will also make #14922 easier to test.


Subtasks

Task #19060: Review 15370-loopback-dispatchcloud (In Progress, Ward Vandewege)

Task #19138: Review 15370-install-docker (Resolved, Ward Vandewege)


Related issues

Related to Arvados - Feature #14922: [crunch-dispatch-cloud] Run multiple containers concurrently on a single VM (New)

Related to Arvados - Story #13908: [Epic] Replace SLURM for cloud job scheduling/dispatching (Resolved)

Related to Arvados - Story #18973: Test combinations of federation scenarios (New)

History

#1 Updated by Tom Clegg almost 3 years ago

  • Related to Feature #14922: [crunch-dispatch-cloud] Run multiple containers concurrently on a single VM added

#2 Updated by Tom Clegg almost 3 years ago

  • Related to Story #13908: [Epic] Replace SLURM for cloud job scheduling/dispatching added

#3 Updated by Peter Amstutz about 1 year ago

  • Target version changed from To Be Groomed to 2021-03-31 sprint

#4 Updated by Peter Amstutz about 1 year ago

  • Target version changed from 2021-03-31 sprint to 2021-04-14 sprint

#5 Updated by Peter Amstutz about 1 year ago

  • Description updated (diff)

#6 Updated by Peter Amstutz about 1 year ago

  • Target version changed from 2021-04-14 sprint to 2021-05-26 sprint

#7 Updated by Peter Amstutz about 1 year ago

  • Target version changed from 2021-05-26 sprint to 2021-07-07 sprint

#8 Updated by Peter Amstutz 11 months ago

  • Target version changed from 2021-07-07 sprint to 2021-07-21 sprint

#9 Updated by Peter Amstutz 11 months ago

  • Target version changed from 2021-07-21 sprint to 2021-08-04 sprint

#10 Updated by Peter Amstutz 10 months ago

  • Target version changed from 2021-08-04 sprint to 2021-08-18 sprint

#11 Updated by Peter Amstutz 10 months ago

  • Target version changed from 2021-08-18 sprint to 2021-09-01 sprint

#12 Updated by Peter Amstutz 9 months ago

  • Target version deleted (2021-09-01 sprint)

#13 Updated by Peter Amstutz about 1 month ago

  • Target version set to 2022-04-27 Sprint

#14 Updated by Peter Amstutz about 1 month ago

  • Related to Story #18973: Test combinations of federation scenarios added

#16 Updated by Peter Amstutz about 1 month ago

  • Target version changed from 2022-04-27 Sprint to 2022-05-11 sprint

#17 Updated by Peter Amstutz 23 days ago

  • Assigned To set to Tom Clegg

#18 Updated by Tom Clegg 21 days ago

  • Status changed from New to In Progress

#19 Updated by Tom Clegg 16 days ago

  • Description updated (diff)

15370-loopback-dispatchcloud @ 34b13b1b9cc34661bf0c6774105ae03b412cbbdb -- developer-run-tests: #3085

(tests are failing because CI image doesn't have rsync)

#20 Updated by Tom Clegg 11 days ago

  • Description updated (diff)

#22 Updated by Tom Clegg 9 days ago

  • Target version changed from 2022-05-11 sprint to 2022-05-25 sprint

#24 Updated by Tom Clegg 7 days ago

Now tests are failing because the CI image doesn't have docker, so "arv-keepdocker" doesn't work.

Added docker install recipe to arvados-server install

15370-loopback-dispatchcloud @ f07c059fca954e4d001cbf1cb36c845be9d884dd

#28 Updated by Tom Clegg 3 days ago

#29 Updated by Ward Vandewege 3 days ago

Tom Clegg wrote:

15370-install-docker @ 663f3742a80b1b236d727d2d27068d03a37b4469

LGTM thanks!

#30 Updated by Ward Vandewege 31 minutes ago

Tom Clegg wrote:

15370-loopback-dispatchcloud @ 731c5e81f5aedc82d03786670610bde68bba27c7 -- developer-run-tests: #3146

I updated the jenkins satellite image to incorporate the changes from main, which means docker should now be present. Running these tests again:

developer-run-tests: #3153
