Idea #6096

closed

[OPS] Implement a process to regularly deploy a Docker image for running jobs to Arvados clusters

Added by Brett Smith almost 9 years ago. Updated almost 9 years ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: -
Target version:
Story points: 0.5

Description

We're now in a place where we install all components via debs.

There's a script in arvados-dev called

jenkins/run-deploy.sh

That script should be updated to make sure the default jobs image that matches the deployment is installed.

This is the line that needs to be added:

arv keep docker --project-uuid=<public docker image project> --name="Docker image for compute nodes $TAG" TAG

where TAG is the git_hash (that will be PINNED in hiera!)

So we need to create a public project for the arvados docker images on each installation.
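
As a rough sketch of what the added line might look like inside a deploy script, the helper below only assembles and prints the command so it can be reviewed before anything is uploaded. The function name print_keepdocker_cmd and the project UUID in the usage note are hypothetical. Passing arvados/jobs as the image argument is an assumption drawn from the comments on this ticket; the line above writes only TAG there.

```shell
# Sketch only: print the arv keep docker invocation for one cluster.
# print_keepdocker_cmd is a hypothetical helper, not part of run-deploy.sh.
# Assumes the image is arvados/jobs and that the tag is the pinned git hash.
print_keepdocker_cmd() {
    local project_uuid="$1"   # UUID of the public Docker image project
    local tag="$2"            # git hash used as the image tag
    printf 'arv keep docker --project-uuid=%s --name="Docker image for compute nodes %s" arvados/jobs %s\n' \
        "$project_uuid" "$tag" "$tag"
}
```

For example, print_keepdocker_cmd zzzzz-j7d0g-000000000000000 178d3f36265e0e9e9cc0bb6ac8c7c47a9c701687 prints the full command for a placeholder project; the UUID here is made up and would differ on every installation.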

Another to-do: creating that project needs to be part of the install instructions. I think we should add that here:

http://doc.arvados.org/install/create-standard-objects.html

Subtasks 10 (0 open, 10 closed)

Task #6237: create debian package for Workbench | Resolved | Ward Vandewege | 05/22/2015
Task #6586: figure out arv-keepdocker permissions issue on 4xphq | Resolved | Ward Vandewege | 05/22/2015
Task #6528: split off arvados/jobs docker image from docker image | Resolved | Ward Vandewege | 05/22/2015
Task #6569: clean up and test arvados/jobs image | Resolved | Ward Vandewege | 05/22/2015
Task #6119: Review 6096-deploy-jobs-image | Resolved | Ward Vandewege | 06/15/2015
Task #6236: create debian package for API server | Resolved | Ward Vandewege | 05/22/2015
Task #6241: review 6096-package-rails-apps | Resolved | Ward Vandewege | 05/22/2015
Task #6573: review 6569-smarter-jobs-image | Resolved | Tom Clegg | 05/22/2015
Task #6136: review 6135-docker-git-tag | Resolved | Nico César | 05/22/2015
Task #6135: 'docker' jenkins jobs has to tag with the git revision on push | Resolved | Nico César | 05/22/2015

Related issues

Blocks Arvados - Feature #6348: [Deployment] [Documentation] Minimize system-wide dependencies for compute node setup | New
Precedes (1 day) Arvados - Bug #5990: [SDKs] arv-run defaults to using arvados/jobs, without checking that it exists or is recent | Resolved | Peter Amstutz | 05/25/2015 | 05/25/2015

Actions #1

Updated by Nico César almost 9 years ago

  • Assigned To set to Nico César

Mental note: this is an arv keep docker thing.

Actions #2

Updated by Nico César almost 9 years ago

  • Status changed from New to In Progress

As far as I understand this is as simple as:

 arv keep docker arvados/jobs

My question is where this should be executed, and why. Could anyone help me?

Actions #3

Updated by Brett Smith almost 9 years ago

Nico Cesar wrote:

As far as I understand this is as simple as:

arv keep docker arvados/jobs

My question is where this should be executed, and why. Could anyone help me?

You've got the right basic idea. A few details to consider from here:

  • We may want to be able to specify a particular version of the arvados/jobs image to deploy. arv keep docker can accept an image hash as input… but whether or not that's doable will depend on where you're getting the arvados/jobs image from. If you're planning on pulling from the public Docker registry, you'll need to check that you can either fetch by image hash, or specify unique version tags during the build+push process that can be retrieved later.
  • We'll need to save the image this way to different clusters.
  • You'll need to figure out a way to make sure all Arvados users on the cluster can see the image. This is easy enough to do if you put it in a public project (arv keep docker accepts a --project-uuid option), but you'll need to know the UUID of that project on each cluster.

So you have a sort of balancing act here. The closer you run to Jenkins (where arvados/jobs gets run), the easier it is to be sure you're working with the right image, but the more work you'll have to do to orchestrate client uploads. If you orchestrate this saving to happen somewhere on each cluster, you have less configuration to worry about in each client, but you'll have to be more careful to ensure each cluster gets the right image.

In general, running on each cluster seems to fit within our existing infrastructure better… except I'm not really sure where you would run the process. But does this at least illuminate the trade-offs for you to chew on more?
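
The per-cluster bookkeeping described above (one pinned image version, several clusters, a different public project UUID on each) could be sketched roughly as follows. This is only an illustration of the "run close to Jenkins" option: project_uuid_for and plan_uploads are hypothetical names, the 4xphq UUID is a made-up placeholder, and the loop prints the commands instead of running them.

```shell
# Sketch only: look up the public image project for each cluster.
# The qr1hi UUID is the one quoted elsewhere in this ticket; the 4xphq
# entry is a made-up placeholder and must be replaced with a real UUID.
project_uuid_for() {
    case "$1" in
        qr1hi) echo "qr1hi-j7d0g-593lq8oed0gymt3" ;;
        4xphq) echo "4xphq-j7d0g-xxxxxxxxxxxxxxx" ;;  # placeholder
        *)     return 1 ;;                            # unknown cluster
    esac
}

# Print (not run) one upload command per cluster for a given image tag.
plan_uploads() {
    local tag="$1"; shift
    local cluster uuid
    for cluster in "$@"; do
        if ! uuid=$(project_uuid_for "$cluster"); then
            echo "no image project configured for $cluster" >&2
            return 1
        fi
        echo "$cluster: arv keep docker --project-uuid=$uuid arvados/jobs $tag"
    done
}
```

Keeping the lookup in one place makes the trade-off visible: the client needs a UUID for every cluster, but the same tag is guaranteed to go to all of them.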

Actions #4

Updated by Nico César almost 9 years ago

OK, I understand that we can create a new Jenkins job that triggers this update.

I see this in the output of the "docker" job:


********** upload arvados/jobs image **********

The push refers to a repository [arvados/jobs] (len: 1)
c1614e5dfdfe: Buffering to Disk
c1614e5dfdfe: Image successfully pushed
c1614e5dfdfe: Image already exists
12a071038bfb: Buffering to Disk
12a071038bfb: Image successfully pushed
b98aeeb7d234: Buffering to Disk
b98aeeb7d234: Image successfully pushed
5793765dfc54: Buffering to Disk
5793765dfc54: Image successfully pushed
abad26f56450: Buffering to Disk
abad26f56450: Image successfully pushed
c78304a261ed: Buffering to Disk
c78304a261ed: Image successfully pushed
3325980672f3: Image already exists
8ce15197d12a: Image already exists
59bc1380e0a6: Image already exists
79ace1046749: Image already exists
df9ac9bc06e6: Image already exists
a74ae6b4dab6: Image already exists
1b430bab60ed: Image already exists
e3551d68778e: Image already exists
Digest: sha256:473ac041771ecf2b0e22d0ef42f650764cc42d82d515746d9d3573da9fa9a7d1

I also see that run-deploy.sh has

ssh -p2222 root@$IDENTIFIER.arvadosapi.com -C "/...

I think I could do a combination of both inside run-docker-test.sh, with a flag (something like --update-clusters qr1hi,4xphq,9tee4) that uses the sha256 returned by the docker push.

Opinions on this?
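
A minimal sketch of the two pieces this proposal would need: pulling the digest out of saved docker push output, and splitting the --update-clusters value. Both function names are hypothetical, and the digest pattern assumes a final "Digest: sha256:..." line like the one in the log above; other Docker versions may format that line differently.

```shell
# Sketch only: read `docker push` output on stdin and print the digest.
# Assumes a "Digest: sha256:<64 hex chars>" line, as in the log above.
extract_push_digest() {
    sed -n 's/^Digest: \(sha256:[0-9a-f]\{64\}\)[[:space:]]*$/\1/p' | tail -n 1
}

# Split a comma-separated cluster list (the proposed --update-clusters
# value) into one cluster name per line.
split_clusters() {
    printf '%s\n' "$1" | tr ',' '\n'
}
```

With the push output saved to a file, extract_push_digest < push.log would print the sha256 value, and split_clusters qr1hi,4xphq,9tee4 would print the three cluster names on separate lines.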

Actions #5

Updated by Nico César almost 9 years ago

mental note:

from #3847

arv keep docker --project-uuid=qr1hi-j7d0g-593lq8oed0gymt3 --name="Docker image for compute nodes $TAG" TAG

where TAG is the git_hash (that will be PINNED in hiera!)

Brett, how do I obtain the project UUIDs to apply?

Actions #6

Updated by Nico César almost 9 years ago

See #6135. Docker images will have tags.

Actions #7

Updated by Nico César almost 9 years ago

docker images now HAVE tags.

see: https://registry.hub.docker.com/u/arvados/jobs/tags/manage/

you can already see:

178d3f36265e0e9e9cc0bb6ac8c7c47a9c701687
1ec1d552c77e18e2912e400ae395ca00f4e51c3c
7a53d874994a5a9af273cee1329d9635b7e03edb
9413eb733015601af699f2027d9a7a5bad3f3dea
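
Since the tags are full git hashes, a deploy step could check the tag before building the image reference. The helper below is a hypothetical sketch, not existing code; it accepts only a 40-character lowercase hex string and prints the matching arvados/jobs reference.

```shell
# Sketch only: build the image reference for a git-hash tag.
# jobs_image_ref is a hypothetical helper; it rejects anything that is
# not a full 40-character lowercase hex commit hash.
jobs_image_ref() {
    local tag="$1"
    case "$tag" in
        ""|*[!0-9a-f]*)
            echo "not a full git hash: $tag" >&2
            return 1
            ;;
    esac
    if [ "${#tag}" -ne 40 ]; then
        echo "not a full git hash: $tag" >&2
        return 1
    fi
    echo "arvados/jobs:$tag"
}
```

The result could then be handed to docker pull, for example docker pull "$(jobs_image_ref 178d3f36265e0e9e9cc0bb6ac8c7c47a9c701687)".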

Actions #8

Updated by Nico César almost 9 years ago

I reviewed 381b79bf5cfefe790bdcc24dd33296b8518e4c19 6096-package-rails-apps

looks good to me. Let's merge

Actions #9

Updated by Ward Vandewege almost 9 years ago

Cool, merged 6096-package-rails-apps into arvados-dev.

Actions #10

Updated by Ward Vandewege almost 9 years ago

  • Description updated (diff)
Actions #11

Updated by Ward Vandewege almost 9 years ago

  • Description updated (diff)
Actions #12

Updated by Radhika Chippada almost 9 years ago

  • Target version changed from 2015-06-10 sprint to 2015-07-08 sprint
Actions #13

Updated by Nico César almost 9 years ago

I'm checking 2b93735fc87e447301afafe6556d8571afef2bcf

I see that there are 3 commits on 6096-deploy-jobs-image:
2b93735fc87e447301afafe6556d8571afef2bcf
6f1a22656665643dbba71e59099171d69554b2ad
c06542419737cddd5adda84d4b14e0b88912d0f7

that are related to rpm packages from FPM... Is that needed for this ticket? Anyway, I went through the code:

  • It assumes "ssh shell.$IDENTIFIER" remote execution and runs /usr/local/rvm/bin/rvm-exec.... Which user on the other end will be executing this? What's your ssh config? The same goes for ssh $IDENTIFIER cat /usr/local/arvados/src/git-commit.version
    • Host *.qr1hi
      ProxyCommand ssh turnout@switchyard.qr1hi.arvadosapi.com $SSH_PROXY_FLAGS %h
Actions #14

Updated by Ward Vandewege almost 9 years ago

Nico Cesar wrote:

I'm checking 2b93735fc87e447301afafe6556d8571afef2bcf

I see that there are 3 commits on 6096-deploy-jobs-image:
2b93735fc87e447301afafe6556d8571afef2bcf
6f1a22656665643dbba71e59099171d69554b2ad
c06542419737cddd5adda84d4b14e0b88912d0f7

that are related to rpm packages from FPM ... is that needed for this ticket?

I should have made a separate branch for these, sorry. Only tangentially related.

Anyway, I went through the code:

  • It assumes "ssh shell.$IDENTIFIER" remote execution and runs /usr/local/rvm/bin/rvm-exec.... Which user on the other end will be executing this? What's your ssh config? The same goes for ssh $IDENTIFIER cat /usr/local/arvados/src/git-commit.version
    • [...]

All this code runs as you - the user who runs the deploy script.

Actions #15

Updated by Brett Smith almost 9 years ago

  • Target version changed from 2015-07-08 sprint to 2015-07-22 sprint
Actions #16

Updated by Ward Vandewege almost 9 years ago

  • Assigned To changed from Nico César to Ward Vandewege
  • Story points changed from 2.0 to 0.5
Actions #17

Updated by Ward Vandewege almost 9 years ago

  • Subject changed from [OPS] Implement a process to regularly deploy the a Docker image for running jobs to Arvados clusters to [OPS] Implement a process to regularly deploy a Docker image for running jobs to Arvados clusters
Actions #18

Updated by Ward Vandewege almost 9 years ago

  • Status changed from In Progress to Resolved