Idea #6096
closed
[OPS] Implement a process to regularly deploy a Docker image for running jobs to Arvados clusters
Description
We're now in a place where we install all components via debs.
There's a script in arvados-dev called jenkins/run-deploy.sh.
That script should be updated to make sure the default jobs image that matches the deployment is installed.
This is the line that needs to be added:
arv keep docker --project-uuid=<public docker image project> --name="Docker image for compute nodes $TAG" TAG
where TAG is the git_hash (that will be PINNED in hiera!)
So we need to create a public project for the arvados docker images on each installation.
Another todo: making that project needs to be a part of the install instructions. I think we should add that here:
http://doc.arvados.org/install/create-standard-objects.html
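A minimal sketch of how that step might be wrapped inside a deploy script. The function name `upload_jobs_image`, the `DRY_RUN` switch, and the sample UUID and tag are all hypothetical, and passing `arvados/jobs` as the image followed by the tag is an assumption about how the command above would be invoked in practice. By default the sketch only prints the command, so it can be checked without a cluster.

```shell
#!/bin/sh
# Sketch only: upload_jobs_image and DRY_RUN are hypothetical names,
# not part of the existing run-deploy.sh.

upload_jobs_image() {
    project_uuid=$1
    tag=$2
    if [ -z "$project_uuid" ] || [ -z "$tag" ]; then
        echo "usage: upload_jobs_image PROJECT_UUID TAG" >&2
        return 2
    fi
    name="Docker image for compute nodes $tag"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Print the command instead of running it.
        printf 'arv keep docker --project-uuid=%s --name="%s" arvados/jobs %s\n' \
            "$project_uuid" "$name" "$tag"
    else
        # Assumption: image name first, then the tag pinned for this deployment.
        arv keep docker --project-uuid="$project_uuid" --name="$name" arvados/jobs "$tag"
    fi
}

# Dry run with a placeholder project UUID and tag (both made up for illustration).
upload_jobs_image zzzzz-j7d0g-0123456789abcde abc123
```

Setting `DRY_RUN=0` would run the real upload; the project UUID still has to be supplied per cluster.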
Updated by Nico César over 9 years ago
- Assigned To set to Nico César
mental note: this is an arv-keep docker thing
Updated by Nico César over 9 years ago
- Status changed from New to In Progress
As far as I understand this is as simple as:
arv keep docker arvados/jobs
My question is where should this be executed, and why. Could anyone help me?
Updated by Brett Smith over 9 years ago
Nico Cesar wrote:
As far as I understand this is as simple as:
arv keep docker arvados/jobs
My question is where should this be executed, and why. Could anyone help me?
You've got the right basic idea. A few details to consider from here:
- We may want to be able to specify a particular version of the arvados/jobs image to deploy. arv keep docker can accept an image hash as input… but whether or not that's doable will depend on where you're getting the arvados/jobs image from. If you're planning on pulling from the public Docker registry, you'll need to check that you can either fetch by image hash, or specify unique version tags during the build+push process that can be retrieved later.
- We'll need to save the image this way to different clusters.
- You'll need to figure out a way to make sure all Arvados users on the cluster can see the image. This is easy enough to do if you put it in a public project (arv keep docker accepts a --project-uuid option), but you'll need to know the UUID of that project on each cluster.
So you have a sort of balancing act here. The closer you run to Jenkins (where arvados/jobs gets run), the easier it is to be sure you're working with the right image, but the more work you'll have to do to orchestrate client uploads. If you orchestrate this saving to happen somewhere on each cluster, you have less configuration to worry about in each client, but you'll have to be more careful to ensure each cluster gets the right image.
In general, running on each cluster seems to fit within our existing infrastructure better… except I'm not really sure where you would run the process. But does this at least illuminate the trade-offs for you to chew on more?
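One way to keep the per-cluster project UUIDs straight is a small lookup plus a sanity check, sketched below. The function names are hypothetical, and the only table entry is the qr1hi UUID quoted elsewhere in this thread; entries for other clusters would have to be filled in by whoever knows them. The check relies on Arvados UUIDs having the form cluster id, object type, 15-character suffix, with `j7d0g` as the type for groups (projects).

```shell
#!/bin/sh
# Sketch only: the lookup table is hypothetical apart from the qr1hi entry,
# which is the UUID quoted in this thread.

# Print the public image project UUID for a cluster, or fail if none is known.
project_uuid_for() {
    case $1 in
        qr1hi) echo "qr1hi-j7d0g-593lq8oed0gymt3" ;;
        *) echo "no image project configured for cluster '$1'" >&2; return 1 ;;
    esac
}

# Succeed only if the UUID looks like a group (project) UUID on that cluster:
# <cluster id>-j7d0g-<15 lowercase alphanumerics>.
uuid_matches_cluster() {
    printf '%s\n' "$2" | grep -Eq "^$1-j7d0g-[0-9a-z]{15}$"
}

uuid=$(project_uuid_for qr1hi)
uuid_matches_cluster qr1hi "$uuid" && echo "qr1hi -> $uuid"
```

A check like this would catch a UUID from one cluster being passed to `--project-uuid` on another before any upload starts.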
Updated by Nico César over 9 years ago
Ok .. I understand that we can do a new Jenkins job that triggers this update.
I see from "docker" Job:
********** upload arvados/jobs image **********
The push refers to a repository [arvados/jobs] (len: 1)
c1614e5dfdfe: Buffering to Disk
c1614e5dfdfe: Image successfully pushed
c1614e5dfdfe: Image already exists
12a071038bfb: Buffering to Disk
12a071038bfb: Image successfully pushed
b98aeeb7d234: Buffering to Disk
b98aeeb7d234: Image successfully pushed
5793765dfc54: Buffering to Disk
5793765dfc54: Image successfully pushed
abad26f56450: Buffering to Disk
abad26f56450: Image successfully pushed
c78304a261ed: Buffering to Disk
c78304a261ed: Image successfully pushed
3325980672f3: Image already exists
8ce15197d12a: Image already exists
59bc1380e0a6: Image already exists
79ace1046749: Image already exists
df9ac9bc06e6: Image already exists
a74ae6b4dab6: Image already exists
1b430bab60ed: Image already exists
e3551d68778e: Image already exists
Digest: sha256:473ac041771ecf2b0e22d0ef42f650764cc42d82d515746d9d3573da9fa9a7d1
I also see that run-deploy.sh has
ssh -p2222 root@$IDENTIFIER.arvadosapi.com -C "/...
I think I could do a combination of both inside run-docker-test.sh with a flag (something like --update-clusters qr1hi,4xphq,9tee4) and use the sha256 that is returned by the docker push.
Opinions on this?
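A rough sketch of the two pieces this would need: pulling the digest out of the push log, and walking a comma-separated cluster list. The helper names are hypothetical, the `Digest:` line format is taken from the log quoted above (other Docker versions may print it differently), and the sketch only prints a plan rather than running anything over ssh.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; nothing here contacts a cluster.

# Read `docker push` output on stdin and print the value of its "Digest:" line.
extract_digest() {
    sed -n 's/^.*Digest: \(sha256:[0-9a-f][0-9a-f]*\).*$/\1/p' | head -n 1
}

# Split a comma-separated cluster list into one identifier per line.
split_clusters() {
    printf '%s\n' "$1" | tr ',' '\n'
}

# Print one "<cluster> <digest>" line per cluster; the actual remote step
# (for example over ssh, as run-deploy.sh does) is left to the caller.
plan_updates() {
    split_clusters "$1" | while read -r id; do
        [ -n "$id" ] || continue
        printf '%s %s\n' "$id" "$2"
    done
}

push_log='Digest: sha256:473ac041771ecf2b0e22d0ef42f650764cc42d82d515746d9d3573da9fa9a7d1'
digest=$(printf '%s\n' "$push_log" | extract_digest)
plan_updates "qr1hi,4xphq,9tee4" "$digest"
```

If the push output has no `Digest:` line, `extract_digest` prints nothing, so a real script would want to stop there instead of updating clusters with an empty value.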
Updated by Nico César over 9 years ago
mental note:
from #3847
arv keep docker --project-uuid=qr1hi-j7d0g-593lq8oed0gymt3 --name="Docker image for compute nodes $TAG" TAG
where TAG is the git_hash (that will be PINNED in hiera!)
Brett, how do I obtain the project UUIDs to apply?
Updated by Nico César over 9 years ago
docker images now HAVE tags.
see: https://registry.hub.docker.com/u/arvados/jobs/tags/manage/
you can already see:
178d3f36265e0e9e9cc0bb6ac8c7c47a9c701687
1ec1d552c77e18e2912e400ae395ca00f4e51c3c
7a53d874994a5a9af273cee1329d9635b7e03edb
9413eb733015601af699f2027d9a7a5bad3f3dea
Updated by Nico César over 9 years ago
I reviewed 381b79bf5cfefe790bdcc24dd33296b8518e4c19 6096-package-rails-apps
looks good to me. Let's merge
Updated by Ward Vandewege over 9 years ago
Cool, merged 6096-package-rails-apps into arvados-dev.
Updated by Radhika Chippada over 9 years ago
- Target version changed from 2015-06-10 sprint to 2015-07-08 sprint
Updated by Nico César over 9 years ago
I'm checking 2b93735fc87e447301afafe6556d8571afef2bcf
I see that there are 3 commits on 6096-deploy-jobs-image:
2b93735fc87e447301afafe6556d8571afef2bcf
6f1a22656665643dbba71e59099171d69554b2ad
c06542419737cddd5adda84d4b14e0b88912d0f7
that are related to rpm packages from FPM ... is that needed for this ticket? Anyways, I went through the code,
- It assumes remote execution via "ssh shell.$IDENTIFIER" and runs /usr/local/rvm/bin/rvm-exec.... Which user on the other end will be executing this? What's your ssh config? The same goes for ssh $IDENTIFIER cat /usr/local/arvados/src/git-commit.version
Host *.qr1hi
  ProxyCommand ssh turnout@switchyard.qr1hi.arvadosapi.com $SSH_PROXY_FLAGS %h
Updated by Ward Vandewege over 9 years ago
Nico Cesar wrote:
I'm checking 2b93735fc87e447301afafe6556d8571afef2bcf
I see that there are 3 commits on 6096-deploy-jobs-image:
2b93735fc87e447301afafe6556d8571afef2bcf
6f1a22656665643dbba71e59099171d69554b2ad
c06542419737cddd5adda84d4b14e0b88912d0f7
that are related to rpm packages from FPM ... is that needed for this ticket?
I should have made a separate branch for these, sorry. Only tangentially related.
Anyways, I went through the code,
- It assumes remote execution via "ssh shell.$IDENTIFIER" and runs /usr/local/rvm/bin/rvm-exec.... Which user on the other end will be executing this? What's your ssh config? The same goes for ssh $IDENTIFIER cat /usr/local/arvados/src/git-commit.version
- [...]
All this code runs as you - the user who runs the deploy script.
Updated by Brett Smith over 9 years ago
- Target version changed from 2015-07-08 sprint to 2015-07-22 sprint
Updated by Ward Vandewege over 9 years ago
- Assigned To changed from Nico César to Ward Vandewege
- Story points changed from 2.0 to 0.5
Updated by Ward Vandewege over 9 years ago
- Subject changed from [OPS] Implement a process to regularly deploy the a Docker image for running jobs to Arvados clusters to [OPS] Implement a process to regularly deploy a Docker image for running jobs to Arvados clusters
Updated by Ward Vandewege over 9 years ago
- Status changed from In Progress to Resolved