Bug #16267

arvbox build uses arvados-server install

Added by Peter Amstutz over 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assigned To:
Category: Tests
Target version:
Start date: 09/24/2020
Due date:
% Done: 100%
Estimated time: (Total: 0.00 h)
Story points: -
Release relationship: Auto

Subtasks

Task #16321: Review 16267-change-arvbox-deps (Resolved, Ward Vandewege)


Related issues

Related to Arvados - Story #16264: Handle R SDK dependencies better (Resolved)

Related to Arvados - Bug #16958: [arvbox] api server startup is unreliable (Closed, 10/07/2020)

Blocked by Arvados - Story #16053: [boot] subcommand to install/update dev and runtime dependencies (Resolved, 03/31/2020)

Associated revisions

Revision 733143de
Added by Ward Vandewege about 1 year ago

Merge branch '16267-change-arvbox-deps' into master

closes #16267

Arvados-DCO-1.1-Signed-off-by: Ward Vandewege <>

Revision da8b6f7d
Added by Ward Vandewege about 1 year ago

Fix arvados-cwl-conformance-tests and arv-federation-migrate-test
jenkins jobs.

refs #16267

Arvados-DCO-1.1-Signed-off-by: Ward Vandewege <>

Revision de961f41
Added by Ward Vandewege about 1 year ago

Fix arv-federation-migrate-test jenkins job: update hardcoded arvbox
container paths.

refs #16267

Arvados-DCO-1.1-Signed-off-by: Ward Vandewege <>

History

#1 Updated by Peter Amstutz over 1 year ago

  • Status changed from New to In Progress

#2 Updated by Peter Amstutz over 1 year ago

  • Subject changed from "arvbox and jenkins images updated to use arvados-server" to "arvbox and jenkins images updated to use arvados-server install"

#3 Updated by Peter Amstutz over 1 year ago

  • Status changed from In Progress to New

#4 Updated by Peter Amstutz over 1 year ago

  • Blocked by Story #16053: [boot] subcommand to install/update dev and runtime dependencies added

#5 Updated by Peter Amstutz over 1 year ago

  • Related to Story #16264: Handle R SDK dependencies better added

#6 Updated by Peter Amstutz over 1 year ago

  • Category set to Tests

#7 Updated by Peter Amstutz over 1 year ago

  • Assigned To set to Peter Amstutz

#8 Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2020-04-08 Sprint to 2020-04-22

#9 Updated by Peter Amstutz over 1 year ago

  • Subject changed from "arvbox and jenkins images updated to use arvados-server install" to "arvbox build uses arvados-server install"

#10 Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2020-04-22 to 2020-05-06 Sprint

#11 Updated by Peter Amstutz over 1 year ago

  • Status changed from New to In Progress

#12 Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2020-05-06 Sprint to 2020-05-20 Sprint

#13 Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2020-05-20 Sprint to 2020-06-03 Sprint

#14 Updated by Peter Amstutz over 1 year ago

  • Target version changed from 2020-06-03 Sprint to 2020-06-17 Sprint

#15 Updated by Peter Amstutz over 1 year ago

  • Target version deleted (2020-06-17 Sprint)

#16 Updated by Ward Vandewege about 1 year ago

  • Target version set to 2020-09-23 Sprint
  • Assigned To changed from Peter Amstutz to Ward Vandewege

#17 Updated by Peter Amstutz about 1 year ago

  • Target version changed from 2020-09-23 Sprint to 2020-10-07 Sprint

#18 Updated by Ward Vandewege about 1 year ago

Ready for review in 154bfc562eafc642cc801f25b3c258e3846633ba on branch 16267-change-arvbox-deps

#19 Updated by Lucas Di Pentima about 1 year ago

I've tried the "dev" mode and it worked great. Some comments:

  • I think we should separate the arvados-server retrieval & build process depending on the type of arvbox image:
    • demo: Get arvados-server from the desired arvados version before installing dependencies.
    • dev: In this case I think we need to use the very latest dev version from the current branch directory where the script is being run, right? So it may not even be committed to git… do you think we could bind mount the current dir and use that? The arvados dir will be bind mounted anyway when running arvbox, so it won’t be necessary to keep a copy of the arvados repo on the image.
  • In the case of getting the repo at the correct version for “arvbox demo”, we could use a “shallow clone” from GitHub:
    • Standard clone from git.arvados.org: 1m30s
    • Shallow clone from GitHub (our own git server doesn’t seem to support shallow clones using http transport): 5s
      • Example: git clone --branch 2.0.4 --depth 1 https://github.com/arvados/arvados.git
      • I think in the demo case it doesn’t matter if we get the code from GitHub because they’re always based on released versions, right?

#20 Updated by Lucas Di Pentima about 1 year ago

I've checked the image sizes and it seems we're storing quite a bit more data in the new ones:

$ docker images
REPOSITORY            TAG                                        IMAGE ID            CREATED             SIZE
arvados/arvbox-dev    154bfc562eafc642cc801f25b3c258e3846633ba   1d8e8900f385        53 minutes ago      4.62GB
arvados/arvbox-dev    latest                                     1d8e8900f385        53 minutes ago      4.62GB
arvados/arvbox-base   154bfc562eafc642cc801f25b3c258e3846633ba   ffcdcf64abf8        53 minutes ago      4.62GB
arvados/arvbox-base   latest                                     ffcdcf64abf8        53 minutes ago      4.62GB
arvados/arvbox-dev    <none>                                     18e1fabe86f1        9 days ago          2.11GB
arvados/arvbox-dev    1771152da97200b038378666457d18679f4c8cd7   7c96e56a5ada        2 weeks ago         2.21GB
arvados/arvbox-base   1771152da97200b038378666457d18679f4c8cd7   8a442585bfc5        2 weeks ago         2.21GB

#21 Updated by Ward Vandewege about 1 year ago

Lucas Di Pentima wrote:

I've tried the "dev" mode and it worked great. Some comments:

  • I think we should separate the arvados-server retrieval & build process depending on the type of arvbox image:
    • demo: Get arvados-server from the desired arvados version before installing dependencies.
    • dev: In this case I think we need to use the very latest dev version from the current branch directory where the script is being run, right? So it may not even be committed to git… do you think we could bind mount the current dir and use that? The arvados dir will be bind mounted anyway when running arvbox, so it won’t be necessary to keep a copy of the arvados repo on the image.

Okay, I've done this. It introduces a lot more complexity in the Dockerfile.base - we now use multiple stages (that helps shrink the image), and 'buildkit' (so that we can bind mount the local arvados directory for building arvados-server, during the docker build stage).

We also use a workaround to have conditionals, and I'm on the fence about that one. It allows us to have one Dockerfile.base, but it means that building it for dev or demo will actually build both versions, always. WDYT?
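As a rough illustration of the approach described above (multiple stages, a BuildKit bind mount of the local tree, and a stage-name "conditional"), here is a minimal sketch. It is not the actual Dockerfile.base: the helper name write_sketch_dockerfile, the stage names, the debian:10 base image, the BUILDTYPE argument and the placeholder build commands are all assumptions made for the example. With BuildKit, stages that the final target does not depend on are generally skipped, whereas the legacy builder runs every stage in order.

```shell
#!/bin/sh
# Hypothetical sketch only; the real Dockerfile.base in the branch may differ.
# Writes an example multi-stage Dockerfile to the path given as $1.
write_sketch_dockerfile() {
    cat > "$1" <<'EOF'
# syntax=docker/dockerfile:1
ARG BUILDTYPE=dev

FROM debian:10 AS build-dev
# BuildKit bind mount: the build context is visible only during this RUN step,
# so the source tree never ends up in an image layer.
RUN --mount=type=bind,target=/usr/src/arvados \
    mkdir -p /out && ls /usr/src/arvados > /out/built-from-local-tree

FROM debian:10 AS build-demo
# Placeholder for "fetch a released tree and build it".
RUN mkdir -p /out && echo "placeholder" > /out/built-from-release

# The "conditional": alias whichever build stage BUILDTYPE names.
FROM build-${BUILDTYPE} AS build

# Final image: copy only the build output, not the build tooling.
FROM debian:10 AS base
COPY --from=build /out/ /usr/local/share/arvbox-sketch/
EOF
}
```

Assuming Docker with BuildKit is available, the file could then be written and built from the top of a source tree with something like `write_sketch_dockerfile Dockerfile.sketch` followed by `DOCKER_BUILDKIT=1 docker build --build-arg BUILDTYPE=dev -f Dockerfile.sketch .`.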

  • In the case of getting the repo at the correct version for “arvbox demo”, we could use a “shallow clone” from GitHub:
    • Standard clone from git.arvados.org: 1m30s
    • Shallow clone from GitHub (our own git server doesn’t seem to support shallow clones using http transport): 5s
      • Example: git clone --branch 2.0.4 --depth 1 https://github.com/arvados/arvados.git
      • I think in the demo case it doesn’t matter if we get the code from GitHub because they’re always based on released versions, right?

Hmm, yeah, and the situation is now worse because we now have to download the tree twice, once for building arvados-server in the base image, and once in the demo Dockerfile.

I haven't switched yet to cloning from github with depth 1, because unfortunately, that seems to only work for branches, not for commit hashes. Do you have thoughts on that? We build the demo image for each commit.
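One possible way around the branch-only limitation of `git clone --branch ... --depth 1` is to initialize an empty repository and fetch the commit directly at depth 1. The sketch below assumes the server accepts fetches by object ID (controlled on the server side by Git's uploadpack.allowReachableSHA1InWant or uploadpack.allowAnySHA1InWant settings); whether a given host allows this would need to be checked. The helper name shallow_checkout is made up for the example and is not part of arvbox.

```shell
#!/bin/sh
# Hypothetical helper: check out a single commit without downloading history.
# Usage: shallow_checkout <repo-url> <full-commit-hash> <destination-dir>
shallow_checkout() {
    repo_url=$1
    commit=$2
    dest=$3
    git init --quiet "$dest" &&
    git -C "$dest" remote add origin "$repo_url" &&
    # Fetch only the requested commit, truncating history to depth 1.
    git -C "$dest" fetch --quiet --depth 1 origin "$commit" &&
    # FETCH_HEAD now points at that commit; check it out as a detached HEAD.
    git -C "$dest" checkout --quiet FETCH_HEAD
}
```

It would be called with the repository URL, the full commit hash being built, and a destination directory. If the server refuses the request, the fetch step fails and the helper returns a non-zero status, so a caller could fall back to a full clone.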

Switching to multistage builds did make the image a lot smaller again:

arvados/arvbox-dev  154bfc562eafc642cc801f25b3c258e3846633ba   6c0e73182012        About a minute ago   3.18GB
arvados/arvbox-dev  latest                                     6c0e73182012        About a minute ago   3.18GB
arvados/arvbox-base 154bfc562eafc642cc801f25b3c258e3846633ba   2b2af74a7182        About a minute ago   3.18GB
arvados/arvbox-base latest                                     2b2af74a7182        About a minute ago   3.18GB
arvados/arvbox-demo 154bfc562eafc642cc801f25b3c258e3846633ba   ca08194305ba        13 minutes ago       8.4GB
arvados/arvbox-demo latest                                     ca08194305ba        13 minutes ago       8.4GB

Ready for another look at b27c53dedf632f614356305bc624befa5477b98e on branch 16267-change-arvbox-deps.

#22 Updated by Lucas Di Pentima about 1 year ago

While trying to start an instance from scratch in dev mode, I got this error, seemingly coming from tools/arvbox/lib/arvbox/docker/service/ready/run-service:

lucas@buster:~/arvados$ ARVBOX_CONTAINER=16267 tools/arvbox/bin/arvbox restart publicdev
Public arvbox will use address 10.1.1.7
Cloning into '/home/lucas/.arvbox/16267/arvados'...
Cloning into '/home/lucas/.arvbox/16267/composer'...
remote: Enumerating objects: 36647, done.
remote: Total 36647 (delta 0), reused 0 (delta 0), pack-reused 36647
Receiving objects: 100% (36647/36647), 11.76 MiB | 2.97 MiB/s, done.
Resolving deltas: 100% (25516/25516), done.
Branch 'arvados-fork' set up to track remote branch 'arvados-fork' from 'origin'.
Switched to a new branch 'arvados-fork'
Already up to date.
Cloning into '/home/lucas/.arvbox/16267/workbench2'...
bee88d04fccf9d6eab4a1f6a213a572619f6f652888daed07860779910485367
groupadd: group 'git' already exists

Arvados-in-a-box starting

chown: invalid user: 'arvbox'
chpst: fatal: unknown user/group: arvbox:arvbox:docker
groupadd: group 'arvbox' already exists
Note: if this is a fresh arvbox installation, it may take 10-15 minutes (or longer) to download and
install dependencies. Use "arvbox log" to monitor the progress of specific services.

ssh is ready at 10.1.1.7:22
./run-service: line 52: ARVADOS_CONTAINER_PATH: unbound variable
./run-service: line 52: ARVADOS_CONTAINER_PATH: unbound variable
./run-service: line 52: ARVADOS_CONTAINER_PATH: unbound variable
ls: cannot access '/usr/local/node-*': No such file or directory
debconf: unable to initialize frontend: Dialog
debconf: (TERM is not set, so the dialog frontend is not usable.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
./run-service: line 52: ARVADOS_CONTAINER_PATH: unbound variable
./run-service: line 52: ARVADOS_CONTAINER_PATH: unbound variable
./run-service: line 52: ARVADOS_CONTAINER_PATH: unbound variable
./run-service: line 52: ARVADOS_CONTAINER_PATH: unbound variable
...
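For context, "unbound variable" is the message bash prints when a script running with `set -u` (nounset) expands a variable that was never set. The sketch below shows one common way to guard against it, on the assumption that run-service uses that option. The function name container_path and the fallback path are placeholders for illustration, not values taken from arvbox.

```shell
#!/bin/bash
set -u  # treat expansion of unset variables as an error

# Hypothetical guard: use a fallback when ARVADOS_CONTAINER_PATH is unset or
# empty, instead of aborting with "unbound variable".
container_path() {
    printf '%s\n' "${ARVADOS_CONTAINER_PATH:-/placeholder/arvados/path}"
}

# Alternative: fail early with an explicit message rather than a default.
require_container_path() {
    : "${ARVADOS_CONTAINER_PATH:?ARVADOS_CONTAINER_PATH must be set}"
}
```

Which of the two is appropriate depends on whether a sensible default exists; when the variable is supposed to be provided by the image, failing early with a clear message is usually easier to debug.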

Regarding your comment about building both versions when asking for either demo or dev, that wasn't the case for me: it just built arvados/arvbox-base and arvados/arvbox-dev as before.
As for the use of --depth 1 on demo images, I thought we were releasing stable versions only, but even then, if we use commit hashes I think it won't be easily doable.

#23 Updated by Lucas Di Pentima about 1 year ago

After retrying the image build with the ~/.arvbox/arvbox/arvados/ dir pre-seeded with a copy of this branch, I was able to start a new instance without issue.
This is a change in behavior from the previous arvbox, but we don't seem to have any of this documented, so there's no documentation to fix :)

LGTM, thanks!

#24 Updated by Ward Vandewege about 1 year ago

Lucas Di Pentima wrote:

After retrying the image build with the ~/.arvbox/arvbox/arvados/ dir pre-seeded with a copy of this branch, I was able to start a new instance without issue.
This is a change in behavior from the previous arvbox, but we don't seem to have any of this documented, so there's no documentation to fix :)

LGTM, thanks!

Yeah, that was a bug. I pushed a fix in a463a62cdef50691f333c5c6f0d2860a542e138a; is that better?

#25 Updated by Lucas Di Pentima about 1 year ago

Yes, I've tested building the image and starting a new instance, all from scratch. It worked great! Thanks, LGTM!

#26 Updated by Ward Vandewege about 1 year ago

  • % Done changed from 0 to 100
  • Status changed from In Progress to Resolved

#27 Updated by Ward Vandewege about 1 year ago

  • Related to Bug #16958: [arvbox] api server startup is unreliable added

#28 Updated by Peter Amstutz about 1 year ago

  • Release set to 25
