Feature #8311

[Crunch2] git tree mount points

Added by Tom Clegg over 3 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
-
Target version:
Start date:
12/14/2017
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
2.0

Description

Crunch-run capability to mount the contents of a git repository checkout for a given commit, as documented http://doc.arvados.org/api/methods/containers.html

Note: Spec for git mounts (see Containers API) includes some features we don't need in an initial implementation. For now,
  • Implement "kind", "uuid", and "commit" fields.
  • Fail if "revisions", "git-url", or "repository_name" is provided.
  • Fail if a "path" other than "/" is provided.
  • Fail if "commit" is not a full 40-char sha1.
  • Update documentation to reflect what is actually implemented.

Generally, "commit" in a container request can be a branch, tag, or sha1. "commit" in a container is always a sha1 (it is resolved in when creating the container, like mounts and container image). In this initial version, commit must be given as a sha1 even in the container request.

To minimize dependencies, use pure Go git library instead of using "git" subprocess.


Subtasks

Task #12812: Review 8311-mount-gitResolvedPeter Amstutz


Related issues

Blocks Arvados - Story #12060: [CWL] Run from composerResolved

Associated revisions

Revision 6cae747c
Added by Tom Clegg over 1 year ago

Merge branch '8311-mount-git'

closes #8311

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Tom Morris about 2 years ago

  • Target version set to Arvados Future Sprints
  • Release deleted (11)

#2 Updated by Peter Amstutz about 2 years ago

  • Description updated (diff)

#3 Updated by Peter Amstutz about 2 years ago

  • Description updated (diff)

#4 Updated by Tom Morris almost 2 years ago

  • Story points set to 2.0

#5 Updated by Tom Clegg almost 2 years ago

  • Description updated (diff)

#6 Updated by Peter Amstutz over 1 year ago

  • Target version changed from Arvados Future Sprints to 2017-12-20 Sprint

#7 Updated by Tom Clegg over 1 year ago

  • Assigned To set to Tom Clegg

#8 Updated by Tom Clegg over 1 year ago

  • Description updated (diff)

#9 Updated by Tom Clegg over 1 year ago

  • Status changed from New to In Progress

#10 Updated by Tom Clegg over 1 year ago

Slight complication: you can't necessarily fetch a commit from a remote by commit sha1. The server gets to decide whether to support that, or whether you need to know a tag or branch of which the commit is an ancestor.

We already ran into this on #12107#note-11 and sure enough pulling a commit hash from arv-git-httpd doesn't work, at least in the integration test environment.

Possibilities
  • make arv-git-httpd support fetching by sha1 (perhaps it's easy with a git/git-http-backend/gitolite config)
  • pull/tag the desired commit into the internal.git repo when creating a container (similar to crunch1) and have crunch-job pull the tagged commit from internal.git

#11 Updated by Tom Clegg over 1 year ago

Tom Clegg wrote:

  • make arv-git-httpd support fetching by sha1 (perhaps it's easy with a git/git-http-backend/gitolite config)

This (git 2.5+) seems promising

[uploadpack]
        allowReachableSHA1InWant = true

But we should also handle the more general case where that's not enabled, so enabling that automatically on arvados-hosted repos seems desirable but not a blocker.

#12 Updated by Tom Clegg over 1 year ago

8311-mount-git @ f96fbd63314748dbe389ccb1b5b37f3fae4688ac
  • start arv-git-httpd and nginx proxy earlier in run-tests (used to only bring these up for workbench tests)
  • add git https address to discovery doc
  • todo: advise Upgrading to master
  • arv-git-httpd can access repos by UUID, not just name (otherwise crunch-run would need to look up the name, which seems like a waste)
  • todo: clean up & test this in arv-git-httpd
  • crunch-run can check out a git tree
  • tests for the fetch-and-checkout code, using arv-git-httpd backend (added some fake data for "repository2" and removed some extra cruft from the tarball like sample git hooks)
  • todo: add a stanza in TestSetupMounts, too
  • todo: validate commit actually looks like a sha1
  • defer/avoid?: currently apiserver does not try to stay in sync with crunch-run wrt validating mounts -- it just resolves UUID to PDH for collection mounts and ignores everything else. We could duplicate the crunch-run validations (path=/, commit hash only, etc.) in apiserver so they get rejected when committing a container request instead of waiting until a node comes up. But if we're eager to get this hooked up for a use case where our own code is writing the mounts, maybe we shouldn't block a merge behind this?
  • todo: update docs

#13 Updated by Peter Amstutz over 1 year ago

FAIL: git_mount_test.go:126: GitMountSuite.TestInvalid

git_mount_test.go:160:
    c.Check(err, check.ErrorMatches, trial.matcher)
... value = nil
... regex string = ".*sha1.*" 
... Error value is nil

git_mount_test.go:160:
    c.Check(err, check.ErrorMatches, trial.matcher)
... value = nil
... regex string = ".*UUID.*" 
... Error value is nil

#14 Updated by Tom Clegg over 1 year ago

8311-mount-git @ cb8f4bb6e6d11eff6439822b9d3a3e2763eb61be
  • rebased on master
  • docs updated
  • cleanup + tests for the arv-git-httpd change
  • test git_tree mounts in TestSetupMounts
  • validate mount doesn't try anything unsupported
  • fix an old bug and remove an old debug printf in arv-git-httpd
  • todo?: Both apiserver and crunch-run need to be updated in order for this to work. Does that even deserve to be mentioned on Upgrading to master?
  • todo: fix api tests (broken by changes to test git tarball and run-tests)

#16 Updated by Peter Amstutz over 1 year ago

crunch-run error:

While setting up mounts: git fetch "http://192.168.5.2:9001/2tlax-s0uqq-u6kz3a8x06tczjb.git": unexpected client error: unexpected requesting "http://192.168.5.2:9001/2tlax-s0uqq-u6kz3a8x06tczjb.git/info/refs?service=git-upload-pack" status code: 500

arv-git-httpd error:

2017-12-14_19:49:34.67766 2017/12/14 19:49:34 "192.168.5.2:60622" "&lt;invalid&gt;" 500 "arvados API server error: Not logged in (401: 401 Unauthorized) returned by 192.168.5.2:8000" "2tlax-s0uqq-u6kz3a8x06tczjb" "GET" "/2tlax-s0uqq-u6kz3a8x06tczjb.git/info/refs"

#17 Updated by Peter Amstutz over 1 year ago

Hmm, this looks like it could be a problem, eh?

    err = repo.Fetch(&git.FetchOptions{
        RemoteName: "origin",
        Auth:       git_http.NewBasicAuth("none", arvadostest.ActiveToken),
    })

#18 Updated by Peter Amstutz over 1 year ago

2017-12-14T20:27:15.508185948Z 2017-12-14 20:27:15 cwltool INFO: /usr/bin/arvados-cwl-runner 0bb435a47e427b12fa2351141a22a1ba1e28a49d 1.0.20171010180436, arvados-python-client 0.1.20171010180436, cwltool 1.0.20170928192020
2017-12-14T20:27:15.512156671Z 2017-12-14 20:27:15 cwltool INFO: Resolved '/var/lib/cwl/run/revsort.cwl' to 'file:///var/lib/cwl/run/revsort.cwl'
2017-12-14T20:27:15.512520925Z 2017-12-14 20:27:15 cwltool ERROR: Tool definition failed initialization:
2017-12-14T20:27:15.512520925Z [Errno 13] Permission denied: '/var/lib/cwl/run/revsort.cwl'

#19 Updated by Tom Clegg over 1 year ago

I'm guessing this means the mountpoint isn't readable in the container because the mktemp dir isn't group/world-readable and container UID != crunch-run UID. Working on test & fix for that.

#21 Updated by Peter Amstutz over 1 year ago

This works! Composer 'run' button (12060-run-workflow) now submits container request using git repo mount, and it runs successfully in arvbox.

#22 Updated by Tom Clegg over 1 year ago

8311-mount-git @ 9a7f1bba655da249a0445eefa50e92a8b305b753
  • fixes python tests

#24 Updated by Peter Amstutz over 1 year ago

Except all the tests need to pass, of course.

#25 Updated by Anonymous over 1 year ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF