Bug #6772


[API] Should not be necessary to host git repos on the same host as API server

Added by Tom Clegg almost 9 years ago. Updated 5 months ago.

Currently, gitolite and arv-git-httpd must be installed on the same node as the API server.

This contributes to the undesirable rule that a site can have only one host running an API server.


When arvados-git-httpd is in use, hosted repositories should be treated the same way as remote repositories in source:services/api/app/models/commit.rb: i.e., when validating a job submission, fetch the repository from arvados-git-httpd and put it in the local cache.

Extend Commit.git_dir_for to return remote=true for locally hosted repos when config.git_repo_https_base is not false.
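As a rough illustration of that rule (written in Python rather than the Rails code in commit.rb, and not the actual implementation), the decision could look like the sketch below. The names `is_remote_url` and `should_fetch_via_https` and the URL pattern are hypothetical; only `git_repo_https_base` and the idea behind `remote_url?` come from this ticket.

```python
import re

# Hypothetical stand-in for Commit.remote_url?: treat http(s):// and git://
# names as remote URLs. The exact pattern used in commit.rb is an assumption.
_REMOTE_URL = re.compile(r"^(https?|git)://")


def is_remote_url(repo_name):
    """Return True if repo_name looks like a URL rather than a hosted repo name."""
    return bool(_REMOTE_URL.match(repo_name))


def should_fetch_via_https(repo_name, git_repo_https_base):
    """Return True if the repo should be fetched into the local cache.

    repo_name is either a URL or the name of a locally hosted repo.
    git_repo_https_base mirrors the API server config value; False keeps the
    previous behavior (read hosted repos straight from local disk).
    """
    if is_remote_url(repo_name):
        return True  # remote repos are always fetched into the cache
    # Locally hosted repo: go through the git HTTP service unless it is disabled.
    return git_repo_https_base is not False
```

Comparing against `False` explicitly (rather than testing truthiness) follows the "is not false" wording above, so any other configured value keeps the HTTPS path enabled.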

Extend Commit.fetch_remote_repository:
  • call remote_url? to decide whether this is a locally hosted repo
  • if it is locally hosted, look up the repo and call https_clone_url to get the remote URL
  • update Commit.must_git to accept a "use token as credential?" argument and set up the git credentials accordingly, using an env var and a credential helper as in source:services/arv-git-httpd/integration_test.go (although it would presumably be more race-safe to specify the helper with a command line argument than to run git config, which edits a config file)
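A minimal sketch of the "use token as credential?" idea, in Python for illustration only (the real change would be in the Ruby Commit.must_git). The function name `build_git_fetch`, its parameters, and the refspec are assumptions; the helper snippet reads the token from the `ARVADOS_API_TOKEN` env var, and `base_url` is expected without a trailing slash. Passing the settings with `git -c` means no shared config file is edited, which is the race-safety point made above.

```python
import os

# Shell credential helper: discard git's request on stdin and, for the "get"
# operation only, answer with the token taken from the environment.
_HELPER = ('!cred(){ cat >/dev/null; if [ "$1" = get ]; then '
           'echo password=$ARVADOS_API_TOKEN; fi; };cred')


def build_git_fetch(git_dir, clone_url, base_url, token,
                    use_token=True, environ=None):
    """Return (argv, env) for fetching clone_url into the bare repo git_dir.

    Hypothetical helper: nothing is executed here, so the caller decides
    how to run the command (for example with subprocess.run).
    """
    env = dict(os.environ if environ is None else environ)
    argv = ["git"]
    if use_token:
        # The token travels in the environment, not on the command line.
        env["ARVADOS_API_TOKEN"] = token
        argv += ["-c", "credential.%s/.username=none" % base_url,
                 "-c", "credential.%s/.helper=%s" % (base_url, _HELPER)]
    argv += ["--git-dir", git_dir, "fetch", clone_url,
             "+refs/heads/*:refs/heads/*"]
    return argv, env
```

With `use_token=False` the command and environment are left without any credential settings, so the same code path could serve ordinary remote repositories.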

If the API server config has git_repo_https_base: false, the previous behavior should continue to work.

Optional optimization

With the naïve approach, when arv-git-httpd authenticates, there will be three HTTP connections open: client->API->arv-git-httpd->API. This could be avoided or minimized by having arv-git-httpd bypass per-repository authentication entirely when given a special pre-shared secret token (similar to keepstore's data manager token), or by having it cache credentials and sometimes skip API lookups when fetching a repo by UUID.
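To make the credential-caching variant concrete, here is a small sketch in Python (arv-git-httpd itself is written in Go, so this only illustrates the idea). The class name `AuthCache`, the 60-second default TTL, and the `lookup` callback signature are all assumptions, not existing code.

```python
import time


class AuthCache:
    """Remember recent permission lookups so some API calls can be skipped."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # (token, repo_uuid) -> (allowed, expiry time)

    def check(self, token, repo_uuid, lookup):
        """Return whether token may read repo_uuid.

        lookup(token, repo_uuid) is the expensive call to the API server;
        it is only invoked when there is no unexpired cached answer.
        """
        key = (token, repo_uuid)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        allowed = lookup(token, repo_uuid)
        self._entries[key] = (allowed, now + self._ttl)
        return allowed
```

The trade-off is that a revoked token or permission keeps working until its cache entry expires, so the TTL would have to stay short.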

Actions #1

Updated by Tom Clegg almost 9 years ago

  • Description updated
Actions #2

Updated by Brett Smith over 8 years ago

This got a big bump in priority because fixing it is expected to let us deploy multiple Rails servers, improving the reliability of the entire cluster.

Actions #3

Updated by Tom Clegg over 7 years ago

  • Description updated
Actions #4

Updated by Tom Morris over 7 years ago

  • Assigned To set to Tom Morris
Actions #5

Updated by Tom Morris over 7 years ago

  • Assigned To deleted (Tom Morris)
Actions #6

Updated by Peter Amstutz over 7 years ago

From arv-copy, setting up token credentials entirely on the command line:

            git_config = ["-c", "credential.%s/.username=none" % baseurl,
                          "-c", "credential.%s/.helper=!cred(){ cat >/dev/null; if [ \"$1\" = get ]; then echo password=$ARVADOS_API_TOKEN; fi; };cred" % baseurl]
Actions #7

Updated by Ward Vandewege about 3 years ago

  • Target version deleted (Arvados Future Sprints)
Actions #8

Updated by Peter Amstutz over 1 year ago

  • Release set to 60
Actions #9

Updated by Peter Amstutz 5 months ago

  • Target version set to Future
