Bug #6772

Updated by Tom Clegg over 7 years ago

h3. Background 

 Currently, gitolite and arv-git-httpd must be installed on the same node as the API server. 

 This contributes to the undesirable rule that a site can have only one host running an API server. 

h3. Implementation

When arvados-git-httpd is in use, hosted repositories should be treated the same way as remote repositories in source:services/api/app/models/commit.rb: i.e., when validating a job submission, fetch the repository from arvados-git-httpd and put it in the local cache.

 Extend @Commit.git_dir_for@ to return remote=true for locally hosted repos when @config.git_repo_https_base@ is not @false@. 
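The decision described above can be sketched in isolation, outside Rails. This is only an illustration: @treat_as_remote?@ and its @hosted:@ argument are hypothetical names, and only the @git_repo_https_base@ setting comes from this ticket. The check is deliberately "not @false@" rather than "truthy", following the wording above.

```ruby
# Hypothetical helper (not part of commit.rb): decide whether a repository
# should go through the remote-fetch path.
def treat_as_remote?(hosted:, git_repo_https_base:)
  # Repositories that are not hosted locally always use the remote path.
  return true unless hosted
  # Hosted repositories use the remote path unless the setting is exactly false,
  # in which case the previous local behavior applies.
  git_repo_https_base != false
end

# Example with an illustrative base URL.
treat_as_remote?(hosted: true, git_repo_https_base: "https://git.example/")
```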

Extend @Commit.fetch_remote_repository@:
* call @remote_url?@ to decide whether this is a local repo
* if so, look up the repo and call @https_clone_url@ to get the remote URL
* update @Commit.must_git@ to accept a "use token as credential?" argument and set up the git credentials accordingly, using an env var and a credential helper as in source:services/arv-git-httpd/integration_test.go (though it would presumably be more race-safe to specify the helper with a command line argument instead of running @git config@ to edit a config file)
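A rough sketch of the credential setup in the last bullet, assuming the helper is passed with git's @-c name=value@ option so that no config file is edited. The method name @git_invocation@, the env var name, and the exact helper snippet are assumptions made for illustration, not the existing @Commit.must_git@ code.

```ruby
# Env var that carries the token to the helper (name is an assumption).
TOKEN_ENV = "ARVADOS_API_TOKEN"

# Shell snippet used as a git credential helper. It discards git's request on
# stdin and, for the "get" operation, prints a placeholder username and the
# token taken from the environment.
CREDENTIAL_HELPER =
  %q{!cred(){ cat >/dev/null; if [ "$1" = get ]; then echo username=none; echo "password=$ARVADOS_API_TOKEN"; fi; };cred}

# Hypothetical helper: build the environment and argv for one git command.
def git_invocation(git_args, token: nil)
  env = {}
  argv = ["git"]
  if token
    env[TOKEN_ENV] = token
    # `-c` applies the setting to this single invocation only, so concurrent
    # requests do not race on a shared config file.
    argv += ["-c", "credential.helper=#{CREDENTIAL_HELPER}"]
  end
  [env, argv + git_args]
end

env, argv = git_invocation(["fetch", "--no-progress", "origin"], token: "example-token")
# system(env, *argv) would then run git; the token appears only in env.
```

Keeping the token in the environment rather than in the argument list means it does not show up in the command line itself.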

If the API server config has @git_repo_https_base: false@, the previous behavior should continue to work.

h3. Optional optimization

*TBD:* With the naïve approach, when arv-git-httpd authenticates, there will be three HTTP connections open: client->API->arv-git-httpd->API. This could be avoided or minimized by having arv-git-httpd bypass per-repository authentication entirely when given a special pre-shared secret token (similar to keepstore's data manager token), or by having it cache credentials and _sometimes_ skip API lookups when fetching a repo by UUID.
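The credential-caching idea could look roughly like the sketch below: a small time-limited cache keyed by token and repository UUID. It is written in Ruby only for consistency with the rest of this ticket; @AuthCache@, its TTL, and the block standing in for the API lookup are all hypothetical.

```ruby
# Hypothetical TTL cache for permission-check results.
class AuthCache
  def initialize(ttl_seconds, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Return the cached result while it is fresh; otherwise run the block
  # (standing in for the API lookup) and remember what it returned.
  def fetch(token, repo_uuid)
    key = [token, repo_uuid]
    now = @clock.call
    entry = @entries[key]
    return entry[:allowed] if entry && now - entry[:at] < @ttl
    allowed = yield
    @entries[key] = { allowed: allowed, at: now }
    allowed
  end
end

cache = AuthCache.new(60)
cache.fetch("example-token", "example-repo-uuid") { true } # block runs: first lookup
cache.fetch("example-token", "example-repo-uuid") { true } # served from cache
```

The trade-off is that a revoked permission keeps working until its entry expires, so the TTL would need to be short.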