Idea #3126 (closed)
[API] Support use of anonymous git url (like github https) as repository in jobs.create
Description
Currently, crunch-job can do this when running local jobs. That code should be:
- Ported to ruby
- Adjusted to suit API server's use (e.g., the commits should be pulled into the "arvados internal" repository)
- Invoked whenever job's script_version or repository is set/changed
The default "arvados" repository used in tutorials/examples should be changed to "https://github.com/curoverse/arvados.git" so they don't depend on your site's local mirror of the arvados repository being up-to-date.
Likewise, the arvados_sdk_version feature should use the github url instead of the locally hosted arvados.git.
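As a rough illustration of the request this idea asks for, the sketch below builds a jobs.create body whose repository is the anonymous github URL rather than a locally hosted name. The helper name job_request and the example script name "hash" are made up for illustration; only the script_version and repository fields and the URL come from this ticket.

```python
import json

# URL proposed above as the default for tutorials/examples.
GITHUB_ARVADOS = "https://github.com/curoverse/arvados.git"


def job_request(script, script_version, repository=GITHUB_ARVADOS,
                script_parameters=None):
    """Build the body of a hypothetical jobs.create request.

    The helper itself is an assumption; it only shows that `repository`
    would carry a full git URL instead of a local repository name.
    """
    return {
        "script": script,
        "script_version": script_version,
        "repository": repository,
        "script_parameters": script_parameters or {},
    }


if __name__ == "__main__":
    # "hash" and "master" are placeholder values for the example.
    print(json.dumps(job_request("hash", "master"), indent=2))
```

With a body like this, the job no longer depends on the site's local mirror of the arvados repository being up to date.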
Updated by Tom Clegg over 10 years ago
- Target version set to Arvados Future Sprints
Updated by Tom Clegg about 10 years ago
- Subject changed from Support use of anonymous git url (like github https) as job repository to [API] Support use of anonymous git url (like github https) as repository in jobs.create
Updated by Tom Clegg about 10 years ago
- Description updated (diff)
- Category set to API
Updated by Sarah Guthrie almost 10 years ago
Bump. This would be really useful for me!
Updated by Ward Vandewege almost 10 years ago
- Target version changed from Arvados Future Sprints to 2015-04-01 sprint
Updated by Tom Clegg almost 10 years ago
- Target version changed from 2015-04-01 sprint to 2015-04-29 sprint
Updated by Tom Clegg over 9 years ago
Notes about 3126-remote-git-url:
- The commit+tree needed for a job is now copied into the internal repository during the job create/update transaction, regardless of whether it is hosted locally or remotely.
- crunch-dispatch already checks the "internal" repository before copying objects from (or even looking at) a locally hosted repository, and passes the internal repository path to crunch-job with --git-dir so crunch-job uses it to do git archive, so we didn't need any changes there. The above point means the git tag and git fetch-pack code in crunch-dispatch can be removed (except perhaps a sanity check on the git tag) but I figure this might as well wait a bit, to avoid deployment races. We should stop seeing any "fetch-pack" messages in crunch-dispatch logs, which will assure us those bits aren't used any more.
- git fetch-pack can't fetch a revision by sha1. I used git pack-objects | git unpack-objects rather than copying the entire remote repository into the internal repo with fetch-pack --all like crunch-dispatch did.
- Using a commit from a remote repository requires copying git objects twice: clone/fetch the remote (with all tags and branches) into a cache dir dedicated to that remote in order to resolve a refspec to a sha1, then copy the necessary objects from the cache dir to the internal git dir. Fetching remote repositories directly into internal.git would have been nice and simple if it weren't for two fatal flaws:
  - In order to look up arbitrary revisions, we have to copy remote refs into a local git-dir: git ls-remote can look up plain branches like master, but not master~3 or 78be33b. Troublesome races would occur if two concurrent transactions fetched two different remote repositories' branches into the same local git-dir and looked up "master".
  - Even without races, an abbreviated sha1 version in a job submitted by user B could easily resolve to a commit in user A's private repository which was imported for a previous job. Essentially, it's never safe to look up anything in the shared "internal" repository except a full SHA-1 for which the current user's permission has already been established.
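The two-stage copy described in these notes can be sketched roughly as follows. This is not the code from the branch (which lives in the Ruby API server); it is a Python illustration under stated assumptions: the function names are made up, the exact git flags (fetch refspec, rev-parse --verify, pack-objects --revs --stdout, unpack-objects -q) are one plausible way to do it, and packing everything reachable from the commit is a simplification. The one rule taken directly from the notes is that only a full SHA-1 is ever looked up in the shared internal repository.

```python
import re
import subprocess


def is_full_sha1(rev):
    """True only for a complete 40-hex-digit SHA-1 (not "master", "78be33b")."""
    return re.fullmatch(r"[0-9a-f]{40}", rev) is not None


def resolve_commands(cache_dir, remote_url, rev):
    """Stage 1 (assumed flags): fetch refs into a per-remote cache dir,
    then resolve an arbitrary revision such as master~3 to a commit there."""
    fetch = ["git", "--git-dir=" + cache_dir, "fetch", "--tags",
             remote_url, "+refs/heads/*:refs/heads/*"]
    rev_parse = ["git", "--git-dir=" + cache_dir, "rev-parse",
                 "--verify", rev + "^{commit}"]
    return fetch, rev_parse


def copy_commands(cache_dir, internal_dir):
    """Stage 2 (assumed flags): pack objects in the cache dir and unpack
    them into the internal repository."""
    pack = ["git", "--git-dir=" + cache_dir, "pack-objects",
            "--revs", "--stdout"]
    unpack = ["git", "--git-dir=" + internal_dir, "unpack-objects", "-q"]
    return pack, unpack


def copy_commit(cache_dir, internal_dir, sha1):
    """Run `git pack-objects | git unpack-objects` for one full SHA-1."""
    if not is_full_sha1(sha1):
        # Abbreviated sha1s and branch names must never reach the shared repo.
        raise ValueError("refusing to use %r: full SHA-1 required" % (sha1,))
    pack_cmd, unpack_cmd = copy_commands(cache_dir, internal_dir)
    pack = subprocess.Popen(pack_cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    unpack = subprocess.Popen(unpack_cmd, stdin=pack.stdout)
    pack.stdout.close()  # unpack now holds the only read end of the pipe
    pack.stdin.write((sha1 + "\n").encode())
    pack.stdin.close()
    unpack_status = unpack.wait()
    pack_status = pack.wait()
    if pack_status != 0 or unpack_status != 0:
        raise RuntimeError("copying objects into internal repository failed")
```

Keeping one cache dir per remote is what avoids the race on "master" described above: refs from different remotes never share a git-dir, and the internal repository only ever receives objects named by a full SHA-1.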
Updated by Peter Amstutz over 9 years ago
dispatch: 4n8aq-8i9sb-798lzzemsyw4myr: Repository https://github.com/curoverse/arvados.git not found under /home/peter/work/arvados_prod_repos
/home/peter/work/arvados/services/api/app/models/arvados_model.rb:328:in `ensure_permission_to_save': ArvadosModel::PermissionDeniedError (ArvadosModel::PermissionDeniedError)
Is there a newer revision, or should I look into this error to find out what's wrong?
Updated by Tom Clegg over 9 years ago
Ah, sorry -- "we didn't need any changes there" was not true! Fixed, rebased on current master, now at 85378b8.
Updated by Peter Amstutz over 9 years ago
I'm getting API test failures:
2) Failure: CommitTest#test_find_commit_range_uses_fetch_remote_repository_to_get_git://github.com/curoverse/arvados.git [/home/peter/work/arvados/services/api/test/unit/commit_test.rb:35]: Expected [] to not be empty.
3) Failure: CommitTest#test_find_commit_range_uses_fetch_remote_repository_to_get_http://github.com/curoverse/arvados.git [/home/peter/work/arvados/services/api/test/unit/commit_test.rb:35]: Expected [] to not be empty.
419 tests, 1676 assertions, 2 failures, 0 errors, 2 skips
Updated by Tom Clegg over 9 years ago
I installed a new jessie system, had test failures, and fixed (now at de3e44a). But my https:// test failed too (not just http:// and git://) and the fix was a leaked-state-from-previous-test issue, so I'm not certain I've seen the same failure mode as note-11.
Worth a try, though.
(Also: if you get all three failures (git,http,https) with 85378b8 after removing services/api/tmp/git, that would be fairly reassuring that we were seeing the same issue.)
Updated by Tom Clegg over 9 years ago
- Status changed from New to Resolved
- % Done changed from 75 to 100
Applied in changeset arvados|commit:86e078ae126f6651428219c726c34da3bd7f7495.