Idea #3126
closed
[API] Support use of anonymous git url (like github https) as repository in jobs.create
Added by Tom Clegg over 10 years ago.
Updated over 9 years ago.
Description
Currently, crunch-job
can do this when running local jobs.
That "fetch git objects from remote repository" logic needs to be
- Ported to ruby
- Adjusted to suit API server's use (e.g., the commits should be pulled into the "arvados internal" repository)
- Invoked whenever job's
script_version
or repository
is set/changed
The default "arvados" repository used in tutorials/examples should be changed to "https://github.com/curoverse/arvados.git" so they don't depend on your site's local mirror of the arvados repository being up-to-date.
Likewise, the arvados_sdk_version
feature should use the github url instead of the locally hosted arvados.git.
- Target version set to Arvados Future Sprints
- Subject changed from Support use of anonymous git url (like github https) as job repository to [API] Support use of anonymous git url (like github https) as repository in jobs.create
- Description updated (diff)
- Category set to API
bump This would be really useful for me!
- Target version changed from Arvados Future Sprints to 2015-04-01 sprint
- Target version changed from 2015-04-01 sprint to 2015-04-29 sprint
- Assigned To set to Tom Clegg
Notes about 3126-remote-git-url:
- The commit+tree needed for a job is now copied into the internal repository during the job create/update transaction, regardless of whether it is hosted locally or remotely.
- crunch-dispatch already checks the "internal" repository before copying objects from (or even looking at) a locally hosted repository, and passes the internal repository path to crunch-job with
--git-dir
so crunch-job uses it to do git archive
, so we didn't need any changes there. The above point means the git tag
and git fetch-pack
code in crunch-dispatch can be removed (except perhaps a sanity check on the git tag) but I figure this might as well wait a bit, to avoid deployment races. We should stop seeing any "fetch-pack" messages in crunch-dispatch logs, which will assure us those bits aren't used any more.
git fetch-pack
can't fetch a revision by sha1. I used git pack-objects | git unpack-objects
rather than copying the entire remote repository into the internal repo with fetch-pack --all
like crunch-dispatch did.
- Using a commit from a remote repository requires copying git objects twice: clone/fetch the remote (with all tags and branches) into a cache dir dedicated to that remote in order to resolve a refspec to a sha1, then copy the necessary objects from the cache dir to the internal git dir. Fetching remote repositories directly into internal.git would have been nice and simple if it weren't for two fatal flaws:
- In order to look up arbitrary revisions, we have to copy remote refs into a local git-dir:
git ls-remote
can look up plain branches like master
, but not master~3
or 78be33b
. Troublesome races would occur if two concurrent transactions fetched two different remote repositories' branches into the same local git-dir and looked up "master".
- Even without races, an abbreviated sha1 version in a job submitted by user B could easily resolve to a commit in user A's private repository which was imported for a previous job. Essentially, it's never safe to look up anything in the shared "internal" repository except a full SHA-1 for which the current user's permission has already been established.
78be33b
dispatch: 4n8aq-8i9sb-798lzzemsyw4myr: Repository https://github.com/curoverse/arvados.git not found under /home/peter/work/arvados_prod_repos
/home/peter/work/arvados/services/api/app/models/arvados_model.rb:328:in `ensure_permission_to_save': ArvadosModel::PermissionDeniedError (ArvadosModel::PermissionDeniedError)
Is there a newer revision, or should I look into this error to find out what's wrong?
Ah, sorry -- "we didn't need any changes there" was not true! Fixed, rebased on current master, now at 85378b8.
I'm getting API test failures:
2) Failure:
CommitTest#test_find_commit_range_uses_fetch_remote_repository_to_get_git://github.com/curoverse/arvados.git [/home/peter/work/arvados/services/api/test/unit/commit_test.rb:35]:
Expected [] to not be empty.
3) Failure:
CommitTest#test_find_commit_range_uses_fetch_remote_repository_to_get_http://github.com/curoverse/arvados.git [/home/peter/work/arvados/services/api/test/unit/commit_test.rb:35]:
Expected [] to not be empty.
419 tests, 1676 assertions, 2 failures, 0 errors, 2 skips
I installed a new jessie system, had test failures, and fixed (now at de3e44a). But my https:// test failed too (not just http:// and git://) and the fix was a leaked-state-from-previous-test issue, so I'm not certain I've seen the same failure mode as note-11.
Worth a try, though.
(Also: if you get all three failures (git,http,https) with 85378b8 after removing services/api/tmp/git, that would be fairly reassuring that we were seeing the same issue.)
Tests are passing now, de3e44a LGTM
- Status changed from New to Resolved
- % Done changed from 75 to 100
Applied in changeset arvados|commit:86e078ae126f6651428219c726c34da3bd7f7495.
Also available in: Atom
PDF