Bug #12107

[API] jobs.create should not do slow git operations

Added by Tom Clegg about 1 month ago. Updated 4 days ago.

Status:ResolvedStart date:08/11/2017
Priority:NormalDue date:
Assignee:Tom Clegg% Done:

100%

Category:-
Target version:2017-09-27 Sprint
Story points1.0Remaining (hours)0.00 hour
Velocity based estimate-

Description

It's typical for jobs.create to take ~6 seconds, with the database only accounting for about 75ms.

Our "copy commit to internal.git and tag it" procedure might be responsible for a significant portion of this.

Currently, when creating a new job, source:services/api/app/models/commit.rb does this, whether or not the commit already exists in internal.git:

    must_pipe("echo #{sha1.shellescape}",
              "git --git-dir #{src_gitdir.shellescape} pack-objects -q --revs --stdout",
              "git --git-dir #{dst_gitdir.shellescape} unpack-objects -q")
    must_git(dst_gitdir,
             "tag --force #{tag.shellescape} #{sha1.shellescape}")

Using a copy of an internal.git repo from a production cluster on my workstation, and a commit that is already present in internal.git, I found

git operations time
pack-objects | unpack-objects 4-5s
fetch file:///src_gitdir $sha1 35s including initial auto gc
fetch file:///src_gitdir $sha1 12-14s while auto gc running in background
fetch file:///src_gitdir $sha1 1s after gc done
Proposed improvements:
  • Check whether the commit is already present in internal.git before copying it.
    • [[ `git rev-parse --verify $sha1^{commit}` = $sha1 ]] takes <60ms for both success and fail cases.
  • Use "git fetch file://..." instead of "pack|unpack" to avoid unnecessary work (with the pipe method, pack-objects has to prepare the entire pack whether it's needed or not; fetch is surely smarter).
    • git fetch --no-tags file://$src_gitdir $sha1:refs/tags/$uuid takes ~1s including adding/updating the tag.
  • See if there's a way to do this even more efficiently in cases where the commit isn't yet present. "fetch" takes 1s even when the commit is already present, so perhaps there's another way that finds more shortcuts.

Subtasks

Task #12128: Review 12107-faster-gitResolvedTom Clegg

Associated revisions

Revision 95000b5d
Added by Tom Clegg 4 days ago

Merge branch '12107-faster-git'

closes #12107

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Tom Clegg about 1 month ago

  • Description updated (diff)

#2 Updated by Tom Clegg about 1 month ago

  • Description updated (diff)

#3 Updated by Tom Clegg about 1 month ago

  • Description updated (diff)

#4 Updated by Tom Morris about 1 month ago

  • Target version set to 2017-08-30 Sprint
  • Story points set to 1.0

#5 Updated by Lucas Di Pentima about 1 month ago

  • Assignee set to Lucas Di Pentima

#6 Updated by Lucas Di Pentima 25 days ago

  • Target version changed from 2017-08-30 Sprint to 2017-09-13 Sprint

#7 Updated by Tom Clegg 25 days ago

  • Assignee changed from Lucas Di Pentima to Tom Clegg

#8 Updated by Tom Clegg 17 days ago

12107-faster-git @ 32341a4777692a28358f2af5afa0d9e975e2da73

This makes the first two proposed improvements:
  • If the commit is already present, just tag it.
  • If the commit isn't present, use fetch instead of pack-objects|unpack-objects.

#9 Updated by Tom Clegg 17 days ago

  • Status changed from New to In Progress

#10 Updated by Lucas Di Pentima 16 days ago

Running services/api tests on my Debian 9 VM produced a lot of errors, like this one:

74) Error:
JobTest#test_verify_job_status_[["state",_nil,_[["state",_"Queued"]]]]:
Commit::GitError: git --git-dir /home/lucas/arvados_local/services/api/tmp/internal.git fetch --no-tags file:///home/lucas/arvados_local/services/api/tmp/git/test/zzzzz-s0uqq-382brsig8rp3666/.git 077ba2ad3ea24a929091a9e6ce545c93199b8e57:refs/tags/zzzzz-8i9sb-3tdw50ao2bulisx 2>&1: pid 23753 exit 1: 
    app/models/commit.rb:240:in `must_pipe'
    app/models/commit.rb:228:in `block in must_git'
    app/models/commit.rb:227:in `each'
    app/models/commit.rb:227:in `must_git'
    app/models/commit.rb:161:in `tag_in_internal_repository'
    app/models/job.rb:443:in `tag_version_in_internal_repository'
    test_after_commit (1.1.0) lib/test_after_commit/database_statements.rb:11:in `block in transaction'
    test_after_commit (1.1.0) lib/test_after_commit/database_statements.rb:5:in `transaction'
    test/unit/job_test.rb:234:in `block (2 levels) in <class:JobTest>’

Not sure what's causing this. The git version I'm using is 2.11.0.

#11 Updated by Tom Clegg 13 days ago

Not sure why the same tests didn't fail for me but I have been able to reproduce this with "git fetch".

I found this:

No, out of security concerns; imagine you included some proprietary source code by mistake, and undo the damage by forcing a push with a branch that does not have the incriminating code. Usually you do not control the garbage-collection on the server, yet you still do not want other people to fetch “by SHA-1”.

Silently exiting 1 is rather unfortunate, and most references to this question say the remote "server" decides whether to honor a fetch by sha1.

#12 Updated by Lucas Di Pentima 12 days ago

Ok, I ran the tests on jenkins just to be sure, and they passed without issues. LGTM, thanks!

#13 Updated by Tom Clegg 11 days ago

(continuing from note-11, which I saved mid-investigation)

In any case, experiments confirm we can't rely on fetch, so I've done this:
  1. Check whether commit is already present in internal.git. If so, just tag it and stop.
  2. Ask git for a list of branches that include the desired commit. Take the first one and git-fetch it. Tag the desired commit. If all of these steps work, stop.
  3. Last resort: pack-objects|unpack-objects like we did before.

There are a number of things that can go wrong in step 2 (commit isn't on any branch, regexp isn't good enough to catch the branch name, selected branch gets non-FF pushed just before we fetch and no longer has the commit we want) but in all of these cases, we just try the pack-unpack fallback -- if that doesn't work, nothing would.

12107-faster-git @ bd5a371cc0f57aea20c0954f532db5890dae5c59

#14 Updated by Tom Clegg 10 days ago

  • Target version changed from 2017-09-13 Sprint to 2017-09-27 Sprint

#15 Updated by Tom Clegg 6 days ago

12107-faster-git @ 4e85a19e6746eec88885926b87495e729f631d3a
  • initializes the internal.git repo at test setup time instead of preserving internal.git from one test run to the next (tests were passing for me with broken code, because a previous version's test suite had already fetched all the tested commits into internal.git)
  • adds tests for a "not on any branch" commit and a "not at tip of any branch" commit

https://ci.curoverse.com/job/developer-run-tests/462/

#16 Updated by Lucas Di Pentima 5 days ago

Just one comment:

  • File services/api/app/models/commit.rb
    • Lines 176 & 181: I think those must_git() calls could be condensed into one by using the ensure block, or maybe place it immediately after the begin/rescue block?

The rest LGTM.

#17 Updated by Tom Clegg 4 days ago

Lucas Di Pentima wrote:

  • Lines 176 & 181: I think those must_git() calls could be condensed into one by using the ensure block, or maybe place it immediately after the begin/rescue block?

Added protective comment:

        # Even if all of the above steps succeeded, we might still not
        # have the right commit due to a race, in which case tag_cmd
        # will fail, and we'll need to fall back to pack|unpack. So
        # don't be tempted to condense this tag_cmd and the one in the
        # rescue block into a single attempt.
        must_git(dst_gitdir, tag_cmd)

#18 Updated by Anonymous 4 days ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Applied in changeset arvados|commit:95000b5d6bb4264533a1770346342c8fd3dbe61a.

Also available in: Atom PDF