Bug #14008

new index on containers locked_by_uuid

Added by Joshua Randall almost 2 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
API
Target version:
Start date:
08/20/2018
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-
Release:
Release relationship:
Auto

Description

We have added another custom query to our system to deal with slow container lookups (I think these are coming from crunch-dispatch-slurm, but apologies, I have not kept details of the exact queries or how slow they were).

The new index is:

create index index_containers_on_locked_by_uuid on containers (locked_by_uuid);


Subtasks

Task #14050: Review 14008-containers-indexResolvedPeter Amstutz

Associated revisions

Revision 77f0b1bf
Added by Tom Clegg almost 2 years ago

Merge branch '14008-containers-index'

refs #14008

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Tom Morris almost 2 years ago

  • Target version set to 2018-09-05 Sprint

#2 Updated by Tom Clegg almost 2 years ago

  • Assigned To set to Tom Clegg

#3 Updated by Tom Clegg almost 2 years ago

  • Status changed from New to In Progress

#4 Updated by Peter Amstutz almost 2 years ago

Tom Clegg wrote:

14008-containers-index @ 28cc107d70f7fc6fd05435a49352ac0bcb377657

Trying to understand the logic for having two indexes:

+  def change
+    # For the current code in sdk/go/dispatch:
+    add_index :containers, [:locked_by_uuid, :priority]
+    # For future dispatchers that use filters instead of offset for
+    # more predictable paging:
+    add_index :containers, [:locked_by_uuid, :uuid]
+  end

The current index is ordered by descending priority. Should it also index on 'state' ?

The future index will be... ordered by uuid? And then sorted on the client side? (or stuffed into some sorted data structure). Don't we still care about state and if priority is > 0. Maybe this is a premature optimization if that query hasn't been written yet?

#5 Updated by Tom Clegg almost 2 years ago

Peter Amstutz wrote:

The current index is ordered by descending priority. Should it also index on 'state' ?

A dispatcher doesn't filter on state -- locked_by_uuid≠null implies state∈{Locked,Running} and it wants to get updates in both of those states.

The future index will be... ordered by uuid? And then sorted on the client side? (or stuffed into some sorted data structure). Don't we still care about state and if priority is > 0. Maybe this is a premature optimization if that query hasn't been written yet?

Yes, paging has less weird racy behavior when sorting by uuid (vs. we get page 1; then a container on page 2 gets its priority increased enough to move it to page 1; then we get page 2; ...). Priority sorting can happen on the client side.

A dispatcher doesn't filter on priority>0 -- it needs to know if priority changes to 0 while it has the lock, so it can cancel.

#6 Updated by Peter Amstutz almost 2 years ago

Tom Clegg wrote:

Peter Amstutz wrote:

The current index is ordered by descending priority. Should it also index on 'state' ?

A dispatcher doesn't filter on state -- locked_by_uuid≠null implies state∈{Locked,Running} and it wants to get updates in both of those states.

Ok, the [:locked_by_uuid, :priority] index makes sense for this query, which filters by locked_by_uuid, is ordered by priority descending and paged by offset.

        // Containers I currently own (Locked/Running)
        querySuccess := d.checkForUpdates([][]interface{}{
            {"locked_by_uuid", "=", d.auth.UUID}}, todo)

The future index will be... ordered by uuid? And then sorted on the client side? (or stuffed into some sorted data structure). Don't we still care about state and if priority is > 0. Maybe this is a premature optimization if that query hasn't been written yet?

Yes, paging has less weird racy behavior when sorting by uuid (vs. we get page 1; then a container on page 2 gets its priority increased enough to move it to page 1; then we get page 2; ...). Priority sorting can happen on the client side.

Ok, the [:locked_by_uuid, :uuid] index makes sense.

A dispatcher doesn't filter on priority>0 -- it needs to know if priority changes to 0 while it has the lock, so it can cancel.

There's another dispatcher query that filters by state and priority. I expect we'll need to do something similar in crunch-dispatch-cloud, except paged by uuid instead of offset. Do we want a third index [:state, :priority] ?

        // Containers I should try to dispatch
        querySuccess = d.checkForUpdates([][]interface{}{
            {"state", "=", Queued},
            {"priority", ">", "0"}}, todo) && querySuccess

#7 Updated by Tom Clegg almost 2 years ago

14008-containers-index @ b43cf1d3398b7004ada50e053ae235b814c9aa70
  • adds index on (state, (priority>0))

#8 Updated by Peter Amstutz almost 2 years ago

Tom Clegg wrote:

14008-containers-index @ b43cf1d3398b7004ada50e053ae235b814c9aa70
  • adds index on (state, (priority>0))

Running tests here https://ci.curoverse.com/view/Developer/job/developer-run-tests/865/

#10 Updated by Tom Clegg almost 2 years ago

  • Status changed from In Progress to Resolved

#11 Updated by Tom Morris over 1 year ago

  • Release set to 13

Also available in: Atom PDF