Feature #17698

Parallelize writes inside keepstore when there are writes that request multiple storage classes.

Added by Peter Amstutz 4 months ago. Updated 21 days ago.

Status:
Resolved
Priority:
Normal
Assigned To:
Category:
Keep
Target version:
Start date:
08/13/2021
Due date:
% Done:

100%

Estimated time:
(Total: 0.00 h)
Story points:
-

Description

Customer wants to have two storage classes by default, which means always writing to two locations, so keepstore should be able to write both of them at the same time instead of one after another.


Subtasks

Task #17933: Review 17698-keepstore-concurrent-writesResolvedLucas Di Pentima


Related issues

Related to Arvados Epics - Story #16107: Storage classesNew03/01/202109/30/2021

Associated revisions

Revision a098a3a2
Added by Tom Clegg 27 days ago

Merge branch '17698-keepstore-concurrent-writes'

closes #17698

Arvados-DCO-1.1-Signed-off-by: Tom Clegg <>

History

#1 Updated by Peter Amstutz 4 months ago

#2 Updated by Peter Amstutz 2 months ago

  • Target version set to 2021-08-04 sprint
  • Assigned To set to Tom Clegg

#3 Updated by Peter Amstutz 2 months ago

  • Description updated (diff)

#4 Updated by Peter Amstutz about 2 months ago

  • Target version changed from 2021-08-04 sprint to 2021-08-18 sprint

#5 Updated by Tom Clegg about 1 month ago

  • Status changed from New to In Progress

17698-keepstore-concurrent-writes @ e0632f47bb83bda5badccc47cc2d8dbb70d92678 -- https://ci.arvados.org/view/Developer/job/developer-run-tests/2639/

This adds concurrency in cases where writing to multiple classes involves writing to multiple volumes.

The new test case confirms that we don't over-replicate by starting more writes when the writes already in flight (should they turn out to be successful) will suffice, but it doesn't actually confirm that any writes are done concurrently.

#6 Updated by Peter Amstutz about 1 month ago

  • Target version changed from 2021-08-18 sprint to 2021-09-01 sprint

#7 Updated by Lucas Di Pentima about 1 month ago

Some questions & comments:

  • At file services/keepstore/handlers.go
    • Line 858: PutBlock() documentation comments seem to be a bit stale.
    • Line 840: I think newPutResult() should be named newPutProgress() and var naming consistency be maintained, as it makes the code difficult to follow for me.
    • Line 929: I'm curious as to why the code block doesn't use the mnt var provided by the for-loop. Is that some special go-ism?
    • Line 930: pending.Add() is called without mtx protection, wouldn't that allow more than one goroutine to add the same mnt?
  • At file services/keepstore/volume.go
    • Line 404: Is len(vm.writables) always > 0? If not, I think we'll get a panic on read-only keepstores, correct? If yes, maybe it would be convenient to add some tests for NextWritable()

#8 Updated by Tom Clegg 27 days ago

Lucas Di Pentima wrote:

  • Line 858: PutBlock() documentation comments seem to be a bit stale.

Indeed. Updated.

  • Line 840: I think newPutResult() should be named newPutProgress() and var naming consistency be maintained, as it makes the code difficult to follow for me.

Fixed, thanks.

  • Line 929: I'm curious as to why the code block doesn't use the mnt var provided by the for-loop. Is that some special go-ism?

That ensures each goroutine gets its own mnt var indicating the mnt it's supposed to work on -- otherwise they would all share the for loop's var, changing underneath them.

  • Line 930: pending.Add() is called without mtx protection, wouldn't that allow more than one goroutine to add the same mnt?

mtx is locked there -- and for that whole loop, except while calling cond.Wait(). Added a comment to this effect.

  • At file services/keepstore/volume.go
    • Line 404: Is len(vm.writables) always > 0? If not, I think we'll get a panic on read-only keepstores, correct? If yes, maybe it would be convenient to add some tests for NextWritable()

Turns out handlePUT() doesn't even call PutBlock() if there are no writable volumes, but I added a test and fixed NextWritable anyway.

17698-keepstore-concurrent-writes @ 69cc7c89aa51e7aa7265215bfc910eaa457986d8 -- https://ci.arvados.org/view/Developer/job/developer-run-tests/2646/

#9 Updated by Lucas Di Pentima 27 days ago

Thanks for the clarifications, this LGTM, thanks!

#10 Updated by Tom Clegg 27 days ago

  • Status changed from In Progress to Resolved

Also available in: Atom PDF